RAVEN: Robust Advertisement Video Violation Temporal Grounding via Reinforcement Reasoning

Deyi Ji, Yuekui Yang, Haiyang Wu, Shaoping Ma, Tianrun Chen, et al. · 2025

Advertisement (Ad) video violation detection is critical for ensuring platform compliance, but existing methods struggle with precise temporal grounding, noisy annotations, and limited generalization. We propose RAVEN, a novel framework th…

Managing solid waste to co-control carbon and nitrogen leakage in China

Xin Xu, Jingfang Zhan, Tianrun Chen, Xiuming Zhang, Yiyang Zou, et al. · 2025

From Air to Wear: Personalized 3D Digital Fashion with AR/VR Immersive 3D Sketching

Ying Zang, Yuanqi Hu, Xinyu Chen, Yun Xu, S. Wang, et al. · 2025

In the era of immersive consumer electronics, such as AR/VR headsets and smart devices, people increasingly seek ways to express their identity through virtual fashion. However, existing 3D garment design tools remain inaccessible to every…

CADCrafter: Generating Computer-Aided Design Models from Unconstrained Images

Chen Cheng, Jiacheng Wei, Tianrun Chen, Chi Zhang, Xiaofeng Yang, et al. · 2025

Creating CAD digital twins from the physical world is crucial for manufacturing, design, and simulation. However, current methods typically rely on costly 3D scanning with labor-intensive post-processing. To provide a user-friendly design …

POPEN: Preference-Based Optimization and Ensemble for LVLM-Based Reasoning Segmentation

Lanyun Zhu, Tianrun Chen, Qianxiong Xu, Xuanyi Liu, Deyi Ji, et al. · 2025

Existing LVLM-based reasoning segmentation methods often suffer from imprecise segmentation results and hallucinations in their text responses. This paper introduces POPEN, a novel framework designed to address these issues and achieve imp…

Syllables to Scenes: Literary-Guided Free-Viewpoint 3D Scene Synthesis from Japanese Haiku

Chunan Yu, Yidong Han, Chaotao Ding, Ying Zang, Lanyun Zhu, et al. · 2025

In the era of the metaverse, where immersive technologies redefine human experiences, translating abstract literary concepts into navigable 3D environments presents a fundamental challenge in preserving semantic and emotional fidelity. Thi…

Let Human Sketches Help: Empowering Challenging Image Segmentation Task with Freehand Sketches

Ying Zang, Runlong Cao, Jianqi Zhang, Yidong Han, Ziyue Cao, et al. · 2025

Sketches, with their expressive potential, allow humans to convey the essence of an object through even a rough contour. For the first time, we harness this expressive potential to improve segmentation performance in challenging tasks like…

Not Every Patch is Needed: Towards a More Efficient and Effective Backbone for Video-based Person Re-identification

Lanyun Zhu, Tianrun Chen, Deyi Ji, Jieping Ye, Jun Liu · 2025

This paper proposes a new effective and efficient plug-and-play backbone for video-based person re-identification (ReID). Conventional video-based ReID methods typically use CNN or transformer backbones to extract deep features for every p…

RAVEN++: Pinpointing Fine-Grained Violations in Advertisement Videos with Active Reinforcement Reasoning

Deyi Ji, Yuekui Yang, Liqun Liu, Peng Shu, Haiyang Wu, et al. · 2025

Distilling Diffusion Models to Efficient 3D LiDAR Scene Completion

Shengyuan Zhang, An Zhao, Ling Yang, Zejian Li, Chenye Meng, et al. · 2024

Diffusion models have been applied to 3D LiDAR scene completion due to their strong training stability and high completion quality. However, the slow sampling speed limits the practical application of diffusion-based scene completion model…

New Fashion: Personalized 3D Design with a Single Sketch Input

Tianrun Chen, Xinyu Chen, Chaotao Ding, Bai Ling, Shangzhan Zhang, et al. · 2024

Img2CAD: Conditioned 3D CAD Model Generation from Single Image with Structured Visual Geometry

Tianrun Chen, Chunan Yu, Yuanqi Hu, Jing Li, Tao Xu, et al. · 2024

In this paper, we propose Img2CAD, the first approach, to our knowledge, that uses 2D image inputs to generate CAD models with editable parameters. Unlike existing AI methods for 3D model generation, which use text or image inputs and often rely on m…

SAM2-Adapter: Evaluating & Adapting Segment Anything 2 in Downstream Tasks: Camouflage, Shadow, Medical Image Segmentation, and More

Tianrun Chen, Ankang Lu, Lanyun Zhu, Chaotao Ding, Chunan Yu, et al. · 2024

The advent of large models, also known as foundation models, has significantly transformed the AI research landscape, with models like Segment Anything (SAM) achieving notable success in diverse image segmentation scenarios. Despite its ad…

Magic3DSketch: Create Colorful 3D Models From Sketch-Based 3D Modeling Guided by Text and Language-Image Pre-Training

Ying Zang, Yidong Han, Chaotao Ding, Jianqi Zhang, Tianrun Chen · 2024

The demand for 3D content is growing as AR/VR applications emerge. At the same time, 3D modeling is accessible only to skilled experts, because traditional methods like Computer-Aided Design (CAD) are often too labor-intensive and s…

xLSTM-UNet can be an Effective 2D & 3D Medical Image Segmentation Backbone with Vision-LSTM (ViL) better than its Mamba Counterpart

Tianrun Chen, Chaotao Ding, Lanyun Zhu, Tao Xu, Deyi Ji, et al. · 2024

Convolutional Neural Networks (CNNs) and Vision Transformers (ViT) have been pivotal in biomedical image segmentation, yet their ability to manage long-range dependencies remains constrained by inherent locality and computational overhead.…

Reasoning3D -- Grounding and Reasoning in 3D: Fine-Grained Zero-Shot Open-Vocabulary 3D Reasoning Part Segmentation via Large Vision-Language Models

Tianrun Chen, Chunan Yu, Jing Li, Jianqi Zhang, Lanyun Zhu, et al. · 2024

In this paper, we introduce a new task: Zero-Shot 3D Reasoning Segmentation for searching and localizing parts of objects, a new paradigm for 3D segmentation that transcends the limitations of previous category-specific 3D semantic…

MaPa: Text-driven Photorealistic Material Painting for 3D Shapes

Shangzhan Zhang, Sida Peng, Tao Xu, Yuanbo Yang, Tianrun Chen, et al. · 2024

This paper aims to generate materials for 3D meshes from text descriptions. Unlike existing methods that synthesize texture maps, we propose to generate segment-wise procedural material graphs as the appearance representation, which suppor…

IBD: Alleviating Hallucinations in Large Vision-Language Models via Image-Biased Decoding

Lanyun Zhu, Deyi Ji, Tianrun Chen, Peng Xu, Jieping Ye, et al. · 2024

Despite their rapid development and widespread applications, Large Vision-Language Models (LVLMs) face the serious challenge of being prone to generating hallucinations. An over-reliance on linguistic priors has been identified…

RESMatch: Referring Expression Segmentation in a Semi-Supervised Manner

Ying Zang, Chenglong Fu, Runlong Cao, Didi Zhu, Min Zhang, et al. · 2024

Referring expression segmentation (RES), a task that involves localizing specific instance-level objects based on free-form linguistic descriptions, has emerged as a crucial frontier in human-AI interaction. It demands an intricate underst…

Deep3DSketch: 3D modeling from Free-hand Sketches with View- and Structural-Aware Adversarial Training

Tianrun Chen, Chenglong Fu, Lanyun Zhu, Papa Mao, Jia Zhang, et al. · 2023

This work aims to investigate the problem of 3D modeling using single free-hand sketches, which is one of the most natural ways we humans express ideas. Although sketch-based 3D modeling can drastically make the 3D modeling process more ac…

LLaFS: When Large Language Models Meet Few-Shot Segmentation

Lanyun Zhu, Tianrun Chen, Deyi Ji, Jieping Ye, Jun Liu · 2023

This paper proposes LLaFS, the first attempt to leverage large language models (LLMs) in few-shot segmentation. In contrast to the conventional few-shot segmentation methods that only rely on the limited and biased information from the ann…

Deep3DSketch+: Obtaining Customized 3D Model by Single Free-Hand Sketch through Deep Learning

Ying Zang, Chenglong Fu, Tianrun Chen, Yuanqi Hu, Qingshan Liu, et al. · 2023

As 3D models become critical in today's manufacturing and product design, conventional 3D modeling approaches based on Computer-Aided Design (CAD) are labor-intensive, time-consuming, and have high demands on the creators. This work aims t…

Reality3DSketch: Rapid 3D Modeling of Objects from Single Freehand Sketches

Tianrun Chen, Chaotao Ding, Lanyun Zhu, Ying Zang, Yiyi Liao, et al. · 2023

The emerging trend of AR/VR places great demands on 3D content. However, most existing software requires expertise and is difficult for novice users to use. In this paper, we aim to create sketch-based modeling tools for user-friendly 3D m…

Deep3DSketch++: High-Fidelity 3D Modeling from Single Free-hand Sketches

Ying Zang, Chaotao Ding, Tianrun Chen, Papa Mao, Wenjun Hu · 2023

The rise of AR/VR has led to an increased demand for 3D content. However, the traditional method of creating 3D content using Computer-Aided Design (CAD) is a labor-intensive and skill-demanding process, making it difficult to use for novi…

Deep3DSketch+: Rapid 3D Modeling from Single Free-hand Sketches

Tianrun Chen, Chenglong Fu, Ying Zang, Lanyun Zhu, Jia Zhang, et al. · 2023

The rapid development of AR/VR brings tremendous demands for 3D content. While the widely-used Computer-Aided Design (CAD) method requires a time-consuming and labor-intensive modeling process, sketch-based 3D modeling offers a potential s…

PanopticNeRF-360: Panoramic 3D-to-2D Label Transfer in Urban Scenes

Xiao Fu, Shangzhan Zhang, Tianrun Chen, Yichong Lu, Xiaowei Zhou, et al. · 2023

Training perception systems for self-driving cars requires substantial 2D annotations that are labor-intensive to label manually. While existing datasets provide rich annotations on pre-recorded sequences, they fall short in labeling rarely …

Spatial multiplexing for robust optical vortex transmission with optical nonlinearity

Weiru Fan, Tianrun Chen, Xiaobin Tang, Xingqi Xu, Luqi Yuan, et al. · 2023

Optical vortex beams, with a phase singularity characterized by a topological charge (TC), introduce a new dimension for optical communication, quantum information, and optical light manipulation. However, the evaluation of TCs after beam p…

Tianrun Chen