Tianrun Chen
RAVEN: Robust Advertisement Video Violation Temporal Grounding via Reinforcement Reasoning
Advertisement (Ad) video violation detection is critical for ensuring platform compliance, but existing methods struggle with precise temporal grounding, noisy annotations, and limited generalization. We propose RAVEN, a novel framework th…
Managing solid waste to co-control carbon and nitrogen leakage in China
From Air to Wear: Personalized 3D Digital Fashion with AR/VR Immersive 3D Sketching
In the era of immersive consumer electronics, such as AR/VR headsets and smart devices, people increasingly seek ways to express their identity through virtual fashion. However, existing 3D garment design tools remain inaccessible to every…
CADCrafter: Generating Computer-Aided Design Models from Unconstrained Images
Creating CAD digital twins from the physical world is crucial for manufacturing, design, and simulation. However, current methods typically rely on costly 3D scanning with labor-intensive post-processing. To provide a user-friendly design …
POPEN: Preference-Based Optimization and Ensemble for LVLM-Based Reasoning Segmentation
Existing LVLM-based reasoning segmentation methods often suffer from imprecise segmentation results and hallucinations in their text responses. This paper introduces POPEN, a novel framework designed to address these issues and achieve imp…
Syllables to Scenes: Literary-Guided Free-Viewpoint 3D Scene Synthesis from Japanese Haiku
In the era of the metaverse, where immersive technologies redefine human experiences, translating abstract literary concepts into navigable 3D environments presents a fundamental challenge in preserving semantic and emotional fidelity. Thi…
Let Human Sketches Help: Empowering Challenging Image Segmentation Task with Freehand Sketches
Sketches, with their expressive potential, allow humans to convey the essence of an object through even a rough contour. For the first time, we harness this expressive potential to improve segmentation performance in challenging tasks like…
Not Every Patch is Needed: Towards a More Efficient and Effective Backbone for Video-based Person Re-identification
This paper proposes a new effective and efficient plug-and-play backbone for video-based person re-identification (ReID). Conventional video-based ReID methods typically use CNN or transformer backbones to extract deep features for every p…
RAVEN++: Pinpointing Fine-Grained Violations in Advertisement Videos with Active Reinforcement Reasoning
Distilling Diffusion Models to Efficient 3D LiDAR Scene Completion
Diffusion models have been applied to 3D LiDAR scene completion due to their strong training stability and high completion quality. However, the slow sampling speed limits the practical application of diffusion-based scene completion model…
New Fashion: Personalized 3D Design with a Single Sketch Input
Img2CAD: Conditioned 3D CAD Model Generation from Single Image with Structured Visual Geometry
In this paper, we propose Img2CAD, the first approach to our knowledge that uses 2D image inputs to generate CAD models with editable parameters. Existing AI methods for 3D model generation using text or image inputs often rely on m…
SAM2-Adapter: Evaluating & Adapting Segment Anything 2 in Downstream Tasks: Camouflage, Shadow, Medical Image Segmentation, and More
The advent of large models, also known as foundation models, has significantly transformed the AI research landscape, with models like Segment Anything (SAM) achieving notable success in diverse image segmentation scenarios. Despite its ad…
Magic3DSketch: Create Colorful 3D Models From Sketch-Based 3D Modeling Guided by Text and Language-Image Pre-Training
The demand for 3D content is growing as AR/VR applications emerge. At the same time, 3D modelling is accessible only to skilled experts, because traditional methods like Computer-Aided Design (CAD) are often too labor-intensive and s…
xLSTM-UNet can be an Effective 2D & 3D Medical Image Segmentation Backbone with Vision-LSTM (ViL) better than its Mamba Counterpart
Convolutional Neural Networks (CNNs) and Vision Transformers (ViT) have been pivotal in biomedical image segmentation, yet their ability to manage long-range dependencies remains constrained by inherent locality and computational overhead.…
Reasoning3D -- Grounding and Reasoning in 3D: Fine-Grained Zero-Shot Open-Vocabulary 3D Reasoning Part Segmentation via Large Vision-Language Models
In this paper, we introduce a new task, Zero-Shot 3D Reasoning Segmentation, for searching and localizing parts of objects. It is a new paradigm for 3D segmentation that transcends the limitations of previous category-specific 3D semantic…
MaPa: Text-driven Photorealistic Material Painting for 3D Shapes
This paper aims to generate materials for 3D meshes from text descriptions. Unlike existing methods that synthesize texture maps, we propose to generate segment-wise procedural material graphs as the appearance representation, which suppor…
IBD: Alleviating Hallucinations in Large Vision-Language Models via Image-Biased Decoding
Despite their rapid development and widespread applications, Large Vision-Language Models (LVLMs) face a serious challenge: they are prone to generating hallucinations. An over-reliance on linguistic priors has been identified…
RESMatch: Referring Expression Segmentation in a Semi-Supervised Manner
Referring expression segmentation (RES), a task that involves localizing specific instance-level objects based on free-form linguistic descriptions, has emerged as a crucial frontier in human-AI interaction. It demands an intricate underst…
Deep3DSketch: 3D modeling from Free-hand Sketches with View- and Structural-Aware Adversarial Training
This work aims to investigate the problem of 3D modeling using single free-hand sketches, which is one of the most natural ways we humans express ideas. Although sketch-based 3D modeling can drastically make the 3D modeling process more ac…
LLaFS: When Large Language Models Meet Few-Shot Segmentation
This paper proposes LLaFS, the first attempt to leverage large language models (LLMs) in few-shot segmentation. In contrast to the conventional few-shot segmentation methods that only rely on the limited and biased information from the ann…
Deep3DSketch+: Obtaining Customized 3D Model by Single Free-Hand Sketch through Deep Learning
As 3D models become critical in today's manufacturing and product design, conventional 3D modeling approaches based on Computer-Aided Design (CAD) are labor-intensive, time-consuming, and have high demands on the creators. This work aims t…
Reality3DSketch: Rapid 3D Modeling of Objects from Single Freehand Sketches
The emerging trend of AR/VR places great demands on 3D content. However, most existing software requires expertise and is difficult for novice users to use. In this paper, we aim to create sketch-based modeling tools for user-friendly 3D m…
Deep3DSketch++: High-Fidelity 3D Modeling from Single Free-hand Sketches
The rise of AR/VR has led to an increased demand for 3D content. However, the traditional method of creating 3D content using Computer-Aided Design (CAD) is a labor-intensive and skill-demanding process, making it difficult to use for novi…
Deep3DSketch+: Rapid 3D Modeling from Single Free-hand Sketches
The rapid development of AR/VR brings tremendous demands for 3D content. While the widely-used Computer-Aided Design (CAD) method requires a time-consuming and labor-intensive modeling process, sketch-based 3D modeling offers a potential s…
PanopticNeRF-360: Panoramic 3D-to-2D Label Transfer in Urban Scenes
Training perception systems for self-driving cars requires substantial 2D annotations that are labor-intensive to label manually. While existing datasets provide rich annotations on pre-recorded sequences, they fall short in labeling rarely …
Spatial multiplexing for robust optical vortex transmission with optical nonlinearity
Optical vortex beams, with a phase singularity characterized by a topological charge (TC), introduce a new dimension for optical communication, quantum information, and optical light manipulation. However, the evaluation of TCs after beam p…