Monocular
A geometric shape regularity effect in the human brain
The perception and production of regular geometric shapes, a characteristic trait of human cultures since prehistory, has unknown neural mechanisms. Behavioral studies suggest that humans are attuned to discrete regularities such as symmet…
UAVLight: A Benchmark for Illumination-Robust 3D Reconstruction in Unmanned Aerial Vehicle (UAV) Scenes
Illumination inconsistency is a fundamental challenge in multi-view 3D reconstruction. Variations in sunlight direction, cloud cover, and shadows break the constant-lighting assumption underlying both classical multi-view stereo (MVS) and …
RM3DMOT: Roadside Monocular 3D Multi-Object Tracking with Motion-Appearance Optimization
Safety Helmet-Based Scale Recovery for Low-Cost Monocular 3D Reconstruction on Construction Sites
Three-dimensional (3D) reconstruction is increasingly being adopted in construction site management. While most existing studies rely on auxiliary equipment such as LiDAR and depth cameras, monocular depth estimation offers broader applica…
Endo-G$^{2}$T: Geometry-Guided & Temporally Aware Time-Embedded 4DGS For Endoscopic Scenes
Endoscopic (endo) video exhibits strong view-dependent effects such as specularities, wet reflections, and occlusions. Pure photometric supervision misaligns with geometry and triggers early geometric drift, where erroneous shapes are rein…
Motion Marionette: Rethinking Rigid Motion Transfer via Prior Guidance
We present Motion Marionette, a zero-shot framework for rigid motion transfer from monocular source videos to single-view target images. Previous works typically employ geometric, generative, or simulation priors to guide the transfer proc…
STAvatar: Soft Binding and Temporal Density Control for Monocular 3D Head Avatars Reconstruction
Reconstructing high-fidelity and animatable 3D head avatars from monocular videos remains a challenging yet essential task. Existing methods based on 3D Gaussian Splatting typically bind Gaussians to mesh triangles and model deformations s…
Anisometropia in bilateral hyperopic refractive amblyopia requires eye patching
Uplifting Table Tennis: A Robust, Real-World Application for 3D Trajectory and Spin Estimation
Obtaining the precise 3D motion of a table tennis ball from standard monocular videos is a challenging problem, as existing methods trained on synthetic data struggle to generalize to the noisy, imperfect ball and table detections of the r…
Metric, inertially aligned monocular state estimation via kinetodynamic priors
Accurate state estimation for flexible robotic systems poses significant challenges, particularly for platforms with dynamically deforming structures that invalidate rigid-body assumptions. This paper tackles this problem and allows to exten…
MODEST: Multi-Optics Depth-of-Field Stereo Dataset
Reliable depth estimation under real optical conditions remains a core challenge for camera vision in systems such as autonomous robotics and augmented reality. Despite recent progress in depth estimation and depth-of-field rendering, rese…
DeLightMono: Enhancing Self-Supervised Monocular Depth Estimation in Endoscopy by Decoupling Uneven Illumination
Self-supervised monocular depth estimation serves as a key task in the development of endoscopic navigation systems. However, performance degradation persists due to uneven illumination inherent in endoscopic images, particularly in low-in…
Explaining Monocular Depth Estimation - Diving into Regional Differences in the Prediction
Recent foundational depth estimation models achieve impressive accuracy on various scenes. However, due to their black box nature, we lack knowledge about how they utilize the input for their predictions, and hence about their applicabilit…
Branch pruning alignment using small form factor monocular time-of-flight sensors -- Dataset
MetroGS: Efficient and Stable Reconstruction of Geometrically Accurate High-Fidelity Large-Scale Scenes
Recently, 3D Gaussian Splatting and its derivatives have achieved significant breakthroughs in large-scale scene reconstruction. However, how to efficiently and stably achieve high-quality geometric fidelity remains a core challenge. To ad…
IDEAL-M3D: Instance Diversity-Enriched Active Learning for Monocular 3D Detection
Monocular 3D detection relies on just a single camera and is therefore easy to deploy. Yet, achieving reliable 3D understanding from monocular images requires substantial annotation, and 3D labels are especially costly. To maximize perform…
Effects of eye closure on the spiking activity of human lateral geniculate neurons
The lateral geniculate nucleus (LGN) of the thalamus is a key link between the retina and visual cortex but our understanding of the properties of neurons in the human LGN is based on recordings in animal models. Here we recorded spiking a…
Multi-Agent Monocular Dense SLAM With 3D Reconstruction Priors
Monocular Simultaneous Localization and Mapping (SLAM) aims to estimate a robot's pose while simultaneously reconstructing an unknown 3D scene using a single camera. While existing monocular SLAM systems generate detailed 3D geometry throu…
MonoMSK: Monocular 3D Musculoskeletal Dynamics Estimation
Reconstructing biomechanically realistic 3D human motion - recovering both kinematics (motion) and kinetics (forces) - is a critical challenge. While marker-based systems are lab-bound and slow, popular monocular methods use oversimplified…
MonoSR: Open-Vocabulary Spatial Reasoning from Monocular Images
Spatial reasoning (SR), the ability to infer 3D spatial information from 2D inputs, is essential for real-world applications such as embodied AI and autonomous driving. However, existing research primarily focuses on indoor environments an…
LAA3D: A Benchmark of Detecting and Tracking Low-Altitude Aircraft in 3D Space
Perception of Low-Altitude Aircraft (LAA) in 3D space enables precise 3D object localization and behavior understanding. However, datasets tailored for 3D LAA perception remain scarce. To address this gap, we present LAA3D, a large-scale d…
DensifyBeforehand: LiDAR-assisted Content-aware Densification for Efficient and Quality 3D Gaussian Splatting
This paper addresses the limitations of existing 3D Gaussian Splatting (3DGS) methods, particularly their reliance on adaptive density control, which can lead to floating artifacts and inefficient resource usage. We propose a novel densify…
A CNN encoder for modal phase reconstruction in Adaptive Optics systems
Pyramid wavefront sensing (pWFS) offers some of the best sensitivity for adaptive optics, but suffers from strong non-linearity. Modulation can extend linear range at the expense of sensitivity. Operating the pyramid without modulation is …