Yu‐Kun Lai
Single-Image 3D Human Reconstruction with 3D-Aware Diffusion Priors and Facial Enhancement
DDBot: Differentiable Physics-based Digging Robot for Unknown Granular Materials
Automating the manipulation of granular materials poses significant challenges due to complex contact dynamics, unpredictable material properties, and intricate system states. Existing approaches often fail to achieve efficiency and accura…
Differentiable Skill Optimisation for Powder Manipulation in Laboratory Automation
Robotic automation is accelerating scientific discovery by reducing manual effort in laboratory workflows. However, precise manipulation of powders remains challenging, particularly in tasks such as transport that demand accuracy and stabi…
MirrorSAM2: Segment Mirror in Videos with Depth Perception
This paper presents MirrorSAM2, the first framework that adapts Segment Anything Model 2 (SAM2) to the task of RGB-D video mirror segmentation. MirrorSAM2 addresses key challenges in mirror detection, such as reflection ambiguity and textu…
DyCrowd: Towards Dynamic Crowd Reconstruction from a Large-scene Video
3D reconstruction of dynamic crowds in large scenes has become increasingly important for applications such as city surveillance and crowd analysis. However, current works attempt to reconstruct 3D crowds from a static image, causing a lac…
FRNeRF: Fusion and Regularization Fields for Dynamic View Synthesis
Novel space-time view synthesis for monocular video is a highly challenging task: both static and dynamic objects usually appear in the video, but only a single view of the current scene is available, resulting in inaccurate synthesis resu…
Navigating the tightrope: The art of balancing for better performance
Effective supervisory mechanisms enable both monitoring and regulation of employee behavior, ensuring behavioral compliance and performance stability. However, continuous supervision can increase employees' psychological arousal, especiall…
Differentiable physics-based system identification for robotic manipulation of elastoplastic materials
Robotic manipulation of volumetric elastoplastic deformable materials, from foods such as dough to construction materials like clay, is in its infancy, largely due to the difficulty of modelling and perception in a high-dimensional space. …
NeRFFaceShop: Learning a Photo-Realistic 3D-Aware Generative Model of Animatable and Relightable Heads From Large-Scale in-the-Wild Videos
Animatable and relightable 3D facial generation has fundamental applications in computer vision and graphics. Although animation and relighting are highly correlated, previous methods usually address them separately. Effectively combining …
Skeletonization Quality Evaluation: Geometric Metrics for Point Cloud Analysis in Robotics
Skeletonization is a powerful tool for shape analysis, rooted in the inherent instinct to understand an object's morphology. It has found applications across various domains, including robotics. Although skeletonization algorithms have bee…
The Double-Edged Effects of Work Task Stress on Safety Performance: A Cognitive Appraisal Perspective
Temporal Inconsistency Guidance for Super-resolution Video Quality Assessment
As super-resolution (SR) techniques introduce unique distortions that fundamentally differ from those caused by traditional degradation processes (e.g., compression), there is an increasing demand for specialized video quality assessment (…
NeRF-Texture: Synthesizing Neural Radiance Field Textures
Texture synthesis is a fundamental problem in computer graphics that would benefit various applications. Existing methods are effective in handling 2D image textures. In contrast, many real-world textures contain meso-structure in the 3…
DualAvatar: Robust Gaussian Splatting Avatar with Dual Representation
Real-time Large-scale Deformation of Gaussian Splatting
Neural implicit representations, including Neural Distance Fields and Neural Radiance Fields, have demonstrated significant capabilities for reconstructing surfaces with complicated geometry and topology, and generating novel views of a sc…
Real-time 3D Human Reconstruction and Rendering System from a Single RGB Camera
Transforming 2D human images into 3D appearance is essential for immersive communication. In this paper, we introduce a low-cost real-time 3D human reconstruction and rendering system with a single RGB camera at 28+ FPS, which guarantees b…
Differentiable Physics-based System Identification for Robotic Manipulation of Elastoplastic Materials
Robotic manipulation of volumetric elastoplastic deformable materials, from foods such as dough to construction materials like clay, is in its infancy, largely due to the difficulty of modelling and perception in a high-dimensional space. …
SceneExpander: Real-Time Scene Synthesis for Interactive Floor Plan Editing
Scene synthesis has gained significant attention recently, and interactive scene synthesis focuses on yielding scenes according to user preferences. Existing literature either generates floor plans or scenes according to the floor plans. T…
Real-time 3D-aware Portrait Video Relighting
Synthesizing realistic videos of talking faces under custom lighting conditions and viewing angles benefits various downstream applications like video conferencing. However, most existing relighting methods are either time-consuming or una…
AttentionPainter: An Efficient and Adaptive Stroke Predictor for Scene Painting
Stroke-based Rendering (SBR) aims to decompose an input image into a sequence of parameterized strokes, which can be rendered into a painting that resembles the input image. Recently, Neural Painting methods that utilize deep learning and …
FilterGNN: Image feature matching with cascaded outlier filters and linear attention
The cross-view matching of local image features is a fundamental task in visual localization and 3D reconstruction. This study proposes FilterGNN, a transformer-based graph neural network (GNN), aiming to improve the matching efficiency an…
HumanCoser: Layered 3D Human Generation via Semantic-Aware Diffusion Model
This paper aims to generate physically-layered 3D humans from text prompts. Existing methods either generate 3D clothed humans as a whole or support only tight and simple clothing generation, which limits their applications to virtual try-…
Two-stage deep neural network for diagnosing fungal keratitis via in vivo confocal microscopy images
Timely and effective diagnosis of fungal keratitis (FK) is necessary for suitable treatment and avoiding irreversible vision loss for patients. In vivo confocal microscopy (IVCM) has been widely adopted to guide the FK diagnosis. We presen…
Generating animatable 3D cartoon faces from single portraits
RecStitchNet: Learning to stitch images with rectangular boundaries
Irregular boundaries in image stitching naturally occur due to freely moving cameras. To deal with this problem, existing methods focus on optimizing mesh warping to make boundaries regular using the traditional explicit solution. However,…
SketchDream: Sketch-based Text-To-3D Generation and Editing
Existing text-based 3D generation methods generate attractive results but lack detailed geometry control. Sketches, known for their conciseness and expressiveness, have contributed to intuitive 3D modeling but are confined to producing tex…
4Dynamic: Text-to-4D Generation with Hybrid Priors
Due to the fascinating generative performance of text-to-image diffusion models, growing text-to-3D generation works explore distilling the 2D generative priors into 3D, using the score distillation sampling (SDS) loss, to bypass the data …
VRMM: A Volumetric Relightable Morphable Head Model
In this paper, we introduce the Volumetric Relightable Morphable Model (VRMM), a novel volumetric and parametric facial prior for 3D face modeling. While recent volumetric prior models offer improvements over traditional methods like 3D Mo…
Fusion of Short-term and Long-term Attention for Video Mirror Detection
Techniques for detecting mirrors from static images have witnessed rapid growth in recent years. However, these methods detect mirrors from single input images. Detecting mirrors from video requires further consideration of temporal consis…
SuperSVG: Superpixel-based Scalable Vector Graphics Synthesis
SVG (Scalable Vector Graphics) is a widely used graphics format that possesses excellent scalability and editability. Image vectorization, which aims to convert raster images to SVGs, is an important yet challenging problem in computer vis…