Sungmin Eum
MoRe: Monocular Geometry Refinement via Graph Optimization for Cross-View Consistency
Monocular 3D foundation models offer an extensible solution for perception tasks, making them attractive for broader 3D vision applications. In this paper, we propose MoRe, a training-free Monocular Geometry Refinement method designed to i…
UAV4D: Dynamic Neural Rendering of Human-Centric UAV Imagery using Gaussian Splatting
Despite significant advancements in dynamic neural rendering, existing methods fail to address the unique challenges posed by UAV-captured scenarios, particularly those involving monocular camera setups, top-down perspective, and multiple …
UAVTwin: Neural Digital Twins for UAVs using Gaussian Splatting
We present UAVTwin, a method for creating digital twins from real-world environments and facilitating data augmentation for training downstream models embedded in unmanned aerial vehicles (UAVs). Specifically, our approach focuses on synth…
AutoComPose: Automatic Generation of Pose Transition Descriptions for Composed Pose Retrieval Using Multimodal LLMs
Composed pose retrieval (CPR) enables users to search for human poses by specifying a reference pose and a transition description, but progress in this field is hindered by the scarcity and inconsistency of annotated pose transitions. Exis…
Negative Samples are at Large: Leveraging Hard-distance Elastic Loss for Re-identification
We present a Momentum Re-identification (MoReID) framework that can leverage a very large number of negative samples in training for the general re-identification task. The design of this framework is inspired by Momentum Contrast (MoCo), whic…
MA3: Model-Accuracy Aware Anytime Planning with Simulation Verification for Navigating Complex Terrains
Off-road and unstructured environments often contain complex patches of various types of terrain, rough elevation changes, deformable objects, etc. An autonomous ground vehicle traversing such environments experiences physical interactions…
Exploring Cross-Domain Pretrained Model for Hyperspectral Image Classification
A pretrain-finetune strategy is widely used to reduce the overfitting that can occur when data is insufficient for CNN training. The first few layers of a CNN pretrained on a large-scale RGB dataset are capable of acquiring general image chara…
Sketch-and-Fill Network for Semantic Segmentation
Recent efforts in semantic segmentation using deep learning frameworks have made notable advances. While achieving high performance, however, they often require heavy computation, making them impractical to be used in real-world application…
S-DOD-CNN: Doubly Injecting Spatially-Preserved Object Information for Event Recognition
We present a novel event recognition approach called Spatially-preserved Doubly-injected Object Detection CNN (S-DOD-CNN), which incorporates the spatially preserved object detection information in both a direct and an indirect way. Indire…
ME R-CNN: Multi-Expert R-CNN for Object Detection
We introduce Multi-Expert Region-based Convolutional Neural Network (ME R-CNN) which is equipped with multiple experts (ME) where each expert is learned to process a certain type of regions of interest (RoIs). This architecture better capt…
Is Pretraining Necessary for hyperspectral image classification?
We address two questions for training a convolutional neural network (CNN) for hyperspectral image classification: i) is it possible to build a pre-trained network? and ii) is the pre-training effective in furthering the performance? To an…
Semantics to Space(S2S): Embedding semantics into spatial space for zero-shot verb-object query inferencing
We present a novel deep zero-shot learning (ZSL) model for inferencing human-object-interaction with verb-object (VO) query. While the previous two-stream ZSL approaches only use the semantic/textual information to be fed into the query st…
DOD-CNN: Doubly-injecting Object Information for Event Recognition
Recognizing an event in an image can be enhanced by detecting relevant objects in two ways: 1) indirectly utilizing object detection information within the unified architecture or 2) directly making use of the object detection output resul…
Cross-Domain CNN for Hyperspectral Image Classification
In this paper, we address the dataset scarcity issue in hyperspectral image classification. As only a few thousand pixels are available for training, it is difficult to effectively learn high-capacity Convolutional Neural Network…
Object and Text-guided Semantics for CNN-based Activity Recognition
Many previous methods have demonstrated the importance of considering semantically relevant objects for carrying out video-based human activity recognition, yet none of the methods have harvested the power of large text corpora to relate t…
Exploitation of Semantic Keywords for Malicious Event Classification
Learning an event classifier is challenging when the scenes are semantically different but visually similar. However, as humans, we typically handle such tasks painlessly by adding our background semantic knowledge. Motivated by this obser…
ME R-CNN: Multi-Expert Region-based CNN for Object Detection.
Recent CNN-based object detection methods have drastically improved their performances but still use a single classifier as opposed to "multiple experts" in categorizing objects. The main motivation of introducing multi-experts is twofold: i) to allow diff…
IOD-CNN: Integrating Object Detection Networks for Event Recognition
Many previous methods have shown the importance of considering semantically relevant objects for performing event recognition, yet none of the methods have exploited the power of deep convolutional neural networks to directly integrate re…
Image and Video Analytics for Document Processing and Event Recognition
The proliferation of handheld devices with cameras is among many changes in the past several decades which affected the document image analysis community by providing a far less constrained document imaging experience compared to tradition…