Suya You
Descrip3D: Enhancing Large Language Model-based 3D Scene Understanding with Object-Level Text Descriptions
Understanding 3D scenes goes beyond simply recognizing objects; it requires reasoning about the spatial and semantic relationships between them. Current 3D scene-language models often struggle with this relational understanding, particular…
AllTracker: Efficient Dense Point Tracking at High Resolution
We introduce AllTracker: a model that estimates long-range point tracks by way of estimating the flow field between a query frame and every other frame of a video. Unlike existing point tracking methods, our approach delivers high-resoluti…
Green Video Camouflaged Object Detection
Camouflaged object detection (COD) aims to distinguish hidden objects embedded in an environment highly similar to the object. Conventional video-based COD (VCOD) methods explicitly extract motion cues or employ complex deep learning netwo…
Temporally Consistent Dynamic Scene Graphs: An End-to-End Approach for Action Tracklet Generation
Understanding video content is pivotal for advancing real-world applications like activity recognition, autonomous systems, and human-computer interaction. While scene graphs are adept at capturing spatial relationships between objects in …
GreenCOD: A Green Camouflaged Object Detection Method
We introduce GreenCOD, a green method for detecting camouflaged objects, distinct in its avoidance of backpropagation techniques. GreenCOD leverages gradient boosting and deep features extracted from pre-trained Deep Neural Networks. Tradi…
An Aerial Photogrammetry Benchmark Dataset for Point Cloud Segmentation and Style Translation
The recent surge in diverse 3D datasets spanning various scales and applications marks a significant advancement in the field. However, the comprehensive process of data acquisition, refinement, and annotation at a large scale poses a form…
Unsupervised Green Object Tracker (GOT) without Offline Pre-training
Supervised trackers trained on labeled data dominate the single object tracking field for superior tracking accuracy. The labeling cost and the huge computational complexity hinder their applications on edge devices. Unsupervised learning …
Efficient Human-Object-Interaction (EHOI) Detection via Interaction Label Coding and Conditional Decision
Human-Object Interaction (HOI) detection is a fundamental task in image understanding. While deep-learning-based HOI methods provide high performance in terms of mean Average Precision (mAP), they are computationally expensive and opaque i…
GreenCOD: A Green Camouflaged Object Detection Method
We introduce GreenCOD, a green method for detecting camouflaged objects, distinct in its avoidance of backpropagation techniques. GreenCOD leverages gradient boosting and deep features extracted from pre-trained Deep Neural Networks (DNNs)…
DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting
The increasing demand for virtual reality applications has highlighted the significance of crafting immersive 3D assets. We present a text-to-3D 360$^{\circ}$ scene generation pipeline that facilitates the creation of comprehensive 360$^{\…
Feature 3DGS: Supercharging 3D Gaussian Splatting to Enable Distilled Feature Fields
3D scene representations have gained immense popularity in recent years. Methods that use Neural Radiance fields are versatile for traditional tasks such as novel view synthesis. In recent times, some work has emerged that aims to extend t…
PrObeD: Proactive Object Detection Wrapper
Previous research in $2D$ object detection focuses on various tasks, including detecting objects in generic and camouflaged images. These works are regarded as passive works for object detection as they take the input image as is. However,…
BLoad: Enhancing Neural Network Training with Efficient Sequential Data Handling
The increasing complexity of modern deep neural network models and the expanding sizes of datasets necessitate the development of optimized and scalable training methods. In this white paper, we addressed the challenge of efficiently train…
SemST: Semantically Consistent Multi-Scale Image Translation via Structure-Texture Alignment
Unsupervised image-to-image (I2I) translation learns cross-domain image mapping that transfers input from the source domain to output in the target domain while preserving its semantics. One challenge is that different semantic statistics …
SUMMIT: Source-Free Adaptation of Uni-Modal Models to Multi-Modal Targets
Scene understanding using multi-modal data is necessary in many applications, e.g., autonomous navigation. To achieve this in a variety of situations, existing models must be able to adapt to shifting data distributions without arduous dat…
Lightweight Quality Evaluation of Generated Samples and Generative Models
Although there are metrics to evaluate the performance of generative models, little research is conducted on the quality evaluation of individual generated samples. A lightweight generated sample quality evaluation (LGSQE) method is propos…
An Online Continuous Semantic Segmentation Framework With Minimal Labeling Efforts
The annotation load for a new dataset has been greatly decreased using domain adaptation based semantic segmentation, which iteratively constructs pseudo labels on unlabeled target data and retrains the network. However, realistic segmenta…
Unsupervised Synthetic Image Refinement via Contrastive Learning and Consistent Semantic-Structural Constraints
Ensuring the realism of computer-generated synthetic images is crucial to deep neural network (DNN) training. Due to different semantic distributions between synthetic and real-world captured datasets, there exists semantic mismatch betwee…
A Study on Improving Realism of Synthetic Data for Machine Learning
Synthetic-to-real data translation using generative adversarial learning has achieved significant success in improving synthetic data. Yet, limited studies focus on deep evaluation and comparison of adversarial training on general-purpose …
TransUPR: A Transformer-based Uncertain Point Refiner for LiDAR Point Cloud Semantic Segmentation
Common image-based LiDAR point cloud semantic segmentation (LiDAR PCSS) approaches have bottlenecks resulting from the boundary-blurring problem of convolution neural networks (CNNs) and quantitation loss of spherical projection. In this w…
Frequency-domain Learning for Volumetric-based 3D Data Perception
Frequency-domain learning draws attention due to its superior tradeoff between inference accuracy and input data size. Frequency-domain learning in 2D computer vision tasks has shown that 2D convolutional neural networks (CNN) have a stati…
DDS: Decoupled Dynamic Scene-Graph Generation Network
Scene-graph generation involves creating a structural representation of the relationships between objects in a scene by predicting subject-object-relation triplets from input data. Existing methods show poor performance in detecting triple…
ALTO: Alternating Latent Topologies for Implicit 3D Reconstruction
This work introduces alternating latent topologies (ALTO) for high-fidelity reconstruction of implicit 3D surfaces from noisy point clouds. Previous work identifies that the spatial arrangement of latent encodings is important to recover d…
Enhanced Low-resolution LiDAR-Camera Calibration Via Depth Interpolation and Supervised Contrastive Learning
Motivated by the increasing application of low-resolution LiDAR recently, we target the problem of low-resolution LiDAR-camera calibration in this work. The main challenges are two-fold: sparsity and noise in point clouds. To address the p…
LGSQE: Lightweight Generated Sample Quality Evaluation
Despite prolific work on evaluating generative models, little research has been done on the quality evaluation of an individual generated sample. To address this problem, a lightweight generated sample quality evaluation (LGSQE) method is …
UHP-SOT++: An Unsupervised Lightweight Single Object Tracker
An enhanced version of UHP-SOT called UHP-SOT++ is proposed for unsupervised, lightweight and high-performance single object tracking in this work. Both UHP-SOT and UHP-SOT++ exploit the discriminative-correlation-filters-based (DCF-based)…