Shiyu Xuan
Diff-MM: Exploring Pre-trained Text-to-Image Generation Model for Unified Multi-modal Object Tracking
Multi-modal object tracking integrates auxiliary modalities such as depth, thermal infrared, event flow, and language to provide additional information beyond RGB images, showing great potential in improving tracking stabilization in compl…
Dataset Distillation for Histopathology Image Classification
Deep neural networks (DNNs) have exhibited remarkable success in the field of histopathology image analysis. On the other hand, the contemporary trend of employing large models and extensive datasets has underscored the significance of dat…
LocLLM: Exploiting Generalizable Human Keypoint Localization via Large Language Model
The capacity of existing human keypoint localization models is limited by keypoint priors provided by the training data. To alleviate this restriction and pursue a more general model, this work studies keypoint localization from a different …
Decoupled Optimisation for Long-Tailed Visual Recognition
When training on a long-tailed dataset, conventional learning algorithms tend to exhibit a bias towards classes with a larger sample size. Our investigation has revealed that this biased learning tendency originates from the model paramete…
Decoupled Contrastive Learning for Long-Tailed Recognition
Supervised Contrastive Loss (SCL) is popular in visual representation learning. Given an anchor image, SCL pulls two types of positive samples, i.e., its augmentation and other images from the same class, together, while pushing negative ima…
Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs
Multi-modal Large Language Models (MLLMs) have shown remarkable capabilities in various multi-modal tasks. Nevertheless, their performance in fine-grained image understanding tasks is still limited. To address this issue, this paper propos…
Intra-Inter Camera Similarity for Unsupervised Person Re-Identification
Most unsupervised person Re-Identification (Re-ID) works produce pseudo-labels by measuring the feature similarity without considering the distribution discrepancy among cameras, leading to degraded accuracy in label computation across …
Siamese networks with distractor-reduction method for long-term visual object tracking
Many trackers that divide the tracking process into two stages have recently been proposed to solve the problem of long-term tracking. Their outstanding performance has made them one of the mainstream approaches to long-term tracking.…
Object Tracking in Satellite Videos by Improved Correlation Filters With Motion Estimations
As a new method of Earth observation, video satellites are capable of monitoring specific events on the Earth's surface continuously by providing high-temporal-resolution remote sensing images. The video observations enable a variety of new …