Duoqian Miao
Cross-Modal Distillation For Widely Differing Modalities
Deep learning has achieved great progress recently; however, it is not easy or efficient to further improve its performance by increasing the size of the model. Multi-modal learning can mitigate this challenge by introducing richer and more di…
Transformer-Based Person Search with High-Frequency Augmentation and Multi-Wave Mixing
The person search task aims to locate a target person within a set of scene images. In recent years, transformer-based models in this field have made some progress. However, they still face three primary challenges: 1) the self-attention m…
Enhancing Text-to-Image Diffusion Transformer via Split-Text Conditioning
Current text-to-image diffusion generation typically employs complete-text conditioning. Due to the intricate syntax, diffusion transformers (DiTs) inherently suffer from a comprehension defect of complete-text captions. One-fly complete-t…
MindTuner: Cross-Subject Visual Decoding with Visual Fingerprint and Semantic Correction
Decoding natural visual scenes from brain activity has flourished, with extensive research in single-subject tasks but less in cross-subject tasks. Reconstructing high-quality images in cross-subject tasks is a challenging proble…
COSEE: Consistency-Oriented Signal-Based Early Exiting via Calibrated Sample Weighting Mechanism
Early exiting is an effective paradigm for improving the inference efficiency of pre-trained language models (PLMs) by dynamically adjusting the number of executed layers for each sample. However, in most existing works, easy and hard samp…
Two‐Stage Early Exiting From Globality Towards Reliability
Early exiting has shown significant potential in accelerating the inference of pre‐trained language models (PLMs) by allowing easy samples to exit from shallow layers. However, existing early exiting methods primarily rely on local informa…
Wills Aligner: Multi-Subject Collaborative Brain Visual Decoding
Decoding visual information from human brain activity has seen remarkable advancements in recent research. However, the diversity in cortical parcellation and fMRI patterns across individuals has prompted the development of deep learning m…
Real-Time Semantic Segmentation of Road Scenes via Hybrid Dilated Grouping Network
Authors: Yan Zhang (1), Xuguang Zhang (1,*), Deting Miao (1), and Hui Yu (2). Affiliation (1): School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018, Zhej…
Flame Image Classification Based on Deep Learning and Three-Way Decision-Making
The classification and recognition of flame images play an important role in avoiding forest fires. Deep learning technology has shown good performance in flame image recognition tasks. In order to further improve the accuracy of classific…
MindSimulator: Exploring Brain Concept Localization via Synthetic FMRI
Concept-selective regions within the human cerebral cortex exhibit significant activation in response to specific visual stimuli associated with particular concepts. Precisely localizing these regions stands as a crucial long-term goal in …
Menet: Camouflaged Object Detection with Boundary Localization in Complex Backgrounds
DiaDP@XLLM25: Advancing Chinese Dialogue Parsing via Unified Pretrained Language Models and Biaffine Dependency Scoring
Unveiling user identity across social media: a novel unsupervised gradient semantic model for accurate and efficient user alignment
The field of social network analysis has identified User Alignment (UA) as a crucial area of investigation. The objective of UA is to identify and connect user accounts across diverse social networks, even when there are no explicit interc…
NeuroClips: Towards High-fidelity and Smooth fMRI-to-Video Reconstruction
Reconstruction of static visual stimuli from non-invasive brain activity fMRI has achieved great success, owing to advanced deep learning models such as CLIP and Stable Diffusion. However, the research on fMRI-to-video reconstruction remains …
FedMinds: Privacy-Preserving Personalized Brain Visual Decoding
Exploring the mysteries of the human brain is a long-term research topic in neuroscience. With the help of deep learning, decoding visual information from human brain activity fMRI has achieved promising performance. However, these decodin…
Aspect-Guided Multi-Graph Convolutional Networks for Aspect-based Sentiment Analysis
MLIP: Efficient Multi-Perspective Language-Image Pretraining with Exhaustive Data Utilization
Contrastive Language-Image Pretraining (CLIP) has achieved remarkable success, leading to rapid advancements in multimodal studies. However, CLIP faces a notable challenge in terms of inefficient data utilization. It relies on a single con…
Deep multi-view graph clustering with incomplete views
Deep multi-view graph clustering has made good progress in solving large-scale problems. However, existing deep multi-view graph clustering methods suffer from the following issues: (1) How to combine data processing with multi-view cluste…
HyDiscGAN: A Hybrid Distributed cGAN for Audio-Visual Privacy Preservation in Multimodal Sentiment Analysis
Multimodal Sentiment Analysis (MSA) aims to identify speakers' sentiment tendencies in multimodal video content, raising serious concerns about privacy risks associated with multimodal data, such as voiceprints and facial images. Recent di…
Frequency Spectrum Is More Effective for Multimodal Representation and Fusion: A Multimodal Spectrum Rumor Detector
Multimodal content, such as mixing text with images, presents significant challenges to rumor detection in social media. Existing multimodal rumor detection has focused on mixing tokens among spatial and sequential locations for unimodal r…
Multi‐granularity feature enhancement network for maritime ship detection
Due to the characteristics of high resolution and rich texture information, visible light images are widely used for maritime ship detection. However, these images are susceptible to sea fog and ships of different sizes, which can result i…
DE$^3$-BERT: Distance-Enhanced Early Exiting for BERT based on Prototypical Networks
Early exiting has demonstrated its effectiveness in accelerating the inference of pre-trained language models like BERT by dynamically adjusting the number of layers executed. However, most existing early exiting methods only consider loca…
Conflict Analysis Triggered by Three-Way Decision and Pythagorean Fuzzy Rough Set
Conflict is ubiquitous in human society and has a profound impact on various fields such as the economy, politics, law, and military. Many scholars have focused on exploring the internal mechanisms and potential solutions to conflicts. Not…
Multi-Granularity Detector for Enhanced Small Object Detection Under Sample Imbalance
Multimodal Federated Learning with Missing Modality via Prototype Mask and Contrast
In real-world scenarios, multimodal federated learning often faces the practical challenge of intricate modality missing, which poses constraints on building federated frameworks and significantly degrades model inference accuracy. Existin…