Explanipedia

MimicParts: Part-aware Style Injection for Speech-Driven 3D Motion Generation

L. Liu, Yi He, Zhaojie Chu, Xiaofen Xing, Xiangmin Xu · 2025

Generating stylized 3D human motion from speech signals presents substantial challenges, primarily due to the intricate and fine-grained relationships among speech signals, individual styles, and the corresponding body movements. Current s…

CATCH: A Novel Data Synthesis Framework for High Therapy Fidelity and Memory-Driven Planning Chain of Thought in AI Counseling

Mengyi Chen, Jingkai Lin, Zhaojie Chu, Xiaofen Xing, Yirong Chen, et al. · 2025

Recently, advancements in AI counseling based on large language models have shown significant progress. However, existing studies employ a one-time generation approach to synthesize multi-turn dialogue samples, resulting in low therapy fid…

HVI-Based Spatial–Frequency-Domain Multi-Scale Fusion for Low-Light Image Enhancement

Yuhang Zhang, Huiying Zheng, Xiangmin Xu, Hancheng Zhu · 2025

Low-light image enhancement aims to restore images captured under extreme low-light conditions. Existing methods demonstrate that fusing Fourier transform magnitude and phase information within the RGB color space effectively improves enha…

HD-PPT: Hierarchical Decoding of Content- and Prompt-Preference Tokens for Instruction-based TTS

Sihang Nie, Jing Xing, Baiji Liu, Xiangmin Xu · 2025

Large Language Model (LLM)-based Text-to-Speech (TTS) models have already reached a high degree of naturalness. However, the precision control of TTS inference is still challenging. Although instruction-based Text-to-Speech (Instruct-TTS) …

Road Surface State Change Detection Based on Binocular Vision for Autonomous Driving System

Liangtian Zhao, Xiangmin Xu, Shanshan Pei, Siyu Chen, Xiyuan Hu, et al. · 2025

Road surface condition monitoring is crucial for enhancing transportation safety and efficiency, with applications in autonomous driving and urban infrastructure management. Existing methods often rely on single-camera setups or manual ins…

New “Quality” Driving New “Governance”: A Dual-Driven Approach to Rural Digital Governance from the Perspective of Adapting Production Relations

Xiangmin Xu · 2025

The innovative concept of “new quality productive forces” highlights technological innovation as the core driving force behind high-quality economic development. With the increasing empowerment of modern information technologies such as di…

Bilateral Collaboration with Large Vision-Language Models for Open Vocabulary Human-Object Interaction Detection

Yupeng Hu, Changxing Ding, Chang Sun, Shaoli Huang, Xiangmin Xu · 2025

Open vocabulary Human-Object Interaction (HOI) detection is a challenging task that detects all triplets of interest in an image, even those that are not pre-defined in the training set. Existing approaches typically rely on output feature…

S2SBench: A Benchmark for Quantifying Intelligence Degradation in Speech-to-Speech Large Language Models

Yuanbo Fang, Haoze Sun, Jun Liu, Tao Zhang, Zenan Zhou, et al. · 2025

End-to-end speech large language models (LLMs) extend the capabilities of text-based models to directly process and generate audio tokens. However, this often leads to a decline in reasoning and generation performance compared to text in…

HedgeAgents: A Balanced-aware Multi-agent Financial Trading System

Xiangyu Li, Yawen Zeng, Xiaofen Xing, Jin Xu, Xiangmin Xu · 2025

Early autism diagnosis based on path signature and Siamese unsupervised feature compressor

Zhuowen Yin, Xinyao Ding, Xin Zhang, Zhengwang Wu, Li Wang, et al. · 2025

Autism spectrum disorder has been emerging as a growing public health threat. Early diagnosis of autism spectrum disorder is crucial for timely, effective intervention and treatment. However, conventional diagnosis methods based on communi…

Revitalising Aging Oocytes: Echinacoside Restores Mitochondrial Function and Cellular Homeostasis Through Targeting GJA1/SIRT1 Pathway

Liuqing Yang, Xinle Lai, Fangxuan Lin, Nan Shi, Xiangmin Xu, et al. · 2025

As maternal age increases, the decline in oocyte quality emerges as a critical factor contributing to reduced reproductive capacity, highlighting the urgent need for effective strategies to combat oocyte aging. This study investigated the …

Transient Synchronization Stability in Grid-Following Converters: Mechanistic Insights and Technological Prospects—A Review

Yang Liu, Lin Zhu, Xiangmin Xu, Donglai Li, Zhiwei Liang, et al. · 2025

This paper investigates the transient synchronization stability mechanisms and technological advancements associated with grid-following (GFL) converters, providing a systematic review of the current research landscape and future direction…

Ion Transport Mechanism in the Sub-Nano Channels of Edge-Capping Modified Transition Metal Carbides/Nitride Membranes

Yinan Li, Xiangmin Xu, Xiaofeng Fang, Fang Li · 2025

Edge-capping modified MXene membranes with new channels created by lateral nanosheets are of great research significance. After introducing tripolyphosphate (STPP) to Ti edges of Ti3C2Tx nanosheets and fabricating the STPP-MXene membranes …

Wearable fall risk assessment by discriminating recessive weak foot individual

Zhen Song, Jianlin Ou, Shibin Wu, Lin Shu, Qi Fu, et al. · 2025

Background: Sensor-based technologies have been widely used in fall risk assessment. To enhance the model's robustness and reliability, it is crucial to analyze and discuss the factors contributing to the misclassification of certain indivi…

Spatial profiling of the interplay between cell type- and vision-dependent transcriptomic programs in the visual cortex

Fangming Xie, Saumya Jain, Runzhe Xu, Salwan Butrus, Zhiqun Tan, et al. · 2025

How early sensory experience during “critical periods” of postnatal life affects the organization of the mammalian neocortex at the resolution of neuronal cell types is poorly understood. We previously reported that the functional and mole…

Uncertainty-Aware Cross Entropy for Robust Learning with Noisy Labels

Lin Wang, Fang Liu, Xiaofen Xing, Xiangmin Xu, Kailing Guo, et al. · 2025

PsyDT: Using LLMs to Construct the Digital Twin of Psychological Counselor with Personalized Counseling Style for Psychological Counseling

Hui Xie, Yirong Chen, Xiaofen Xing, Jingkai Lin, Xiangmin Xu · 2025

PsyDT: Using LLMs to Construct the Digital Twin of Psychological Counselor with Personalized Counseling Style for Psychological Counseling

Hui Xie, Yirong Chen, Xiaofen Xing, Jingkai Lin, Xiangmin Xu · 2024

Currently, large language models (LLMs) have made significant progress in the field of psychological counseling. However, existing mental health LLMs overlook a critical issue where they do not consider the fact that different psychologica…

ViTGaze: gaze following with interaction features in vision transformers

Yuehao Song, Xinggang Wang, Jingfeng Yao, Wenyu Liu, J. L. Zhang, et al. · 2024

Gaze following aims to interpret human-scene interactions by predicting the person’s focal point of gaze. Prevailing approaches often adopt a two-stage framework, whereby multi-modality information is extracted in the initial stage for gaz…

Multi-Scale Temporal Transformer For Speech Emotion Recognition

Zhipeng Li, Xiaofen Xing, Yuanbo Fang, Weibin Zhang, Hengsheng Fan, et al. · 2024

Speech emotion recognition plays a crucial role in human-machine interaction systems. Recently various optimized Transformers have been successfully applied to speech emotion recognition. However, the existing Transformer architectures foc…

Online Multi-level Contrastive Representation Distillation for Cross-Subject fNIRS Emotion Recognition

Zhihui Lai, Chunmei Qing, Junpeng Tan, Wei‐Jie Luo, Xiangmin Xu · 2024

Utilizing functional near-infrared spectroscopy (fNIRS) signals for emotion recognition is a significant advancement in understanding human emotions. However, due to the lack of artificial intelligence data and algorithms in this field, cu…

The Application of Blockchain Technology in the Financial Field

Wei Fan, Yuhung Wang, Zicheng Wang, Xiangmin Xu · 2024

The advent of the digital age has made innovative technologies exceptionally important, and many research institutions and businesses are continuously increasing their investments in the field of new digital technologies. Blockchain, as one of…

RetrievalMMT: Retrieval-Constrained Multi-Modal Prompt Learning for Multi-Modal Machine Translation

Y Wang, Yawen Zeng, Junjie Liang, Xiaofen Xing, Jin Xu, et al. · 2024

As an extension of machine translation, the primary objective of multi-modal machine translation is to optimize the utilization of visual information. Technically, image information is integrated into multi-modal fusion and alignment as an…

Disentangled Pre-training for Human-Object Interaction Detection

Zhuolong Li, Xing’ao Li, Changxing Ding, Xiangmin Xu · 2024

Detecting human-object interaction (HOI) has long been limited by the amount of supervised data available. Recent approaches address this issue by pre-training according to pseudo-labels, which align object regions with HOI triplets parsed…

Texture-Preserving Diffusion Models for High-Fidelity Virtual Try-On

Yang Xu, Changxing Ding, Zhibin Hong, Junhao Huang, Tao Jin, et al. · 2024

Image-based virtual try-on is an increasingly important task for online shopping. It aims to synthesize images of a specific person wearing a specified garment. Diffusion model-based approaches have recently become popular, as they are exc…

Exploring 3D Human Pose Estimation and Forecasting from the Robot's Perspective: The HARPER Dataset

Andrea Toaiari, Federico Cunico, Xiangmin Xu, Haralambos Dafas, Alessandro Vinciarelli, et al. · 2024

We introduce HARPER, a novel dataset for 3D body pose estimation and forecast in dyadic interactions between users and Spot, the quadruped robot manufactured by Boston Dynamics. The key-novelty is the focus on the robot's perspective, i.e.…

ViTGaze: Gaze Following with Interaction Features in Vision Transformers

Yuehao Song, Xinggang Wang, Jingfeng Yao, Wenyu Liu, Jinglin Zhang, et al. · 2024

Gaze following aims to interpret human-scene interactions by predicting the person's focal point of gaze. Prevailing approaches often adopt a two-stage framework, whereby multi-modality information is extracted in the initial stage for gaz…

A joint brain extraction and image quality assessment framework for fetal brain MRI slices

Wenhao Zhang, Xin Zhang, Lingyi Li, Lufan Liao, Fenqiang Zhao, et al. · 2024

Brain extraction and image quality assessment are two fundamental steps in fetal brain magnetic resonance imaging (MRI) 3D reconstruction and quantification. However, the randomness of fetal position and orientation, the variability of fet…

A novel double-sided fabric strain sensor array fabricated with a facile and cost-effective process

Xiaobin Chen, Zhongliang Zhang, Lin Shu, Xiaoming Tao, Xiangmin Xu · 2024

Electronic textiles face challenges in fabricating stretchable, double-sided circuits with reliable interfaces. In this study, a double-sided strain sensor array was designed and prepared on an elastic fabric substrate by printing the sensi…

Xiangmin Xu