Explanipedia

ICE: Intercede Concept Erasure in Text-to-Image Diffusion Models

Yizhou Lin, Nisha Huang, Kaer Huang, H. M. Liu, Yiqiang Yan, et al. · 2025

Separate to Collaborate: Dual-Stream Diffusion Model for Coordinated Piano Hand Motion Synthesis

Zihao Liu, Mei Ou, Zunnan Xu, Jiaqi Huang, Haonan Han, et al. · 2025

A Motion is Worth a Hybrid Sentence: Taming Language Model for Unified Motion Generation by Fine-grained Planning

Ronghui Li, Lingxiao Han, Shi Shu, Yin-Long Liu, Yukang Lin, et al. · 2025

ASPO: Asymmetric Importance Sampling Policy Optimization

Jihong Wang, Runze Liu, Lin Lei, Wenping Hu, Xiu Li, et al. · 2025

Recent Large Language Model (LLM) post-training methods rely on token-level clipping mechanisms during Reinforcement Learning (RL). However, we identify a fundamental flaw in this Outcome-Supervised RL (OSRL) paradigm: the Importance Sampl…

Attention as a Compass: Efficient Exploration for Process-Supervised RL in Reasoning Models

Runze Liu, Jiakang Wang, Yuling Shi, Zhihui Xie, Chenxin An, et al. · 2025

Reinforcement Learning (RL) has shown remarkable success in enhancing the reasoning capabilities of Large Language Models (LLMs). Process-Supervised RL (PSRL) has emerged as a more effective paradigm compared to outcome-based RL. However, …

Mycobacterium tuberculosis infection status and associated factors among household close contacts of rifampicin-resistant pulmonary tuberculosis patients: A single-center cross-sectional study

Zhengyu Shi, Juan Peng, Xiu Li, Xiaoyan Fu, Li‐Ping Zou, et al. · 2025

Reversible Authentication Watermarking Based on Improved 2D Histogram and Adaptive Difference Expansion

Zhengwei Zhang, Xiu Li, Hao Yue, Fenfen Li · 2025

To address the limitations of low authentication accuracy and ineffective protection for complex-texture images/regions in existing reversible schemes, an improved algorithm based on two-dimensional (2D) histogram and difference expansion …

Enhancing Online Video Recommendation via a Coarse-to-fine Dynamic Uplift Modeling Framework

Chang Meng, Chenhao Zhai, X. Wang, Shuchang Liu, Xiaoqiang Feng, et al. · 2025

Align-Then-stEer: Adapting the Vision-Language Action Models through Unified Latent Guidance

Yang Zhang, Chenwei Wang, Ösürmelioğlu, Yuan Zhao, Yunfei Ge, et al. · 2025

Vision-Language-Action (VLA) models pre-trained on large, diverse datasets show remarkable potential for general-purpose robotic manipulation. However, a primary bottleneck remains in adapting these models to downstream tasks, especially w…

One policy to rule them all: Handling multiple emergent accidents in nuclear power plants with ensemble-based behavior cloning

Aicheng Gong, Mengbei Yan, Shengjie Sun, Kaihe Kong, Jiafei Lyu, et al. · 2025

Development of a multi-indicator risk prediction model for cervical cancer associated with benzo[a]pyrene and nicotine exposure: A multi-omics study integrating toxicological analyses and molecular docking

Ning Li, Xiu Li, Yating Xu, Yihan Zhang, Yu Si, et al. · 2025

Risk prediction models based on multi-omics data and machine learning algorithms provide potential reference targets for prognosis prediction and personalised treatment of cervical cancer patients. The results of this study provide importa…

S$^2$-Guidance: Stochastic Self Guidance for Training-Free Enhancement of Diffusion Models

Chao-ju Chen, Jianbo Zhu, Feng Xu, Nisha Huang, Mengqiang Wu, et al. · 2025

Classifier-free Guidance (CFG) is a widely used technique in modern diffusion models for enhancing sample quality and prompt adherence. However, through an empirical analysis on Gaussian mixture modeling with a closed-form solution, we obs…

X-NeMo: Expressive Neural Motion Reenactment via Disentangled Latent Attention

Xiaochen Zhao, Hongyi Xu, Guoxian Song, You Xie, Chenxu Zhang, et al. · 2025

We propose X-NeMo, a novel zero-shot diffusion-based portrait animation pipeline that animates a static portrait using facial movements from a driving video of a different individual. Our work first identifies the root causes of the key is…

Advancing Financial Engineering with Foundation Models: Progress, Applications, and Challenges

Liyuan Chen, Shuoling Liu, Jiangpeng Yan, Xiaoyu Wang, Henglin Liu, et al. · 2025

The advent of foundation models (FMs), large-scale pre-trained models with strong generalization capabilities, has opened new frontiers for financial engineering. While general-purpose FMs such as GPT-4 and Gemini have demonstrated promi…

Segment Concealed Objects With Incomplete Supervision

Chunming He, Kai Li, Yachao Zhang, Ziyun Yang, Youwei Pang, et al. · 2025

Incompletely-Supervised Concealed Object Segmentation (ISCOS) involves segmenting objects that seamlessly blend into their surrounding environments, utilizing incompletely annotated data, such as weak and semi-annotations, for model traini…

SAM-R1: Leveraging SAM for Reward Feedback in Multimodal Segmentation via Reinforcement Learning

Jiaqi Huang, Zunnan Xu, Jun Zhou, Ting Liu, Yicheng Xiao, et al. · 2025

Leveraging multimodal large models for image segmentation has become a prominent research direction. However, existing approaches typically rely heavily on manually annotated datasets that include explicit reasoning processes, which are co…

Large language model-based multimodal system for detecting and grading ocular surface diseases from smartphone images

Zhongwen Li, Zhouqian Wang, Xiu Li, Pengfei Zhang, Wenfang Wang, et al. · 2025

Background: The development of medical artificial intelligence (AI) models is primarily driven by the need to address healthcare resource scarcity, particularly in underserved regions. Proposing an affordable, accessible, interpretable, and…

CreativeSynth: Cross-Art-Attention for Artistic Image Synthesis With Multimodal Diffusion

Nisha Huang, Weiming Dong, Yuxin Zhang, Fan Tang, Ronghui Li, et al. · 2025

Although remarkable progress has been made in image style transfer, style is just one of the components of artistic paintings. Directly transferring extracted style features to natural images often results in outputs with obvious synthetic…

InterAnimate: Taming Region-aware Diffusion Model for Realistic Human Interaction Animation

Yukang Lin, Yan Hong, Zhiwei Xu, Xiuwen Li, Chao Xu, et al. · 2025

Recent video generation research has focused heavily on isolated actions, leaving interactive motions, such as hand-face interactions, largely unexamined. These interactions are essential for emerging biometric authentication systems, which …

Separate to Collaborate: Dual-Stream Diffusion Model for Coordinated Piano Hand Motion Synthesis

Zihao Liu, Mei Ou, Zunnan Xu, Jia‐Qi Huang, Haonan Han, et al. · 2025

Automating the synthesis of coordinated bimanual piano performances poses significant challenges, particularly in capturing the intricate choreography between the hands while preserving their distinct kinematic signatures. In this paper, w…

Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation

Jiaqi Huang, Zunnan Xu, Ting Liu, Yong Liu, Haonan Han, et al. · 2025

In the domain of computer vision, Parameter-Efficient Tuning (PET) is increasingly replacing the traditional paradigm of pre-training followed by full fine-tuning. PET is particularly favored for its effectiveness in large foundation model…

Combinatorial Optimization Perspective based Framework for Multi-behavior Recommendation

Chenhao Zhai, Chang Meng, Yu Yang, Kexin Zhang, Xuhao Zhao, et al. · 2025

IQPFR: An Image Quality Prior for Blind Face Restoration and Beyond

Peng Hu, Chunming He, Lei Xu, Jingduo Tian, Sina Farsiu, et al. · 2025

Blind Face Restoration (BFR) addresses the challenge of reconstructing degraded low-quality (LQ) facial images into high-quality (HQ) outputs. Conventional approaches predominantly rely on learning feature representations from ground-truth…

Multi-Omics Analysis Revealed That TAOK1 Can Be Used as a Prognostic Marker and Target in a Variety of Tumors, Especially in Cervical Cancer

Ning Li, Xiu Li, Yating Xu, Yu Si, Hongting Zhao, et al. · 2025

TAOK1 serves as a promising prognostic biomarker and potential therapeutic target, especially for cervical cancer. These results support its clinical potential in cancer prognosis and treatment strategies.

LETSmix: a spatially informed and learning-based domain adaptation method for cell-type deconvolution in spatial transcriptomics

Yangen Zhan, Yongbing Zhang, Zheqi Hu, Yifeng Wang, Zirui Zhu, et al. · 2025

ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation

Yunfei Pu, Yiming Zhao, Zhicong Tang, Ruihong Yin, H.Q. Ye, et al. · 2025

Multi-layer image generation is a fundamental task that enables users to isolate, select, and edit specific image layers, thereby revolutionizing interactions with generative models. In this paper, we introduce the Anonymous Region Transfo…

Diffusion Models in Low-Level Vision: A Survey

Chunming He, Yuqi Shen, Chengyu Fang, Fengyang Xiao, Longxiang Tang, et al. · 2025

Deep generative models have gained considerable attention in low-level vision tasks due to their powerful generative capabilities. Among these, diffusion model-based approaches, which employ a forward diffusion process to degrade an image …

Integrating Extra Modality Helps Segmentor Find Camouflaged Objects Well

Alex Chengyu Fang, Chunming He, Longxiang Tang, Yuelin Zhang, Chenyang Zhu, et al. · 2025

Camouflaged Object Segmentation (COS) remains challenging because camouflaged objects exhibit only subtle visual differences from their backgrounds and single-modality RGB methods provide limited cues, leading researchers to explore multim…

VLP: Vision-Language Preference Learning for Embodied Manipulation

Runze Liu, Chenjia Bai, Jiafei Lyu, Steven Sun, Yali Du, et al. · 2025

Reward engineering is one of the key challenges in Reinforcement Learning (RL). Preference-based RL effectively addresses this issue by learning from human feedback. However, it is both time-consuming and expensive to collect human prefere…

STViT+: improving self-supervised multi-camera depth estimation with spatial-temporal context and adversarial geometry regularization

Zhuo Chen, Haimei Zhao, Xiaoshuai Hao, Bo Yuan, Xiu Li · 2025

Multi-camera depth estimation has gained significant attention in autonomous driving due to its importance in perceiving complex environments. However, extending monocular self-supervised methods to multi-camera setups introduces unique ch…

Xiu Li