Explanipedia

A Multimodal Deep Learning Approach for White Matter Shape Prediction in Diffusion MRI Tractography Open

Yui Lo, Yuqian Chen, Dongnan Liu, Leo Zekelman, Jarrett Rushmore, et al. · 2025

Recently, shape measures have emerged as promising descriptors of white matter tractography, offering complementary insights into anatomical variability and associations with cognitive and clinical phenotypes. However, conventional methods…

RealSyn: An Effective and Scalable Multimodal Interleaved Document Transformation Paradigm Open

Tiancheng Gu, Kaicheng Yang, Chaoyi Zhang, Yin Xie, Xiang An, et al. · 2025

ChoreoMuse: Robust Music-to-Dance Video Generation with Style Transfer and Beat-Adherent Motion Open

Xuanchen Wang, Heng Wang, Weidong Cai · 2025

ScSAM: Debiasing Morphology and Distributional Variability in Subcellular Semantic Segmentation Open

B. Fang, Jianan Fan, Dongnan Liu, Hang Chang, Gerald J. Shami, et al. · 2025

The significant morphological and distributional variability among subcellular components poses a long-standing challenge for learning-based organelle segmentation models, significantly increasing the risk of biased feature learning. Exist…

MotionBeat: Motion-Aligned Music Representation via Embodied Contrastive Learning and Bar-Equivariant Contact-Aware Encoding Open

X. Wang, Heng Wang, Weidong Cai · 2025

Music is both an auditory and an embodied phenomenon, closely linked to human motion and naturally expressed through dance. However, most existing audio representations neglect this embodied dimension, limiting their ability to capture rhy…

UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning Open

Tiancheng Gu, Kai-Cheng Yang, Kaichen Zhang, Xiang An, Ziyong Feng, et al. · 2025

Universal multimodal embedding models are foundational to various tasks. Existing approaches typically employ in-batch negative mining by measuring the similarity of query-candidate pairs. However, these methods often struggle to capture s…

Improving Multimodal Brain Encoding Model with Dynamic Subject-awareness Routing Open

Xuanhua Yin, Runkai Zhao, Weidong Cai · 2025

Naturalistic fMRI encoding must handle multimodal inputs, shifting fusion styles, and pronounced inter-subject variability. We introduce AFIRE (Agnostic Framework for Multimodal fMRI Response Encoding), an agnostic interface that standardi…

Study of Sex Differences in the Whole Brain White Matter Using Diffusion MRI Tractography and Suprathreshold Fiber Cluster Statistics Open

Fan Zhang, R. Jarrett Rushmore, Yijie Li, Suheyla Cetin‐Karayumak, Yang Song, et al. · 2025

Sex-specific characteristics demonstrate a substantial influence on the human brain white matter, suggesting distinct brain structural connectivity patterns between females and males. Diffusion MRI (dMRI) tractography is an important tool …

Not All Tokens are Guided Equal: Improving Guidance in Visual Autoregressive Models Open

Khoa D. Nguyen, Hoang Tran, Anh-Dung Dinh, Daochang Liu, Weidong Cai, et al. · 2025

Autoregressive (AR) models based on next-scale prediction are rapidly emerging as a powerful tool for image generation, but they face a critical weakness: information inconsistencies between patches across timesteps introduced by progressi…

A Survey of 3D Reconstruction with Event Cameras Open

Chuanzhi Xu, Hong Zhou, L. Chen, Haodong Chen, Ying Zhou, et al. · 2025

Event cameras are rapidly emerging as powerful vision sensors for 3D reconstruction, uniquely capable of asynchronously capturing per-pixel brightness changes. Compared to traditional frame-based cameras, event cameras produce sparse yet t…

Diversity-Augmented Diffusion Network With LLM Assistance For Radiology Report Generation Open

Jieting Long, Zhiyuan Li, Jianan Fan, Zhuonan Liang, Ao Ma, et al. · 2025

Seek Inner: LLM-Enhanced Information Mining for Medical Visual Question Answering Open

Ao Ma, Zhiyuan Li, Zhuonan Liang, Tiancheng Gu, Jianan Fan, et al. · 2025

LLM-UM: The 1st Workshop on Large Language Model Using Multi-modal Data for User Modeling Open

Zhicheng Lu, Mohammad Ali Moni, Yuk Ying Chung, Weidong Cai, Xiaoming Chen · 2025

Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs Open

Tiancheng Gu, Kai-Cheng Yang, Ziyong Feng, Xingjun Wang, Yanzhao Zhang, et al. · 2025

The Contrastive Language-Image Pre-training (CLIP) framework has become a widely used approach for multimodal representation learning, particularly in image-text retrieval and clustering. However, its efficacy is constrained by three key l…

CLIP-CID: Efficient CLIP Distillation via Cluster-Instance Discrimination Open

Kaicheng Yang, T. Gu, Xiang An, Haiqiang Jiang, Xiangzi Dai, et al. · 2025

Contrastive Language-Image Pre-training (CLIP) has achieved excellent performance over a wide range of tasks. However, the effectiveness of CLIP heavily relies on a substantial corpus of pre-training data, resulting in notable consumption …

TractCloud‐FOV: Deep Learning‐Based Robust Tractography Parcellation in Diffusion MRI With Incomplete Field of View Open

Yuqian Chen, Leo Zekelman, Yui Lo, Suheyla Cetin‐Karayumak, Tengfei Xue, et al. · 2025

Tractography parcellation classifies streamlines reconstructed from diffusion MRI into anatomically defined fiber tracts for clinical and research applications. However, clinical scans often have incomplete fields of view (FOV) where brain…

The Shape of the Brain's Connections Is Predictive of Cognitive Performance: An Explainable Machine Learning Study Open

Yui Lo, Yuqian Chen, Dongnan Liu, Wan Liu, Leo Zekelman, et al. · 2025

The shape of the brain's white matter connections is relatively unexplored in diffusion magnetic resonance imaging (dMRI) tractography analysis. While it is known that tract shape varies in populations and across the human lifespan, it is …

Through the Magnifying Glass: Adaptive Perception Magnification for Hallucination-Free VLM Decoding Open

Shunqi Mao, Chaoyi Zhang, Weidong Cai · 2025

Existing vision-language models (VLMs) often suffer from visual hallucination, where the generated responses contain inaccuracies that are not grounded in the visual input. Efforts to address this issue without model finetuning primarily m…

Efficient 4D fMRI ASD Classification using Spatial-Temporal-Omics-based Learning Framework Open

Ziqiao Weng, Weidong Cai, Bo Zhou · 2025

Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder impacting social and behavioral development. Resting-state fMRI, a non-invasive tool for capturing brain connectivity patterns, aids in early ASD diagnosis and differentiation…

RealSyn: An Effective and Scalable Multimodal Interleaved Document Transformation Paradigm Open

Tiancheng Gu, Kai-Cheng Yang, C. Zhang, Yin Xie, Xiang An, et al. · 2025

After pre-training on extensive image-text pairs, Contrastive Language-Image Pre-training (CLIP) demonstrates promising performance on a wide variety of benchmarks. However, a substantial volume of multimodal interleaved documents remains …

Cross-Domain Fiber Cluster Shape Analysis for Language Performance Cognitive Score Prediction Open

Yui Lo, Yuqian Chen, Dongnan Liu, Wan Liu, Leo Zekelman, et al. · 2025

A Modified QuEChERS Method Using Single-Sorbent Combined with LC-MS/MS for Simultaneous Determination of Four Phenolic Pesticide Residues in Fruits and Vegetables Open

Yan Chen, Weidong Cai, Zhengfeng Lin, Yuan Ma, Yuexian Wu, et al. · 2025

Multimodal Causal Reasoning Benchmark: Challenging Multimodal Large Language Models to Discern Causal Links Across Modalities Open

Zhiyuan Li, Heng Wang, Dongnan Liu, Chaoyi Zhang, Ao Ma, et al. · 2025

MIRROR: Multi-Modal Pathological Self-Supervised Representation Learning via Modality Alignment and Retention Open

Tianyi Wang, Jianan Fan, Dingxin Zhang, Dongnan Liu, Yong Xia, et al. · 2025

Histopathology and transcriptomics are fundamental modalities in oncology, encapsulating the morphological and molecular aspects of the disease. Multi-modal self-supervised learning has demonstrated remarkable potential in learning patholo…

Cross-View Consistency Regularisation for Knowledge Distillation Open

Weijia Zhang, Dongnan Liu, Weidong Cai, Chao Ma · 2024

Knowledge distillation (KD) is an established paradigm for transferring privileged knowledge from a cumbersome model to a lightweight and efficient one. In recent years, logit-based KD methods are quickly catching up in performance with th…

Gotta Hear Them All: Towards Sound Source Aware Audio Generation Open

Wei Guo, Heng Wang, Jianbo Ma, Weidong Cai · 2024

Audio synthesis has broad applications in multimedia. Recent advancements have made it possible to generate relevant audios from inputs describing an audio scene, such as images or texts. However, the immersiveness and expressiveness of th…

Cell as Point: One-Stage Framework for Efficient Cell Tracking Open

Y. Song, Jianan Fan, Heng Huang, Mei Chen, Weidong Cai · 2024

Conventional multi-stage cell tracking approaches rely heavily on detection or segmentation in each frame as a prerequisite, requiring substantial resources for high-quality segmentation masks and increasing the overall prediction time. To…

ORID: Organ-Regional Information Driven Framework for Radiology Report Generation Open

Tiancheng Gu, Kai‐Cheng Yang, Xiang An, Zhenzhen Feng, Dongnan Liu, et al. · 2024

The objective of Radiology Report Generation (RRG) is to automatically generate coherent textual analyses of diseases based on radiological images, thereby alleviating the workload of radiologists. Current AI-based methods for RRG primaril…

AI-Powered cellular morphometric biomarkers discovered in needle biopsy of prostatic cancer predict neoadjuvant androgen deprivation therapy response and prognosis: an international multicenter retrospective study Open

Hong Yan, Aiqin Mao, Dan Li, Manuel Jesús Pérez‐Baena, Alejandro Jiménez‐Navas, et al. · 2024

It is imperative to identify patients with prostate cancer (PCa) who will benefit from androgen receptor signaling inhibitors that can impact quality of life upon prolonged use. Using our extensively-validated artificial-intelligence techn…

Weidong Cai