Zhouchen Lin
PAM: a propagation-based model for segmenting any 3D objects across multi-modal medical images
Volumetric segmentation is a major challenge in medical imaging, as current methods require extensive annotations and retraining, limiting transferability across objects. We present PAM, a propagation-based framework that generates 3D segm…
Rethinking Facial Expression Recognition in the Era of Multimodal Large Language Models: Benchmark, Datasets, and Beyond
Multimodal Large Language Models (MLLMs) have revolutionized numerous research fields, including computer vision and affective computing. As a pivotal challenge in this interdisciplinary domain, facial expression recognition (FER) has evol…
Quadratic Direct Forecast for Training Multi-Step Time-Series Forecast Models
The design of training objective is central to training time-series forecasting models. Existing training objectives such as mean squared error mostly treat each future step as an independent, equally weighted task, which we found leading …
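For context, the equally weighted per-step objective this abstract critiques can be written as a single mean squared error over the forecast horizon. A minimal PyTorch sketch follows; tensor shapes and the function name are illustrative assumptions, and the paper's proposed quadratic objective is not reproduced here.

```python
# Standard direct multi-step forecasting loss: each of the H future steps is
# treated as an independent, equally weighted squared-error term.
# Shapes are assumed to be (batch, horizon); this is generic background,
# not the paper's proposed objective.
import torch

def per_step_mse(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # Mean over batch and horizon, i.e. uniform weight 1/H on every step.
    return ((pred - target) ** 2).mean()
```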
Generalist++: A Meta-learning Framework for Mitigating Trade-off in Adversarial Training
Despite the rapid progress of neural networks, they remain highly vulnerable to adversarial examples, for which adversarial training (AT) is currently the most effective defense. While AT has been extensively studied, its practical applica…
On the Limitations and Capabilities of Position Embeddings for Length Generalization
In Transformers, Position Embeddings (PEs) significantly influence Length Generalization (LG) performance, yet their fundamental role remains unclear. In this work, we investigate the limitations and capabilities of PEs in achieving LG. We…
Explicit Discovery of Nonlinear Symmetries from Dynamic Data
Symmetry is widely applied in problems such as the design of equivariant networks and the discovery of governing equations, but in complex scenarios, it is not known in advance. Most previous symmetry discovery methods are limited to linea…
AI Pangaea: Unifying Intelligence Islands for Adapting Myriad Tasks
The pursuit of artificial general intelligence continuously demands generalization in one model across myriad tasks, even those not seen before. However, current AI models are isolated from one another, as each is limited to specific tasks, n…
A Self-Ensemble Inspired Approach for Effective Training of Binary-Weight Spiking Neural Networks
Spiking Neural Networks (SNNs) are a promising approach to low-power applications on neuromorphic hardware due to their energy efficiency. However, training SNNs is challenging because of the non-differentiable spike generation function. T…
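As background on the non-differentiable spike generation mentioned above, the common workaround is a surrogate gradient: a hard threshold in the forward pass and a smooth stand-in for its derivative in the backward pass. The PyTorch sketch below uses a rectangular surrogate; it is generic background, not the paper's self-ensemble method.

```python
import torch

class SpikeFn(torch.autograd.Function):
    """Heaviside spike in the forward pass, rectangular surrogate gradient in the backward pass."""

    @staticmethod
    def forward(ctx, membrane_potential, threshold=1.0):
        ctx.save_for_backward(membrane_potential)
        ctx.threshold = threshold
        # Non-differentiable step: fire a spike wherever the potential crosses the threshold.
        return (membrane_potential >= threshold).float()

    @staticmethod
    def backward(ctx, grad_output):
        (u,) = ctx.saved_tensors
        # Pass gradients only in a window around the threshold (the window width is an arbitrary choice).
        surrogate = (torch.abs(u - ctx.threshold) < 0.5).float()
        return grad_output * surrogate, None  # no gradient w.r.t. the threshold
```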
Proximity Matters: Local Proximity Enhanced Balancing for Treatment Effect Estimation
DNT: a Deeply Normalized Transformer that can be trained by Momentum SGD
Transformers have become the de facto backbone of modern deep learning, yet their training typically demands an advanced optimizer with an adaptive learning rate, such as AdamW, rather than momentum SGD with decoupled weight decay (mSGDW). Previous works show that it is …
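For reference, one standard way to write a momentum SGDW step (the mSGDW optimizer referred to above) is the following; the symbols $\mu$, $\eta$, and $\lambda$ are generic momentum, learning-rate, and decoupled weight-decay coefficients, not notation taken from the paper:

$$ v_t = \mu\, v_{t-1} + g_t, \qquad \theta_t = \theta_{t-1} - \eta\, v_t - \eta\, \lambda\, \theta_{t-1}, $$

where $g_t$ is the stochastic gradient at step $t$ and the weight decay acts directly on the parameters rather than being folded into the gradient.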
AV-NAS: Audio-Visual Multi-Level Semantic Neural Architecture Search for Video Hashing
Simple Convergence Proof of Adam From a Sign-like Descent Perspective
Adam is widely recognized as one of the most effective optimizers for training deep neural networks (DNNs). Despite its remarkable empirical success, its theoretical convergence analysis remains unsatisfactory. Existing works predominantly…
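To make the sign-like perspective concrete: with first- and second-moment estimates $m_k$ and $v_k$, the Adam step is

$$ x_{k+1} = x_k - \eta\, \frac{m_k}{\sqrt{v_k} + \epsilon}, $$

and since $\sqrt{v_k}$ is positive and of roughly the same order as $|m_k|$ when gradients are stable, each coordinate of the update carries the sign of $m_k$ with bounded magnitude, i.e. it behaves like a soft sign-descent step $x_{k+1} \approx x_k - \eta\,\operatorname{sign}(m_k)$. This rewriting only illustrates the perspective in the title; it is not the paper's exact analysis.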
Machine Learning Models to Predict Individual Cognitive Load in Collaborative Learning: Combining fNIRS and Eye-Tracking Data
Effectively leveraging cognitive load predictions helps optimize collaborative learning design and implementation. This study explored the feasibility of predicting individual learners’ cognitive load during collaborative learning using a …
MME-VideoOCR: Evaluating OCR-Based Capabilities of Multimodal LLMs in Video Scenarios
Multimodal Large Language Models (MLLMs) have achieved considerable accuracy in Optical Character Recognition (OCR) from static images. However, their efficacy in video OCR is significantly diminished due to factors such as motion blur, te…
Time-o1: Time-Series Forecasting Needs Transformed Label Alignment
Training time-series forecast models presents unique challenges in designing effective learning objectives. Existing methods predominantly utilize the temporal mean squared error, which faces two critical challenges: (1) label autocorrelat…
On the $O(\frac{\sqrt{d}}{K^{1/4}})$ Convergence Rate of AdamW Measured by $\ell_1$ Norm
As the default optimizer for training large language models, AdamW has achieved remarkable success in deep learning. However, its convergence behavior is not theoretically well-understood. This paper establishes the convergence rate $\frac…
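The abstract is cut off mid-formula; one natural reading of the rate stated in the title, with $d$ the parameter dimension and $K$ the number of iterations, is a bound of the shape

$$ \frac{1}{K} \sum_{k=1}^{K} \mathbb{E}\, \bigl\| \nabla f(x_k) \bigr\|_1 \;=\; O\!\left( \frac{\sqrt{d}}{K^{1/4}} \right), $$

where the precise constants and assumptions are those of the paper and are not reproduced here.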
A Novel SHAP-GAN Network for Interpretable Ovarian Cancer Diagnosis
Ovarian cancer stands out as one of the most formidable adversaries in women’s health, largely due to its typically subtle and nonspecific early symptoms, which pose significant challenges to early detection and diagnosis. Although existin…
Empowering LLMs with Logical Reasoning: A Comprehensive Survey
Large language models (LLMs) have achieved remarkable successes on various tasks. However, recent studies have found that there are still significant challenges to the logical reasoning abilities of LLMs, which can be categorized into the …
Optimization design of cross border intelligent marketing management model based on multi layer perceptron-grey wolf optimization convolutional neural network
Cross-border intelligent marketing algorithms based on traditional linear models extract only limited information features, making it difficult to effectively handle complex scenarios containing a large amount of implicit i…
High-Rank Irreducible Cartesian Tensor Decomposition and Bases of Equivariant Spaces
Irreducible Cartesian tensors (ICTs) play a crucial role in the design of equivariant graph neural networks, as well as in theoretical chemistry and chemical physics. Meanwhile, the design space of available linear operations on tensors th…
An Integrated Algorithm with Feature Selection, Data Augmentation, and XGBoost for Ovarian Cancer
Ovarian cancer is one of the most aggressive gynecological cancers due to its high invasiveness and chemoresistance. It not only has a high incidence rate but also one of the highest mortality rates. Its subtle early symptoms make subsequent dia…
GL-Fusion: Rethinking the Combination of Graph Neural Network and Large Language model
Recent research on integrating Large Language Models (LLMs) with Graph Neural Networks (GNNs) typically follows two approaches: LLM-centered models, which convert graph data into tokens for LLM processing, and GNN-centered models, which us…
Convergence Rate Analysis of LION
The LION (evoLved sIgn mOmeNtum) optimizer for deep neural network training was discovered by Google via program search; despite its simple sign update, it shows impressive performance in training large-scale networks. Although previous studies …
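For reference, the sign update referred to above is LION's published rule, reproduced here from memory (so treat the exact form as an assumption), with interpolation coefficients $\beta_1, \beta_2$, learning rate $\eta$, and decoupled weight decay $\lambda$:

$$ c_t = \beta_1 m_{t-1} + (1-\beta_1)\, g_t, \qquad \theta_t = \theta_{t-1} - \eta\,\bigl(\operatorname{sign}(c_t) + \lambda\, \theta_{t-1}\bigr), \qquad m_t = \beta_2 m_{t-1} + (1-\beta_2)\, g_t. $$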
Number Cookbook: Number Understanding of Language Models and How to Improve It
Large language models (LLMs) can solve an increasing number of complex reasoning tasks while making surprising mistakes in basic numerical understanding and processing (such as 9.11 > 9.9). The latter ability is essential for tackling comp…
MixCon: A Hybrid Architecture for Efficient and Adaptive Sequence Modeling
Sequence modeling is a critical task in various domains such as natural language processing, speech recognition, and time series analysis. The existing models still face challenges in capturing long-range dependencies and efficiently model…
Symmetry Discovery for Different Data Types
Equivariant neural networks incorporate symmetries into their architecture, achieving higher generalization performance. However, constructing equivariant neural networks typically requires prior knowledge of data types and symmetries, whi…
On the Adversarial Transferability of Generalized "Skip Connections"
Skip connections are an essential ingredient that allows modern deep models to be deeper and more powerful. Despite their huge success in normal scenarios (state-of-the-art classification performance on natural examples), we investigate and identify…
Low-Dimension-to-High-Dimension Generalization And Its Implications for Length Generalization
Low-Dimension-to-High-Dimension (LDHD) generalization is a special case of Out-of-Distribution (OOD) generalization, where the training data are restricted to a low-dimensional subspace of the high-dimensional testing space. Assuming that …
Pyramidal Flow Matching for Efficient Video Generative Modeling
Video generation requires modeling a vast spatiotemporal space, which demands significant computational resources and data usage. To reduce the complexity, the prevailing approaches employ a cascaded architecture to avoid direct training w…
Incorporating Arbitrary Matrix Group Equivariance into KANs
Kolmogorov-Arnold Networks (KANs) have seen great success in scientific domains thanks to spline activation functions, becoming an alternative to Multi-Layer Perceptrons (MLPs). However, spline functions may not respect symmetry in tasks, …