Explanipedia

Structure Alignment-driven Cross-Graph Modeling for Functional RNA Design

Xiaoyong Pan, Jun Wang, Xiaojian Liu, Weimin Zhu, et al. · 2025

RNAs are critical for biological processes, with their biological functions closely tied to their three-dimensional structures. RNA inverse folding, the design of RNA sequences that fold into target 3D structures, is a complex challenge du…

Towards More Diverse and Challenging Pre-training for Point Cloud Learning: Self-Supervised Cross Reconstruction with Decoupled Views

Xiangdong Zhang, Shaofeng Zhang, Junchi Yan · 2025

Point cloud learning, especially in a self-supervised way without manual labels, has gained growing attention in both vision and learning communities due to its potential utility in a wide range of applications. Most existing generative ap…

Fast Multi-objective RNA Optimization with Autoregressive Reinforcement Learning

Jiaqi Huang, Harrison X. Bai, Yi Fang, Xiaojian Liu, Xiaoyong Pan, et al. · 2025

Computer science Mathematics

Codon optimization is essential in mRNA vaccine development, while existing tools face limitations in the computational efficiency, sequence diversity and universality. To address these challenges, we develop RNAJog (RNA Joint Optimization…

Calibrating Biased Distribution in VFM-derived Latent Space via Cross-Domain Geometric Consistency

Yanbiao Ma, Wei Dai, Bowei Liu, Jiayi Chen, Wenke Huang, et al. · 2025

Despite the fast progress of deep learning, one standing challenge is the gap between the observed training samples and the underlying true distribution. This gap has multiple causes, e.g. sampling bias and noise. In t…

When Autonomy Goes Rogue: Preparing for Risks of Multi-Agent Collusion in Social Systems

Qibing Ren, Shuguo Xie, Long Wei, Zhenfei Yin, Junchi Yan, et al. · 2025

Recent large-scale events like election fraud and financial scams have shown how harmful coordinated efforts by human groups can be. With the rise of autonomous AI systems, there is growing concern that AI-driven groups could also cause si…

TrajTok: Technical Report for 2025 Waymo Open Sim Agents Challenge

Zhiyuan Zhang, Xiaosong Jia, Guanyu Chen, Qifeng Li, Junchi Yan · 2025

In this technical report, we introduce TrajTok, a trajectory tokenizer for discrete next-token-prediction based behavior generation models, which combines data-driven and rule-based methods with better coverage, symmetry and robustness, al…

ProtoReasoning: Prototypes as the Foundation for Generalizable Reasoning in LLMs

He Feng, Zijun Chen, Xinnian Liang, Ma Tingting, Yicheng Qiu, et al. · 2025

Recent advances in Large Reasoning Models (LRMs) trained with Long Chain-of-Thought (Long CoT) reasoning have demonstrated remarkable cross-domain generalization capabilities. However, the underlying mechanisms supporting such transfer rem…

SpaCE-10: A Comprehensive Benchmark for Multimodal Large Language Models in Compositional Spatial Intelligence

Ziyang Gong, Wenhao Li, Ou Ma, Songyuan Li, Jiayi Ji, et al. · 2025

Multimodal Large Language Models (MLLMs) have achieved remarkable progress in various multimodal tasks. To pursue higher intelligence in space, MLLMs require integrating multiple spatial capabilities, even for handling simple and normal ta…

Learning Adaptive and Temporally Causal Video Tokenization in a 1D Latent Space

Mengqi Li, Changyao Tian, Renqiu Xia, Ning Liao, Weiwei Guo, et al. · 2025

We propose AdapTok, an adaptive temporal causal video tokenizer that can flexibly allocate tokens for different frames based on video content. AdapTok is equipped with a block-wise masking strategy that randomly drops tail tokens of each b…

Raw2Drive: Reinforcement Learning with Aligned World Models for End-to-End Autonomous Driving (in CARLA v2)

Zhenjie Yang, Xiaosong Jia, Qifeng Li, Xue Yang, Mike Yao, et al. · 2025

Reinforcement Learning (RL) can mitigate the causal confusion and distribution shift inherent to imitation learning (IL). However, applying RL to end-to-end autonomous driving (E2E-AD) remains an open problem for its training difficulty, a…

New Evidence of the Two-Phase Learning Dynamics of Neural Networks

Zhanpeng Zhou, Yongyi Yang, Mahito Sugiyama, Junchi Yan · 2025

Understanding how deep neural networks learn remains a fundamental challenge in modern machine learning. A growing body of evidence suggests that training dynamics undergo a distinct phase transition, yet our understanding of this transiti…

Beyond Theorem Proving: Formulation, Framework and Benchmark for Formal Problem-Solving

Qi Liu, Xinhao Zheng, Renqiu Xia, Xiangtong Qi, Qinxiang Cao, et al. · 2025

As a seemingly self-explanatory task, problem-solving has been a significant component of science and engineering. However, a general yet concrete formulation of problem-solving itself is missing. With the recent development of AI-based pr…

Interleave-VLA: Enhancing Robot Manipulation with Interleaved Image-Text Instructions

Chia Lun Fan, Xiaosong Jia, Yiwen Sun, Yixiao Wang, Jun Wei, et al. · 2025

The rise of foundation models paves the way for generalist robot policies in the physical world. Existing methods relying on text-only instructions often struggle to generalize to unseen scenarios. We argue that interleaved image-text inpu…

Int2Planner: An Intention-based Multi-modal Motion Planner for Integrated Prediction and Planning

Xiaolei Chen, Junchi Yan, Wenlong Liao, Tao He, Pai Peng · 2025

Computer science Engineering Materials science

Motion planning is a critical module in autonomous driving, with the primary challenge of uncertainty caused by interactions with other participants. As most previous methods treat prediction and planning as separate tasks, it is difficult…

On the Cone Effect in the Learning Dynamics

Zhanpeng Zhou, Yongyi Yang, Jie Ren, Mahito Sugiyama, Junchi Yan · 2025

Understanding the learning dynamics of neural networks is a central topic in the deep learning community. In this paper, we take an empirical perspective to study the learning dynamics of neural networks in real-world settings. Specificall…

DriveTransformer: Unified Transformer for Scalable End-to-End Autonomous Driving

Xiaosong Jia, Junqi You, Zhiyuan Zhang, Junchi Yan · 2025

End-to-end autonomous driving (E2E-AD) has emerged as a trend in the field of autonomous driving, promising a data-driven, scalable approach to system design. However, existing E2E-AD methods usually adopt the sequential paradigm of percep…

Rethinking Video Tokenization: A Conditioned Diffusion-based Approach

Nancy Y. C. Yang, Pandeng Li, Liming Zhao, Yang Li, Chen-Wei Xie, et al. · 2025

Existing video tokenizers typically use the traditional Variational Autoencoder (VAE) architecture for video compression and reconstruction. However, to achieve good performance, its training process often relies on complex multi-stage tra…

The Sharpness Disparity Principle in Transformers for Accelerating Language Model Pre-Training

Jinbo Wang, Mingze Wang, Zhanpeng Zhou, Junchi Yan, E Weinan, et al. · 2025

Transformers consist of diverse building blocks, such as embedding layers, normalization layers, self-attention mechanisms, and point-wise feedforward networks. Thus, understanding the differences and interactions among these blocks is imp…

Wholly-WOOD: Wholly Leveraging Diversified-quality Labels for Weakly-supervised Oriented Object Detection

Yi Yu, Xue Yang, Yansheng Li, Zhenjun Han, Feipeng Da, et al. · 2025

Computer science Business Philosophy

Accurately estimating the orientation of visual objects with compact rotated bounding boxes (RBoxes) has become a prominent demand, which challenges existing object detection paradigms that only use horizontal bounding boxes (HBoxes). To e…

Point2RBox-v2: Rethinking Point-supervised Oriented Object Detection with Spatial Layout Among Instances

Yi Yu, Botao Ren, Peiyuan Zhang, Mingxin Liu, Junwei Luo, et al. · 2025

Computer science Mathematics

With the rapidly increasing demand for oriented object detection (OOD), recent research involving weakly-supervised detectors for learning OOD from point annotations has gained great attention. In this paper, we rethink this challenging ta…

Efficient Packaging Line Object Counting by Cross-Frame Association With Wavelet Convolutions and Trajectory Compensation

Long Wei, Yutao Zhu, Yufeng Li, Ming Qian, Xiang Zuo, et al. · 2025

Computer science Mathematics Physics

Real-time object counting in the industry pipeline is critical for improving efficiency and accuracy in industries like manufacturing and logistics. This paper introduces a novel multi-object association method, namely tracking method, whi…

GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training

Renqiu Xia, Mingsheng Li, Hancheng Ye, Wenjie Wu, Hongbin Zhou, et al. · 2024

Computer science Geography

Despite their proficiency in general tasks, Multi-modal Large Language Models (MLLMs) struggle with automatic Geometry Problem Solving (GPS), which demands understanding diagrams, interpreting symbols, and performing complex reasoning. Thi…

Universal Hamming Weight Preserving Variational Quantum Ansatz

Ge Yan, Kaisen Pan, Ruiguang Wang, Moshe Ran, Hongxu Chen, et al. · 2024

Mathematics Physics

Understanding the mathematical properties of variational quantum ansätze is crucial for determining quantum advantage in Variational Quantum Eigensolvers (VQEs). A deeper understanding of ansätze not only enriches theoretical discussions b…

Exploiting Unlabeled Data with Multiple Expert Teachers for Open Vocabulary Aerial Object Detection and Its Orientation Adaptation

Yan Li, Weiwei Guo, Xue Yang, Ning Liao, Shaofeng Zhang, et al. · 2024

Computer science Psychology Mathematics

In recent years, aerial object detection has been increasingly pivotal in various earth observation applications. However, current algorithms are limited to detecting a set of pre-defined object categories, demanding sufficient annotated t…

Junchi Yan