Explanipedia

Evaluation of PID Performance at CEPC and Optimization with Combined dN/dx and Time-of-Flight Data

Dingli Yu, Houqian Ding, Yunyun Fan, Yongfeng Zhu, Ming Qi · 2025

This work presents a comprehensive study of charged-hadron particle identification (PID) at the Circular Electron-Positron Collider (CEPC), based on full simulation of hadronic $Z$-pole events. A unified PID strategy is developed by combin…

PaddleOCR 3.0 Technical Report

Cheng Cui, Ting Sun, Manhui Lin, Tingquan Gao, Yubo Zhang, et al. · 2025

This technical report introduces PaddleOCR 3.0, an Apache-licensed open-source toolkit for OCR and document parsing. To address the growing demand for document understanding in the era of large language models, PaddleOCR 3.0 presents three…

Adaptive Token Boundaries: Integrating Human Chunking Mechanisms into Multimodal LLMs

Dingli Yu · 2025

Recent advancements in multimodal large language models (MLLMs) have demonstrated remarkable capabilities in processing diverse data types, yet significant disparities persist between human cognitive processes and computational approaches …

DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning

Zhiwei He, Tian Liang, Jiahao Xu, Qiuzhi Liu, Xingyu Chen, et al. · 2025

Reinforcement learning (RL) with large language models shows promise in complex reasoning. However, its progress is hindered by the lack of large-scale training data that is sufficiently challenging, contamination-free and verifiable. To t…

Weak-to-Strong Generalization Even in Random Feature Networks, Provably

Mikhail V. Medvedev, Kaifeng Lyu, Dingli Yu, Sanjeev Arora, Zhiyuan Li, et al. · 2025

Weak-to-Strong Generalization (Burns et al., 2024) is the phenomenon whereby a strong student, say GPT-4, learns a task from a weak teacher, say GPT-2, and ends up significantly outperforming the teacher. We show that this phenomenon does …

Optimizing wind turbine blade pitch control via input output differential model free adaptive control

Ziang Zhou, Shuangxin Wang, Jiading Jiang, Hongrui Li, Juchao Zhao, et al. · 2025

In the context of wind energy systems, maintaining optimal power output in wind turbines when wind speeds exceed rated values necessitates precise regulation of blade pitch through the pitch control system. However, challenges in accuratel…

AdaGC: Improving Training Stability for Large Language Model Pretraining

Guoxia Wang, Shuai Li, Congliang Chen, Jinle Zeng, Jiabin Yang, et al. · 2025

Large Language Models (LLMs) face increasing loss spikes during scaling, undermining training stability and final performance. While gradient clipping mitigates this issue, traditional global approaches poorly handle parameter-specific gra…

Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs?

Simon Park, Abhishek Panigrahi, Yun Chen, Dingli Yu, Anirudh Goyal, et al. · 2025

Vision Language Models (VLMs) are impressive at visual question answering and image captioning. But they underperform on multi-step visual reasoning -- even compared to LLMs on the same tasks presented in text form -- giving rise to percep…

ADAPTIVE TOKEN BOUNDARIES: INTEGRATING HUMAN CHUNKING MECHANISMS INTO MULTIMODAL LLMS

Dingli Yu · 2025

A Geometric Analysis-Based Safety Assessment Framework for Mass Route Decision-Making in Restricted Waters

Zhongli Xu, Zihao Wang, He‐Ping Li, Dingli Yu, Zaili Yang, et al. · 2025

Deterministic Convergence Analysis for GRU Networks via Smoothing Regularization

Qian Zhu, Qian Kang, Tao Xu, Dingli Yu, Zhen Wang · 2025

Phi-4 Technical Report

Marah Abdin, Jyoti Aneja, Harkirat Singh Behl, Sébastien Bubeck, Ronen Eldan, et al. · 2024

We present phi-4, a 14-billion parameter language model developed with a training recipe that is centrally focused on data quality. Unlike most language models, where pre-training is based primarily on organic data sources such as web cont…

FlashMask: Efficient and Rich Mask Extension of FlashAttention

G Wang, Jie Zeng, Xudong Xiao, Sophie Wu, J.C-H. Yang, et al. · 2024

The computational and memory demands of vanilla attention scale quadratically with the sequence length $N$, posing significant challenges for processing long sequences in Transformer models. FlashAttention alleviates these challenges by el…

Can Models Learn Skill Composition from Examples?

Haoyu Zhao, Simran Kaur, Dingli Yu, Anirudh Goyal, Sanjeev Arora · 2024

As large language models (LLMs) become increasingly advanced, their ability to exhibit compositional generalization -- the capacity to combine learned skills in novel ways not encountered during training -- has garnered significant attenti…

ConceptMix: A Compositional Image Generation Benchmark with Controllable Difficulty

Xindi Wu, Dingli Yu, Yangsibo Huang, Olga Russakovsky, Sanjeev Arora · 2024

Compositionality is a critical capability in Text-to-Image (T2I) models, as it reflects their ability to understand and combine multiple concepts from text descriptions. Existing evaluations of compositional capability rely heavily on huma…

AI-Assisted Generation of Difficult Math Questions

Vedant Shah, Dingli Yu, Kaifeng Lyu, Simon Park, Nan Rosemary Ke, et al. · 2024

Current LLM training positions mathematical reasoning as a core capability. With publicly available sources fully tapped, there is unmet demand for diverse and challenging math questions. Relying solely on human experts is both time-consum…

Enhancing the Tracking Performance of Wind Turbine Blade Pitch Angle Control via Model-Free Adaptive Control Algorithm Utilizing Input-Output Differential

Ziang Zhou, Shuangxin Wang, Jiading Jiang, Hongrui Li, Juchao Zhao, et al. · 2024

In the context of wind energy systems, maintaining optimal power output in wind turbines when wind speeds exceed rated values necessitates precise regulation of blade pitch through the pitch control system. However, challenges in accuratel…

Keeping LLMs Aligned After Fine-tuning: The Crucial Role of Prompt Templates

Kaifeng Lyu, Haoyu Zhao, Xinran Gu, Dingli Yu, Anirudh Goyal, et al. · 2024

Public LLMs such as the Llama 2-Chat underwent alignment training and were considered safe. Recently Qi et al. [2024] reported that even benign fine-tuning on seemingly safe datasets can give rise to unsafe behaviors in the models. The cur…

Mathematical Modeling of Operation Loop Ratio and its Effect in Combat Networks

Z. Song, Zhichao Cao, Chengli Fan, Shiyu Xu, Dingli Yu · 2024

Skill-Mix: a Flexible and Expandable Family of Evaluations for AI models

Dingli Yu, Simran Kaur, Arushi Gupta, Jonah Brown-Cohen, Anirudh Goyal, et al. · 2023

With LLMs shifting their role from statistical modeling of language to serving as general-purpose AI agents, how should LLM evaluations change? Arguably, a key ability of an AI agent is to flexibly combine, as needed, the basic skills it h…

Tensor Programs VI: Feature Learning in Infinite-Depth Neural Networks

Greg Yang, Dingli Yu, Zhu Chen, Soufiane Hayou · 2023

By classifying infinite-width neural networks and identifying the *optimal* limit, Tensor Programs IV and V demonstrated a universal way, called $μ$P, for *widthwise hyperparameter transfer*, i.e., predicting optimal hyperparameters of wid…

Research on Freezing of Gait Recognition Method Based on Variational Mode Decomposition

Shoutao Li, Ruyi Qu, Yu Zhang, Dingli Yu · 2023

Freezing of Gait (FOG) is the most common and disabling gait disorder in patients with Parkinson’s Disease (PD), which seriously affects the life quality and social function of patients. This paper proposes a FOG recognition method based o…

New Definitions and Evaluations for Saliency Methods: Staying Intrinsic, Complete and Sound

Arushi Gupta, Nikunj Saunshi, Dingli Yu, Kaifeng Lyu, Sanjeev Arora · 2022

Saliency methods compute heat maps that highlight portions of an input that were most *important* for the label assigned to it by a deep net. Evaluations of saliency methods convert this heat map into a new *masked input* by retain…

A Kernel-Based View of Language Model Fine-Tuning

Sadhika Malladi, Alexander Wettig, Dingli Yu, Danqi Chen, Sanjeev Arora · 2022

It has become standard to solve NLP tasks by fine-tuning pre-trained language models (LMs), especially in low-data settings. There is minimal theoretical understanding of empirical success, e.g., why fine-tuning a model with $10^8$ or more…

Program and Organizing Committees

Baibing Li, Bing Hull, Da‐Wei Gu, Dayou Li, Dingli Yu · 2022

Table of Contents

Eduards Krupenkins, Qichun Zhang, Gao Cong, Sangeet Saha, Yufan Lu, et al. · 2022

Pitch angle control with fault diagnosis and tolerance for wind turbine generation systems

Yiran Shi, Shoutao Li, Shuangxin Wang, Yujia Zhai, Yantao Tian, et al. · 2021

To enhance the reliability of wind turbine generation systems that are generally located in the remote area and subjected to harsh environment, we design the pitch angle control for variable speed wind turbines with the function of fault d…

A New Concept of Fractional Order Cumulant and It-Based Signal Processing in α and/or Gaussian Noise

Yiran Shi, Dingli Yu, Hongyan Shi, Yao Shi · 2020

In this article, the concept and definitions of the Fractional Order Moment (FOM) and Fractional Order Cumulant (FOC) are proposed, which is based on the fractional derivative of the fractional order Moment-generating function and the frac…

Adaptive Sliding Mode Control of Lateral Stability of Four Wheel Hub Electric Vehicles

Shoutao Li, Hui Liu, Di Zhao, Qiu-Yuan Li, Yantao Tian, et al. · 2020

Distributed Formation Control for Multi-Vehicle Systems with Splitting and Merging Capability

Szilard Novoth, Qian Zhang, Kang Ji, Dingli Yu · 2020

This letter develops a novel strategy for splitting and merging of agents travelling in formation. The method converts the formation control problem into an optimization problem, which is solved among the agents in a distributed fashion. T…
