Longxu Dou
Training Optimal Large Diffusion Language Models
We introduce Quokka, the first systematic scaling law for diffusion language models (DLMs), encompassing both compute-constrained and data-constrained regimes, and studying the key modeling and optimization designs. Quokka is a good friend…
Format-Adapter: Improving Reasoning Capability of LLMs by Adapting Suitable Format
Generating and voting on multiple answers is an effective method for mitigating reasoning inconsistencies in large language models (LLMs). Prior works have shown that multiple reasoning formats outperform a single format when generating multiple…
NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation
Recent advances in reinforcement learning (RL) have strengthened the reasoning capabilities of vision-language models (VLMs). However, enhancing policy exploration to better scale test-time compute remains largely underexplored. In additio…
Efficient Process Reward Model Training via Active Learning
Process Reward Models (PRMs) provide step-level supervision to large language models (LLMs), but scaling up training data annotation remains challenging for both humans and LLMs. To address this limitation, we propose an active learning ap…
Can Large Language Models Understand You Better? An MBTI Personality Detection Dataset Aligned with Population Traits
The Myers-Briggs Type Indicator (MBTI) is one of the most influential personality theories reflecting individual differences in thinking, feeling, and behaving. MBTI personality detection has garnered considerable research interest and has…
SCITAT: A Question Answering Benchmark for Scientific Tables and Text Covering Diverse Reasoning Types
Scientific question answering (SQA) is an important task aimed at answering questions based on papers. However, current SQA datasets have limited reasoning types and neglect the relevance between tables and text, creating a significant gap…
A Survey on Large Language Model-Based Social Agents in Game-Theoretic Scenarios
Game-theoretic scenarios have become pivotal in evaluating the social intelligence of Large Language Model (LLM)-based social agents. While numerous studies have explored these agents in such settings, there is a lack of a comprehensive su…
In-Context Transfer Learning: Demonstration Synthesis by Transferring Similar Tasks
In-context learning (ICL) is an effective approach to help large language models (LLMs) adapt to various tasks by providing demonstrations of the target task. Considering the high cost of labeling demonstrations, many methods propose synth…
FLEXTAF: Enhancing Table Reasoning with Flexible Tabular Formats
The table reasoning task aims to answer the question according to the given table. Currently, using Large Language Models (LLMs) is the predominant method for table reasoning. Most existing methods employ a fixed tabular format to represen…
DAC: Decomposed Automation Correction for Text-to-SQL
Text-to-SQL is an important task that helps people obtain information from databases by automatically generating SQL queries. Considering the brilliant performance, approaches based on Large Language Models (LLMs) become the mainstream for…
Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies
Research on scaling large language models (LLMs) has primarily focused on model parameters and training data size, overlooking the role of vocabulary size. We investigate how vocabulary size impacts LLM scaling laws by training models rang…
RegMix: Data Mixture as Regression for Language Model Pre-training
The data mixture for large language model pre-training significantly impacts performance, yet how to determine an effective mixture remains unclear. We propose RegMix to automatically identify a high-performing data mixture by formulating …
Exploring Equation as a Better Intermediate Meaning Representation for Numerical Reasoning of Large Language Models
Numerical reasoning is a vital capability for natural language processing models to understand and process numerical information in real-world scenarios. Most current methods first generate the Intermediate Meaning Representations (IMRs) o…
Enhancing Numerical Reasoning with the Guidance of Reliable Reasoning Processes
Numerical reasoning is an essential ability for NLP systems to handle numeric information. Recent research indicates that fine-tuning a small-scale model to learn generating reasoning processes alongside answers can significantly enhance p…
MURRE: Multi-Hop Table Retrieval with Removal for Open-Domain Text-to-SQL
The open-domain text-to-SQL task aims to retrieve question-relevant tables from massive databases and generate SQL. However, the performance of current methods is constrained by single-hop retrieval, and existing multi-hop retrieval of ope…
Improving Demonstration Diversity by Human-Free Fusing for Text-to-SQL
Currently, the in-context learning method based on large language models (LLMs) has become the mainstream of text-to-SQL research. Previous works have discussed how to select demonstrations related to the user question from a human-labeled…
A Survey of Table Reasoning with Large Language Models
Table reasoning aims to generate the answer to a question, following the user requirement, from a provided table and, optionally, a text description of the table, effectively improving the efficiency of obtain…
Exploring Equation as a Better Intermediate Meaning Representation for Numerical Reasoning
Numerical reasoning is vital for natural language processing models to understand and process numerical information in real-world scenarios. Most current methods first generate the Intermediate Meaning Representations (IMRs) of questions a…
MultiSpider: Towards Benchmarking Multilingual Text-to-SQL Semantic Parsing
Text-to-SQL semantic parsing is an important NLP task, which facilitates the interaction between users and the database. Much recent progress in text-to-SQL has been driven by large-scale datasets, but most of them are centered on English.…
Controllable Data Augmentation for Context-Dependent Text-to-SQL
The limited scale of annotated data constrains existing context-dependent text-to-SQL models because of the complexity of labeling. Data augmentation is a commonly used method to solve this problem. However, the data generated …
MixPro: Simple yet Effective Data Augmentation for Prompt-based Learning
Prompt-based learning has shown considerable promise in reformulating various downstream tasks as cloze problems by combining original input with a predetermined template. This approach demonstrates its effectiveness, especially in few-sho…
From Zero to Hero: Examining the Power of Symbolic Tasks in Instruction Tuning
Fine-tuning language models on tasks with instructions has demonstrated potential in facilitating zero-shot generalization to unseen tasks. In this paper, we introduce a straightforward yet effective method for enhancing instruction tuning…
Towards Knowledge-Intensive Text-to-SQL Semantic Parsing with Formulaic Knowledge
In this paper, we study the problem of knowledge-intensive text-to-SQL, in which domain knowledge is necessary to parse expert questions into SQL queries over domain-specific tables. We formalize this scenario by building a new Chinese ben…
A Survey on Table-and-Text HybridQA: Concepts, Methods, Challenges and Future Directions
Table-and-text hybrid question answering (HybridQA) is a widely used and challenging NLP task commonly applied in the financial and scientific domains. Early research focused on migrating methods from other QA tasks to HybridQA, while with fu…
MultiSpider: Towards Benchmarking Multilingual Text-to-SQL Semantic Parsing
Text-to-SQL semantic parsing is an important NLP task, which greatly facilitates the interaction between users and the database and becomes the key component in many human-computer interaction systems. Much recent progress in text-to-SQL h…
UniSAr: A Unified Structure-Aware Autoregressive Language Model for Text-to-SQL
Existing text-to-SQL semantic parsers are typically designed for particular settings such as handling queries that span multiple tables, domains or turns which makes them ineffective when applied to different settings. We present UniSAr (U…
Towards Knowledge-Intensive Text-to-SQL Semantic Parsing with Formulaic Knowledge
Longxu Dou, Yan Gao, Xuqi Liu, Mingyang Pan, Dingzirui Wang, Wanxiang Che, Dechen Zhan, Min-Yen Kan, Jian-Guang Lou. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. 2022.