Longxu Dou
Training Optimal Large Diffusion Language Models
We introduce Quokka, the first systematic scaling law for diffusion language models (DLMs), encompassing both compute-constrained and data-constrained regimes, and studying the key modeling and optimization designs. Quokka is a good friend…
Format-Adapter: Improving Reasoning Capability of LLMs by Adapting Suitable Format
Generating and voting on multiple answers is an effective method for mitigating reasoning inconsistencies in large language models (LLMs). Prior works have shown that multiple reasoning formats outperform a single format when generating multiple…
NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation
Recent advances in reinforcement learning (RL) have strengthened the reasoning capabilities of vision-language models (VLMs). However, enhancing policy exploration to better scale test-time compute remains largely underexplored. In additio…
Efficient Process Reward Model Training via Active Learning
Process Reward Models (PRMs) provide step-level supervision to large language models (LLMs), but scaling up training data annotation remains challenging for both humans and LLMs. To address this limitation, we propose an active learning ap…
Can Large Language Models Understand You Better? An MBTI Personality Detection Dataset Aligned with Population Traits
The Myers-Briggs Type Indicator (MBTI) is one of the most influential personality theories reflecting individual differences in thinking, feeling, and behaving. MBTI personality detection has garnered considerable research interest and has…
SCITAT: A Question Answering Benchmark for Scientific Tables and Text Covering Diverse Reasoning Types
Scientific question answering (SQA) is an important task aimed at answering questions based on papers. However, current SQA datasets have limited reasoning types and neglect the relevance between tables and text, creating a significant gap…
A Survey on Large Language Model-Based Social Agents in Game-Theoretic Scenarios
Game-theoretic scenarios have become pivotal in evaluating the social intelligence of Large Language Model (LLM)-based social agents. While numerous studies have explored these agents in such settings, there is a lack of a comprehensive su…
In-Context Transfer Learning: Demonstration Synthesis by Transferring Similar Tasks
In-context learning (ICL) is an effective approach to help large language models (LLMs) adapt to various tasks by providing demonstrations of the target task. Considering the high cost of labeling demonstrations, many methods propose synth…
FLEXTAF: Enhancing Table Reasoning with Flexible Tabular Formats
The table reasoning task aims to answer the question according to the given table. Currently, using Large Language Models (LLMs) is the predominant method for table reasoning. Most existing methods employ a fixed tabular format to represen…
DAC: Decomposed Automation Correction for Text-to-SQL
Text-to-SQL is an important task that helps people obtain information from databases by automatically generating SQL queries. Considering the brilliant performance, approaches based on Large Language Models (LLMs) become the mainstream for…
Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies
Research on scaling large language models (LLMs) has primarily focused on model parameters and training data size, overlooking the role of vocabulary size. We investigate how vocabulary size impacts LLM scaling laws by training models rang…
RegMix: Data Mixture as Regression for Language Model Pre-training
The data mixture for large language model pre-training significantly impacts performance, yet how to determine an effective mixture remains unclear. We propose RegMix to automatically identify a high-performing data mixture by formulating …
Exploring Equation as a Better Intermediate Meaning Representation for Numerical Reasoning of Large Language Models
Numerical reasoning is a vital capability for natural language processing models to understand and process numerical information in real-world scenarios. Most current methods first generate the Intermediate Meaning Representations (IMRs) o…
Enhancing Numerical Reasoning with the Guidance of Reliable Reasoning Processes
Numerical reasoning is an essential ability for NLP systems to handle numeric information. Recent research indicates that fine-tuning a small-scale model to learn generating reasoning processes alongside answers can significantly enhance p…
MURRE: Multi-Hop Table Retrieval with Removal for Open-Domain Text-to-SQL
The open-domain text-to-SQL task aims to retrieve question-relevant tables from massive databases and generate SQL. However, the performance of current methods is constrained by single-hop retrieval, and existing multi-hop retrieval of ope…
Improving Demonstration Diversity by Human-Free Fusing for Text-to-SQL
Currently, the in-context learning method based on large language models (LLMs) has become the mainstream of text-to-SQL research. Previous works have discussed how to select demonstrations related to the user question from a human-labeled…
A Survey of Table Reasoning with Large Language Models
Table reasoning aims to generate the answer to a question, following the user requirement, from a provided table and, optionally, a text description of the table, effectively improving the efficiency of obtain…
Exploring Equation as a Better Intermediate Meaning Representation for Numerical Reasoning
Numerical reasoning is vital for natural language processing models to understand and process numerical information in real-world scenarios. Most current methods first generate the Intermediate Meaning Representations (IMRs) of questions a…
MultiSpider: Towards Benchmarking Multilingual Text-to-SQL Semantic Parsing
Text-to-SQL semantic parsing is an important NLP task, which facilitates the interaction between users and the database. Much recent progress in text-to-SQL has been driven by large-scale datasets, but most of them are centered on English.…
Controllable Data Augmentation for Context-Dependent Text-to-SQL
The limited scale of annotated data constrains existing context-dependent text-to-SQL models because of the complexity of labeling. Data augmentation is a commonly used method to solve this problem. However, the data generated …
MixPro: Simple yet Effective Data Augmentation for Prompt-based Learning
Prompt-based learning has shown considerable promise in reformulating various downstream tasks as cloze problems by combining original input with a predetermined template. This approach demonstrates its effectiveness, especially in few-sho…
From Zero to Hero: Examining the Power of Symbolic Tasks in Instruction Tuning
Fine-tuning language models on tasks with instructions has demonstrated potential in facilitating zero-shot generalization to unseen tasks. In this paper, we introduce a straightforward yet effective method for enhancing instruction tuning…
Towards Knowledge-Intensive Text-to-SQL Semantic Parsing with Formulaic Knowledge
In this paper, we study the problem of knowledge-intensive text-to-SQL, in which domain knowledge is necessary to parse expert questions into SQL queries over domain-specific tables. We formalize this scenario by building a new Chinese ben…
A Survey on Table-and-Text HybridQA: Concepts, Methods, Challenges and Future Directions
Table-and-text hybrid question answering (HybridQA) is a widely used and challenging NLP task commonly applied in the financial and scientific domains. Early research focused on migrating methods from other QA tasks to HybridQA, while with fu…
MultiSpider: Towards Benchmarking Multilingual Text-to-SQL Semantic Parsing
Text-to-SQL semantic parsing is an important NLP task, which greatly facilitates the interaction between users and the database and becomes the key component in many human-computer interaction systems. Much recent progress in text-to-SQL h…
UniSAr: A Unified Structure-Aware Autoregressive Language Model for Text-to-SQL
Existing text-to-SQL semantic parsers are typically designed for particular settings such as handling queries that span multiple tables, domains or turns which makes them ineffective when applied to different settings. We present UniSAr (U…
Towards Knowledge-Intensive Text-to-SQL Semantic Parsing with Formulaic Knowledge
Longxu Dou, Yan Gao, Xuqi Liu, Mingyang Pan, Dingzirui Wang, Wanxiang Che, Dechen Zhan, Min-Yen Kan, Jian-Guang Lou. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. 2022.