Hongshen Xu
Developing ChemDFM as a large language foundation model for chemistry
Alignment for Efficient Tool Calling of Large Language Models
CLaw: Benchmarking Chinese Legal Knowledge in Large Language Models - A Fine-grained Corpus and Reasoning Analysis
MobA: Multifaceted Memory-Enhanced Adaptive Planning for Efficient Mobile Task Automation
Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?
Data science and engineering workflows often span multiple stages, from warehousing to orchestration, using tools like BigQuery, dbt, and Airbyte. As vision language models (VLMs) advance in multimodal understanding and code generation, VL…
Sparsity-Accelerated Training for Large Language Models
Large language models (LLMs) have demonstrated proficiency across various natural language processing (NLP) tasks but often require additional training, such as continual pre-training and supervised fine-tuning. However, the costs associat…
CoE-SQL: In-Context Learning for Multi-Turn Text-to-SQL with Chain-of-Editions
Recently, Large Language Models (LLMs) have been demonstrated to possess impressive capabilities in a variety of domains and tasks. We investigate the issue of prompt design in the multi-turn text-to-SQL task and attempt to enhance the LLM…
Multilingual Brain Surgeon: Large Language Models Can be Compressed Leaving No Language Behind
Large Language Models (LLMs) have ushered in a new era in Natural Language Processing, but their massive size demands effective compression techniques for practicality. Although numerous model compression techniques have been investigated,…
A BiRGAT Model for Multi-intent Spoken Language Understanding with Hierarchical Semantic Frames
Previous work on spoken language understanding (SLU) mainly focuses on single-intent settings, where each input utterance merely contains one user intent. This configuration significantly limits the surface form of user utterances and the …
Developing ChemDFM as a large language foundation model for chemistry
Artificial intelligence (AI) has played an increasingly important role in chemical research. However, most models currently used in chemistry are specialist models that require training and tuning for specific tasks. A more generic and eff…
ACT-SQL: In-Context Learning for Text-to-SQL with Automatically-Generated Chain-of-Thought
Recently Large Language Models (LLMs) have been proven to have strong abilities in various domains and tasks. We study the problem of prompt designing in the text-to-SQL task and attempt to improve the LLMs' reasoning ability when generati…
Large Language Models Are Semi-Parametric Reinforcement Learning Agents
Inspired by the insights in cognitive science with respect to human memory and reasoning mechanism, a novel evolvable LLM-based (Large Language Model) agent framework is proposed as REMEMBERER. By equipping the LLM with a long-term experie…
On the Structural Generalization in Text-to-SQL
Exploring the generalization of a text-to-SQL parser is essential for a system to automatically adapt the real-world databases. Previous works provided investigations focusing on lexical diversity, including the influence of the synonym an…
Exploring Schema Generalizability of Text-to-SQL
Exploring the generalizability of a text-to-SQL parser is essential for a system to automatically adapt the real-world databases. Previous investigation works mostly focus on lexical diversity, including the influence of the synonym and pe…
TIE: Topological Information Enhanced Structural Reading Comprehension on Web Pages
Recently, the structural reading comprehension (SRC) task on web pages has attracted increasing research interests. Although previous SRC work has leveraged extra information such as HTML tags or XPaths, the informative topology of web pag…
TIE: Topological Information Enhanced Structural Reading Comprehension on Web Pages
Zihan Zhao, Lu Chen, Ruisheng Cao, Hongshen Xu, Xingyu Chen, Kai Yu. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2022.
Select
The purpose of this thesis is to provide an understanding of players' moral responses toward Non-Player Characters (NPCs) in video gameplay. The main research question for this thesis is what is the difference of moral response toward diff…
Heading control method of unmanned sailing boats based on fuzzy PID
[Objectives] In order to improve the anti-jamming ability and navigation stability of the unmanned sailing boats in the changeable and unknown environment and to realize accurate control of the heading of sailing boats, a fuzzy adapt…