Yebowen Hu
STRUX: An LLM for Decision-Making with Structured Explanations
Countless decisions shape our daily lives, and it is paramount to understand the how and why behind these choices. In this paper, we introduce a new LLM decision-making framework called STRUX, which enhances LLM decision-making by providin…
When Reasoning Meets Information Aggregation: A Case Study with Sports Narratives
Reasoning is most powerful when an LLM accurately aggregates relevant information. We examine the critical role of information aggregation in reasoning by requiring the LLM to analyze sports narratives. To succeed at this task, an LLM must…
BadRAG: Identifying Vulnerabilities in Retrieval Augmented Generation of Large Language Models
Large Language Models (LLMs) are constrained by outdated information and a tendency to generate incorrect data, commonly referred to as "hallucinations." Retrieval-Augmented Generation (RAG) addresses these limitations by combining the str…
Can Large Language Models do Analytical Reasoning?
This paper explores the cutting-edge Large Language Model with analytical reasoning on sports. Our analytical reasoning embodies the tasks of letting large language models count how many points each team scores in a quarter in the NBA and …
SportsMetrics: Blending Text and Numerical Data to Understand Information Fusion in LLMs
Large language models hold significant potential for integrating various data types, such as text documents and database records, for advanced analytics. However, blending text and numerical data presents substantial challenges. LLMs need …
DecipherPref: Analyzing Influential Factors in Human Preference Judgments via GPT-4
Human preference judgments are pivotal in guiding large language models (LLMs) to produce outputs that align with human values. Human evaluations are also used in summarization tasks to compare outputs from various systems, complementing e…