Cutting Through the Noise: Boosting LLM Performance on Math Word Problems

Ujjwala Anantheswaran, Himanshu Gupta, Kevin Scaria, Shreyas Verma, Chitta Baral, Swaroop Mishra

2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2406.15444

Large Language Models (LLMs) excel at various tasks, including solving math word problems (MWPs), but struggle with real-world problems containing irrelevant information. To address this, we propose a prompting framework that generates adversarial variants of MWPs by adding irrelevant variables. We introduce a dataset, PROBLEMATHIC, containing both adversarial and non-adversarial MWPs. Our experiments reveal that LLMs are susceptible to distraction by numerical noise, resulting in an average relative performance drop of ~26% on adversarial MWPs. To mitigate this, we fine-tune LLMs (Llama-2, Mistral) on the adversarial samples from our dataset. Fine-tuning on adversarial training instances improves performance on adversarial MWPs by ~8%, indicating increased robustness to noise and an improved ability to identify the data relevant for reasoning. Finally, to assess the generalizability of our prompting framework, we introduce GSM-8K-Adv, an adversarial variant of the GSM-8K benchmark. LLMs continue to struggle when faced with adversarial information, with performance dropping by up to 6%.
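As a rough illustration of the two ideas in the abstract (numerical noise in an MWP, and a relative performance drop), here is a minimal sketch. It is not the paper's prompting framework: the helper names `add_distractor` and `relative_drop`, the hand-written distractor sentence, and the accuracy values are all hypothetical. "Relative" is assumed to mean the drop divided by accuracy on the non-adversarial problems.

```python
def add_distractor(problem: str, distractor: str) -> str:
    """Insert an irrelevant sentence just before the final question.

    Assumption: the last sentence of `problem` is the question, and
    sentences are separated by ". ". A hand-written distractor stands in
    for the LLM-generated noise described in the abstract.
    """
    body, sep, question = problem.rstrip().rpartition(". ")
    if not sep:
        # Single-sentence problem: put the distractor in front.
        return f"{distractor} {problem}"
    return f"{body}. {distractor} {question}"


def relative_drop(clean_acc: float, adv_acc: float) -> float:
    """Drop in accuracy as a fraction of the clean accuracy (assumed definition)."""
    if clean_acc <= 0:
        raise ValueError("clean_acc must be positive")
    return (clean_acc - adv_acc) / clean_acc


# Hypothetical problem; the answer (8) does not depend on the pencils.
clean = "Sam has 5 apples. He buys 3 more apples. How many apples does Sam have?"
noisy = add_distractor(clean, "His sister owns 7 pencils.")
print(noisy)

# Hypothetical accuracies, chosen only to show the arithmetic.
print(f"{relative_drop(0.50, 0.37):.0%}")
```

With these made-up accuracies, an absolute drop of 13 points corresponds to a relative drop of 26%, which shows why the two figures should not be conflated.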

Concepts

Robustness (evolution) · Mathematics · Computer science · Mathematics education · Biology · Gene · Biochemistry

Metadata

Type: preprint
Language: en
Landing Page: http://arxiv.org/abs/2406.15444
PDF: https://arxiv.org/pdf/2406.15444
OA Status: green
Related Works: 10
OpenAlex ID: https://openalex.org/W4400025230

