How Much Do Prompting Methods Help LLMs on Quantitative Reasoning with Irrelevant Information?

Song, Seok Hwan; Tavanapong, Wallapak

Date

2024-10-21

Authors

Song, Seok Hwan

Tavanapong, Wallapak

Publisher

Association for Computing Machinery

Abstract

Real-world quantitative reasoning problems are complex and often include extra information that is irrelevant to the question ("IR noise" for short). State-of-the-art (SOTA) prompting methods have improved the ability of Large Language Models (LLMs) to perform quantitative reasoning on grade-school Math Word Problems (MWPs). To assess how well these SOTA methods handle IR noise, we constructed four new datasets, each consisting of 300 problems drawn from one of four public datasets (MAWPS, ASDiv, SVAMP, and GSM8K) with IR noise added. We call the collection of these new datasets "MPN" (Math Word Problems with IR Noise). We evaluated SOTA prompting methods using MPN and propose Noise Reduction Prompting (NRP) and its variant (NRP+) to reduce the impact of IR noise. Findings: Our IR noise significantly degrades the performance of Chain-of-Thought (CoT) Prompting on three different backend models: ChatGPT (gpt-3.5-turbo-0613), PaLM2, and Llama3-8B-instruct. Among them, ChatGPT offers the best accuracy on MPN problems with and without the added IR noise. With IR noise, the performance of CoT, Least-To-Most Prompting, Progressive-Hint Prompting, and Program-aided Language Models with ChatGPT was significantly impacted, each with an average accuracy drop of more than 12%. NRP is the least impacted by the noise, with a drop in average accuracy of only around 1.9%. NRP+ and NRP perform comparably in the presence of IR noise.

Academic or Administrative Unit

Department of Computer Science

Type

conference presentation

Comments

This proceeding is published as Seok Hwan Song and Wallapak Tavanapong. 2024. How Much Do Prompting Methods Help LLMs on Quantitative Reasoning with Irrelevant Information? In Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (CIKM '24). Association for Computing Machinery, New York, NY, USA, 2128–2137. https://doi.org/10.1145/3627673.3679840

Rights Statement