Solving Quantitative Reasoning Problems with Language Models

Aitor Lewkowycz; Anders Andreassen; David Dohan; Ethan Dyer; Henryk Michalewski; Vinay Ramasesh; Ambrose Slone; Cem Anil; Imanol Schlag; Theo Gutman-Solo; Yuhuai Wu; Behnam Neyshabur; Guy Gur-Ari; Vedant Misra

Computation and Language; Artificial Intelligence; Machine Learning (2022-07-04, v2)

Abstract

Language models have achieved remarkable performance on a wide range of tasks that require natural language understanding. Nevertheless, state-of-the-art models have generally struggled with tasks that require quantitative reasoning, such as solving mathematics, science, and engineering problems at the college level. To help close this gap, we introduce Minerva, a large language model pretrained on general natural language data and further trained on technical content. The model achieves state-of-the-art performance on technical benchmarks without the use of external tools. We also evaluate our model on over two hundred undergraduate-level problems in physics, biology, chemistry, economics, and other sciences that require quantitative reasoning, and find that the model can correctly answer nearly a third of them.

Keywords

logical reasoning; language modeling; large language model

Cite

@article{arxiv.2206.14858,
  title  = {Solving Quantitative Reasoning Problems with Language Models},
  author = {Aitor Lewkowycz and Anders Andreassen and David Dohan and Ethan Dyer and Henryk Michalewski and Vinay Ramasesh and Ambrose Slone and Cem Anil and Imanol Schlag and Theo Gutman-Solo and Yuhuai Wu and Behnam Neyshabur and Guy Gur-Ari and Vedant Misra},
  journal= {arXiv preprint arXiv:2206.14858},
  year   = {2022}
}

Comments

12 pages, 5 figures + references and appendices
