Self-Refine: Iterative Refinement with Self-Feedback

Aman Madaan; Niket Tandon; Prakhar Gupta; Skyler Hallinan; Luyu Gao; Sarah Wiegreffe; Uri Alon; Nouha Dziri; Shrimai Prabhumoye; Yiming Yang; Shashank Gupta; Bodhisattwa Prasad Majumder; Katherine Hermann; Sean Welleck; Amir Yazdanbakhsh; Peter Clark

Computation and Language; Artificial Intelligence; Machine Learning (2023-05-29, v2)

Abstract

Like humans, large language models (LLMs) do not always generate the best output on their first try. Motivated by how humans refine their written text, we introduce Self-Refine, an approach for improving initial outputs from LLMs through iterative feedback and refinement. The main idea is to generate an initial output using an LLM; the same LLM then provides feedback on its own output and uses that feedback to refine the output, iteratively. Self-Refine does not require any supervised training data, additional training, or reinforcement learning, and instead uses a single LLM as the generator, refiner, and feedback provider. We evaluate Self-Refine across 7 diverse tasks, ranging from dialog response generation to mathematical reasoning, using state-of-the-art LLMs (GPT-3.5, ChatGPT, and GPT-4). Across all evaluated tasks, outputs generated with Self-Refine are preferred by humans and automatic metrics over those generated with the same LLM using conventional one-step generation, with task performance improving by ~20% absolute on average. Our work demonstrates that even state-of-the-art LLMs like GPT-4 can be further improved at test time using our simple, standalone approach.
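
The generate, feedback, refine loop described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the prompt wording, the `STOP_MARKER` string, the `max_iters` default, and the `toy_llm` stand-in (a deterministic stub used in place of a real model so the loop can be run offline) are all assumptions introduced here. The only property taken from the abstract is that a single model plays all three roles.

```python
from typing import Callable, List, Tuple

# Assumed stop signal; the abstract does not specify a stopping rule.
STOP_MARKER = "NO FURTHER FEEDBACK"


def self_refine(
    task: str,
    llm: Callable[[str], str],
    max_iters: int = 3,  # assumed iteration budget
) -> Tuple[str, List[Tuple[str, str]]]:
    """Run generate -> feedback -> refine with one model in all three roles."""
    # Step 1: initial one-step generation.
    output = llm(f"GENERATE\nTask: {task}")
    history: List[Tuple[str, str]] = []
    for _ in range(max_iters):
        # Step 2: the same model critiques its own draft.
        feedback = llm(
            f"FEEDBACK\nTask: {task}\nDraft: {output}\n"
            f"Reply '{STOP_MARKER}' if the draft needs no changes."
        )
        if STOP_MARKER in feedback:
            break
        history.append((output, feedback))
        # Step 3: the same model rewrites the draft using that feedback.
        output = llm(f"REFINE\nTask: {task}\nDraft: {output}\nFeedback: {feedback}")
    return output, history


def toy_llm(prompt: str) -> str:
    """Hypothetical deterministic stand-in for a real LLM call."""
    kind, _, body = prompt.partition("\n")
    draft = body.split("Draft: ", 1)[-1].split("\n", 1)[0]
    if kind == "GENERATE":
        return "hello"
    if kind == "FEEDBACK":
        return STOP_MARKER if draft.endswith("!") else "Add an exclamation mark."
    return draft + "!"  # REFINE: apply the (single) kind of feedback


final, history = self_refine("greet", toy_llm)
print(final, history)
```

In practice `toy_llm` would be replaced by a call to an actual model; because no training step is involved, the loop only needs a text-in, text-out function.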

Keywords

instruction tuning; large language model; large language model evaluation

Cite

@article{arxiv.2303.17651,
  title  = {Self-Refine: Iterative Refinement with Self-Feedback},
  author = {Aman Madaan and Niket Tandon and Prakhar Gupta and Skyler Hallinan and Luyu Gao and Sarah Wiegreffe and Uri Alon and Nouha Dziri and Shrimai Prabhumoye and Yiming Yang and Shashank Gupta and Bodhisattwa Prasad Majumder and Katherine Hermann and Sean Welleck and Amir Yazdanbakhsh and Peter Clark},
  journal= {arXiv preprint arXiv:2303.17651},
  year   = {2023}
}

Comments

Code, data, and demo at https://selfrefine.info/
