Related papers: SELF: Self-Evolution with Language Feedback

A Survey on Self-Evolution of Large Language Models

Large language models (LLMs) have significantly advanced in various fields and intelligent agent applications. However, current LLMs that learn from human or external model supervision are costly and may face performance ceilings as task…

Computation and Language · Computer Science 2024-06-04 Zhengwei Tao, Ting-En Lin, Xiancai Chen, Hangyu Li, Yuchuan Wu, Yongbin Li, Zhi Jin, Fei Huang, Dacheng Tao, Jingren Zhou

Self-Improvement of Large Language Models: A Technical Overview and Future Outlook

As large language models (LLMs) continue to advance, improving them solely through human supervision is becoming increasingly costly and limited in scalability. As models approach human-level capabilities in certain domains, human feedback…

Computation and Language · Computer Science 2026-03-27 Haoyan Yang, Mario Xerri, Solha Park, Huajian Zhang, Yiyang Feng, Sai Akhil Kogilathota, Jiawei Zhou

Self-Refine: Iterative Refinement with Self-Feedback

Like humans, large language models (LLMs) do not always generate the best output on their first try. Motivated by how humans refine their written text, we introduce Self-Refine, an approach for improving initial outputs from LLMs through…

Computation and Language · Computer Science 2023-05-29 Aman Madaan, Niket Tandon, Prakhar Gupta, Skyler Hallinan, Luyu Gao, Sarah Wiegreffe, Uri Alon, Nouha Dziri, Shrimai Prabhumoye, Yiming Yang, Shashank Gupta, Bodhisattwa Prasad Majumder, Katherine Hermann, Sean Welleck, Amir Yazdanbakhsh, Peter Clark

Learning to Self-Evolve

We introduce Learning to Self-Evolve (LSE), a reinforcement learning framework that trains large language models (LLMs) to improve their own contexts at test time. We situate LSE in the setting of test-time self-evolution, where a model…

Computation and Language · Computer Science 2026-03-20 Xiaoyin Chen, Canwen Xu, Yite Wang, Boyi Liu, Zhewei Yao, Yuxiong He

The Path of Self-Evolving Large Language Models: Achieving Data-Efficient Learning via Intrinsic Feedback

Reinforcement learning (RL) has demonstrated potential in enhancing the reasoning capabilities of large language models (LLMs), but such training typically demands substantial efforts in creating and annotating data. In this work, we…

Computation and Language · Computer Science 2025-10-06 Hangfan Zhang, Siyuan Xu, Zhimeng Guo, Huaisheng Zhu, Shicheng Liu, Xinrun Wang, Qiaosheng Zhang, Yang Chen, Peng Ye, Lei Bai, Shuyue Hu

Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning

We explore a method for improving the performance of large language models through self-reflection and reinforcement learning. By incentivizing the model to generate better self-reflections when it answers incorrectly, we demonstrate that a…

Computation and Language · Computer Science 2025-06-02 Shelly Bensal, Umar Jamil, Christopher Bryant, Melisa Russak, Kiran Kamble, Dmytro Mozolevskyi, Muayad Ali, Waseem AlShikh

Self-Adapting Language Models

Large language models (LLMs) are powerful but static; they lack mechanisms to adapt their weights in response to new tasks, knowledge, or examples. We introduce Self-Adapting LLMs (SEAL), a framework that enables LLMs to self-adapt by…

Machine Learning · Computer Science 2025-09-19 Adam Zweiger, Jyothish Pari, Han Guo, Ekin Akyürek, Yoon Kim, Pulkit Agrawal

Can Large Language Models Invent Algorithms to Improve Themselves?: Algorithm Discovery for Recursive Self-Improvement through Reinforcement Learning

Large Language Models (LLMs) have achieved remarkable capabilities, yet their improvement methods remain fundamentally constrained by human design. We present Self-Developing, a framework that enables LLMs to autonomously discover,…

Computation and Language · Computer Science 2025-06-11 Yoichi Ishibashi, Taro Yano, Masafumi Oyamada

From Generic Empathy to Personalized Emotional Support: A Self-Evolution Framework for User Preference Alignment

Effective emotional support hinges on understanding users' emotions and needs to provide meaningful comfort during multi-turn interactions. Large Language Models (LLMs) show great potential for expressing empathy; however, they often…

Computation and Language · Computer Science 2025-05-23 Jing Ye, Lu Xiang, Yaping Zhang, Chengqing Zong

Evolving LLMs' Self-Refinement Capability via Synergistic Training-Inference Optimization

Self-Refinement refers to a model's ability to revise its own responses to produce improved outputs. This capability can also serve as a fundamental mechanism for Self-Improvement, for example, by reconstructing datasets with refined…

Computation and Language · Computer Science 2025-10-28 Yongcheng Zeng, Xinyu Cui, Xuanfa Jin, Qirui Mi, Guoqing Liu, Zexu Sun, Mengyue Yang, Dong Li, Weiyu Ma, Ning Yang, Jian Zhao, Jianye Hao, Haifeng Zhang, Jun Wang

DUAL-REFLECT: Enhancing Large Language Models for Reflective Translation through Dual Learning Feedback Mechanisms

Recently, large language models (LLMs) enhanced by self-reflection have achieved promising performance on machine translation. The key idea is guiding LLMs to generate translation with human-like feedback. However, existing self-reflection…

Computation and Language · Computer Science 2024-06-24 Andong Chen, Lianzhang Lou, Kehai Chen, Xuefeng Bai, Yang Xiang, Muyun Yang, Tiejun Zhao, Min Zhang

Self-Evolved Reward Learning for LLMs

Reinforcement Learning from Human Feedback (RLHF) is a crucial technique for aligning language models with human preferences, playing a pivotal role in the success of conversational models like GPT-4, ChatGPT, and Llama 2. A core challenge…

Computation and Language · Computer Science 2025-06-04 Chenghua Huang, Zhizhen Fan, Lu Wang, Fangkai Yang, Pu Zhao, Zeqi Lin, Qingwei Lin, Dongmei Zhang, Saravan Rajmohan, Qi Zhang

Beyond Human Data: Aligning Multimodal Large Language Models by Iterative Self-Evolution

Human preference alignment can greatly enhance Multimodal Large Language Models (MLLMs), but collecting high-quality preference data is costly. A promising solution is the self-evolution strategy, where models are iteratively trained on…

Machine Learning · Computer Science 2024-12-23 Wentao Tan, Qiong Cao, Yibing Zhan, Chao Xue, Changxing Ding

Large Language Models Cannot Self-Correct Reasoning Yet

Large Language Models (LLMs) have emerged as a groundbreaking technology with their unparalleled text generation capabilities across various applications. Nevertheless, concerns persist regarding the accuracy and appropriateness of their…

Computation and Language · Computer Science 2024-03-15 Jie Huang, Xinyun Chen, Swaroop Mishra, Huaixiu Steven Zheng, Adams Wei Yu, Xinying Song, Denny Zhou

Learning to Learn from Language Feedback with Social Meta-Learning

Large language models (LLMs) often struggle to learn from corrective feedback within a conversational context. They are rarely proactive in soliciting this feedback, even when faced with ambiguity, which can make their dialogues feel…

Computation and Language · Computer Science 2026-02-19 Jonathan Cook, Diego Antognini, Martin Klissarov, Claudiu Musat, Edward Grefenstette

Post-Training Large Language Models via Reinforcement Learning from Self-Feedback

Large Language Models (LLMs) often produce plausible but poorly-calibrated answers, limiting their reliability on reasoning-intensive tasks. We present Reinforcement Learning from Self-Feedback (RLSF), a post-training stage that uses the…

Computation and Language · Computer Science 2025-07-30 Carel van Niekerk, Renato Vukovic, Benjamin Matthias Ruppik, Hsien-chin Lin, Milica Gašić

Learn Like Humans: Use Meta-cognitive Reflection for Efficient Self-Improvement

While Large Language Models (LLMs) enable complex autonomous behavior, current agents remain constrained by static, human-designed prompts that limit adaptability. Existing self-improving frameworks attempt to bridge this gap but typically…

Artificial Intelligence · Computer Science 2026-01-21 Xinmeng Hou, Peiliang Gong, Bohao Qu, Wuqi Wang, Qing Guo, Yang Liu

MAF: Multi-Aspect Feedback for Improving Reasoning in Large Language Models

Language Models (LMs) have shown impressive performance in various natural language tasks. However, when it comes to natural language reasoning, LMs still face challenges such as hallucination, generating incorrect intermediate reasoning…

Computation and Language · Computer Science 2023-10-20 Deepak Nathani, David Wang, Liangming Pan, William Yang Wang

Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge

Large Language Models (LLMs) are rapidly surpassing human knowledge in many domains. While improving these models traditionally relies on costly human data, recent self-rewarding mechanisms (Yuan et al., 2024) have shown that LLMs can…

Computation and Language · Computer Science 2024-07-31 Tianhao Wu, Weizhe Yuan, Olga Golovneva, Jing Xu, Yuandong Tian, Jiantao Jiao, Jason Weston, Sainbayar Sukhbaatar

A Framework for Fine-Tuning LLMs using Heterogeneous Feedback

Large language models (LLMs) have been applied to a wide range of tasks, including text summarization, web navigation, and chatbots. They have benefitted from supervised fine-tuning (SFT) and reinforcement learning from human feedback…

Computation and Language · Computer Science 2024-08-07 Ryan Aponte, Ryan A. Rossi, Shunan Guo, Franck Dernoncourt, Tong Yu, Xiang Chen, Subrata Mitra, Nedim Lipka