Related papers: Test-time Recursive Thinking: Self-Improvement wit…

200 papers

Large Language Models (LLMs) have exhibited remarkable performance across various natural language processing (NLP) tasks. However, fine-tuning these models often necessitates substantial supervision, which can be expensive and…

Computation and Language · Computer Science · 2023-05-25 · Jing-Cheng Pang, Pengyuan Wang, Kaiyuan Li, Xiong-Hui Chen, Jiacheng Xu, Zongzhang Zhang, Yang Yu

This paper investigates Reinforcement Learning (RL) on data without explicit labels for reasoning tasks in Large Language Models (LLMs). The core challenge of the problem is reward estimation during inference while not having access to…

Reinforcement learning (RL) is central to improving reasoning in large language models (LLMs) but typically requires ground-truth rewards. Test-Time Reinforcement Learning (TTRL) removes this need by using majority-vote rewards, but relies…

Machine Learning · Computer Science · 2025-10-06 · Aleksei Arzhantsev, Otmane Sakhi, Flavian Vasile

Verification-guided self-improvement has recently emerged as a promising approach to improving the accuracy of large language model (LLM) outputs. However, existing approaches face a trade-off between inference efficiency and accuracy:…

Computation and Language · Computer Science · 2026-03-24 · Yuran Li, Di Wu, Benoit Boulet

Large Language Models (LLMs) have achieved excellent performance in various tasks. However, fine-tuning an LLM requires extensive supervision. Humans, on the other hand, may improve their reasoning abilities by self-thinking without…

Computation and Language · Computer Science · 2022-10-26 · Jiaxin Huang, Shixiang Shane Gu, Le Hou, Yuexin Wu, Xuezhi Wang, Hongkun Yu, Jiawei Han

Large language models (LLMs) have demonstrated outstanding performance across various tasks, yet they still exhibit limitations such as hallucination, unfaithful reasoning, and toxic content. One potential approach to mitigate these issues…

Computation and Language · Computer Science · 2024-07-19 · Yuxuan Yao, Han Wu, Zhijiang Guo, Biyan Zhou, Jiahui Gao, Sichun Luo, Hanxu Hou, Xiaojin Fu, Linqi Song

Self-improving large language models (LLMs), i.e., improving the performance of an LLM by fine-tuning it with synthetic data generated by itself, is a promising way to advance the capabilities of LLMs while avoiding extensive…

Computation and Language · Computer Science · 2025-02-20 · Yutao Sun, Mingshuai Chen, Tiancheng Zhao, Ruochen Xu, Zilun Zhang, Jianwei Yin

Reinforcement learning (RL) has demonstrated potential in enhancing the reasoning capabilities of large language models (LLMs), but such training typically demands substantial efforts in creating and annotating data. In this work, we…

Computation and Language · Computer Science · 2025-10-06 · Hangfan Zhang, Siyuan Xu, Zhimeng Guo, Huaisheng Zhu, Shicheng Liu, Xinrun Wang, Qiaosheng Zhang, Yang Chen, Peng Ye, Lei Bai, Shuyue Hu

Recent advancements in large language models (LLMs) have demonstrated that progressive refinement, rather than providing a single answer, results in more accurate and thoughtful outputs. However, existing methods often rely heavily on…

Computation and Language · Computer Science · 2024-10-18 · Chengyu Du, Jinyi Han, Yizhou Ying, Aili Chen, Qianyu He, Haokun Zhao, Sirui Xia, Haoran Guo, Jiaqing Liang, Zulong Chen, Liangyue Li, Yanghua Xiao

Recent advancements in large language models (LLMs) have catalyzed the development of general-purpose autonomous agents, demonstrating remarkable performance in complex reasoning tasks across various domains. This surge has spurred the…

Computation and Language · Computer Science · 2025-04-18 · Nearchos Potamitis, Akhil Arora

Despite the success of large language models (LLMs) in various natural language processing (NLP) tasks, the stored knowledge in these models may inevitably be incomplete, out-of-date, or incorrect. This motivates the need to utilize…

Computation and Language · Computer Science · 2023-01-03 · Hangfeng He, Hongming Zhang, Dan Roth

Test-time Training enables model adaptation using only test questions and offers a promising paradigm for improving the reasoning ability of large language models (LLMs). However, it faces two major challenges: test questions are often…

Computation and Language · Computer Science · 2026-03-05 · Haoyang He, Zihua Rong, Liangjie Zhao, Yunjia Zhao, Lan Yang, Honggang Zhang

Self-reflection for Large Language Models (LLMs) has gained significant attention. Existing approaches involve models iterating and improving their previous responses based on LLMs' internal reflection ability or external feedback. However,…

Computation and Language · Computer Science · 2025-03-04 · Liping Liu, Chunhong Zhang, Likang Wu, Chuang Zhao, Zheng Hu, Ming He, Jianping Fan

Like humans, large language models (LLMs) do not always generate the best output on their first try. Motivated by how humans refine their written text, we introduce Self-Refine, an approach for improving initial outputs from LLMs through…

Recent advances in test-time scaling have led to the emergence of thinking LLMs that exhibit self-reflective behaviors and multi-step reasoning. While RL drives this self-improvement paradigm, a recent study (Gandhi et al., 2025) shows that…

Artificial Intelligence · Computer Science · 2025-08-22 · Aswin RRV, Jacob Dineen, Divij Handa, Md Nayem Uddin, Mihir Parmar, Chitta Baral, Ben Zhou

Large language models (LLMs) remain prone to factual inaccuracies and computational errors, including hallucinations and mistakes in mathematical reasoning. Recent work augmented LLMs with tools to mitigate these shortcomings, but often…

Computation and Language · Computer Science · 2025-02-11 · Ne Luo, Aryo Pradipta Gema, Xuanli He, Emile van Krieken, Pietro Lesci, Pasquale Minervini

Iterative self-improvement fine-tunes an autoregressive large language model (LLM) on reward-verified outputs generated by the LLM itself. In contrast to the empirical success of self-improvement, the theoretical foundation of this…

Machine Learning · Computer Science · 2026-03-23 · Chenruo Liu, Yijun Dong, Yiqiu Shen, Qi Lei

Self-training approaches for large language models (LLMs) improve reasoning abilities by training the models on their self-generated rationales. Previous approaches have labeled rationales that produce correct answers for a given question as…

Machine Learning · Computer Science · 2025-02-07 · Jaehyeok Lee, Keisuke Sakaguchi, JinYeong Bak

Can language models improve their reasoning performance without external rewards, using only their own sampled responses for training? We show that they can. We propose Self-evolving Post-Training (SePT), a simple post-training method that…

Machine Learning · Computer Science · 2026-05-18 · Mengqi Li, Lei Zhao, Anthony Man-Cho So, Ruoyu Sun, Xiao Li

Large Language Models (LLMs) have emerged as a groundbreaking technology with their unparalleled text generation capabilities across various applications. Nevertheless, concerns persist regarding the accuracy and appropriateness of their…

Computation and Language · Computer Science · 2024-03-15 · Jie Huang, Xinyun Chen, Swaroop Mishra, Huaixiu Steven Zheng, Adams Wei Yu, Xinying Song, Denny Zhou