Related papers: LADDER: Self-Improving LLMs Through Recursive Prob…

RLSR: Reinforcement Learning from Self Reward

Large language models can generate solutions to complex problems, but training them with reinforcement learning typically requires verifiable rewards that are expensive to create and not possible for all domains. We demonstrate that LLMs…

Machine Learning · Computer Science 2025-08-08 Toby Simonds, Kevin Lopez, Akira Yoshiyama, Dominique Garmier

Can Large Language Models Invent Algorithms to Improve Themselves?: Algorithm Discovery for Recursive Self-Improvement through Reinforcement Learning

Large Language Models (LLMs) have achieved remarkable capabilities, yet their improvement methods remain fundamentally constrained by human design. We present Self-Developing, a framework that enables LLMs to autonomously discover,…

Computation and Language · Computer Science 2025-06-11 Yoichi Ishibashi, Taro Yano, Masafumi Oyamada

Thinker: Learning to Think Fast and Slow

Recent studies show that the reasoning capabilities of Large Language Models (LLMs) can be improved by applying Reinforcement Learning (RL) to question-answering (QA) tasks in areas such as math and coding. With a long context length, LLMs…

Computation and Language · Computer Science 2025-10-17 Stephen Chung, Wenyu Du, Jie Fu

TutorLLM: Customizing Learning Recommendations with Knowledge Tracing and Retrieval-Augmented Generation

The integration of AI in education offers significant potential to enhance learning efficiency. Large Language Models (LLMs), such as ChatGPT, Gemini, and Llama, allow students to query a wide range of topics, providing unprecedented…

Information Retrieval · Computer Science 2025-04-29 Zhaoxing Li, Vahid Yazdanpanah, Jindi Wang, Wen Gu, Lei Shi, Alexandra I. Cristea, Sarah Kiden, Sebastian Stein

Teaching LLM to Reason: Reinforcement Learning from Algorithmic Problems without Code

Enhancing reasoning capabilities remains a central focus in the LLM research community. A promising direction involves requiring models to simulate code execution step-by-step to derive outputs for given inputs. However, as code is often…

Computation and Language · Computer Science 2025-07-15 Keqin Bao, Nuo Chen, Xiaoyuan Li, Binyuan Hui, Bowen Yu, Fuli Feng, Xiangnan He, Dayiheng Liu

Learning Like Humans: Advancing LLM Reasoning Capabilities via Adaptive Difficulty Curriculum Learning and Expert-Guided Self-Reformulation

Despite impressive progress in areas like mathematical reasoning, large language models still face significant challenges in consistently solving complex problems. Drawing inspiration from key human learning strategies, we propose two novel…

Artificial Intelligence · Computer Science 2025-09-18 Enci Zhang, Xingang Yan, Wei Lin, Tianxiang Zhang, Qianchun Lu

RaDeR: Reasoning-aware Dense Retrieval Models

We propose RaDeR, a set of reasoning-based dense retrieval models trained with data derived from mathematical problem solving using large language models (LLMs). Our method leverages retrieval-augmented reasoning trajectories of an LLM and…

Computation and Language · Computer Science 2025-05-28 Debrup Das, Sam O'Nuallain, Razieh Rahimi

LLM Guided Inductive Inference for Solving Compositional Problems

While large language models (LLMs) have demonstrated impressive performance in question-answering tasks, their performance is limited when the questions require knowledge that is not included in the model's training data and can only be…

Computation and Language · Computer Science 2023-09-22 Abhigya Sodani, Lauren Moos, Matthew Mirman

LECTOR: LLM-Enhanced Concept-based Test-Oriented Repetition for Adaptive Spaced Learning

Spaced repetition systems are fundamental to efficient learning and memory retention, but existing algorithms often struggle with semantic interference and personalized adaptation. We present LECTOR (LLM-Enhanced…

Computation and Language · Computer Science 2025-08-06 Jiahao Zhao

LADDER: Revisiting the Cosmic Distance Ladder with Deep Learning Approaches and Exploring its Applications

We investigate the prospect of reconstructing the "cosmic distance ladder" of the Universe using a novel deep learning framework called LADDER - Learning Algorithm for Deep Distance Estimation and Reconstruction. LADDER is trained on the…

Cosmology and Nongalactic Astrophysics · Physics 2024-07-29 Rahul Shah, Soumadeep Saha, Purba Mukherjee, Utpal Garain, Supratik Pal

OpenSIR: Open-Ended Self-Improving Reasoner

Recent advances in large language model (LLM) reasoning through reinforcement learning rely on annotated datasets for verifiable rewards, which may limit models' ability to surpass human-level performance. While self-play offers a promising…

Computation and Language · Computer Science 2026-01-01 Wai-Chung Kwan, Joshua Ong Jun Leang, Pavlos Vougiouklis, Jeff Z. Pan, Marco Valentino, Pasquale Minervini

BIDER: Bridging Knowledge Inconsistency for Efficient Retrieval-Augmented LLMs via Key Supporting Evidence

Retrieval-augmented large language models (LLMs) have demonstrated efficacy in knowledge-intensive tasks such as open-domain QA, addressing inherent challenges in knowledge update and factual inadequacy. However, inconsistencies between…

Computation and Language · Computer Science 2024-05-31 Jiajie Jin, Yutao Zhu, Yujia Zhou, Zhicheng Dou

TEaR: Improving LLM-based Machine Translation with Systematic Self-Refinement

Large Language Models (LLMs) have achieved impressive results in Machine Translation (MT). However, careful evaluations by humans reveal that the translations produced by LLMs still contain multiple errors. Importantly, feeding back such…

Computation and Language · Computer Science 2024-06-24 Zhaopeng Feng, Yan Zhang, Hao Li, Bei Wu, Jiayu Liao, Wenqiang Liu, Jun Lang, Yang Feng, Jian Wu, Zuozhu Liu

S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning

Recent studies have demonstrated the effectiveness of LLM test-time scaling. However, existing approaches to incentivize LLMs' deep thinking abilities generally require large-scale data or significant training efforts. Meanwhile, it remains…

Computation and Language · Computer Science 2025-02-19 Ruotian Ma, Peisong Wang, Cheng Liu, Xingyan Liu, Jiaqi Chen, Bang Zhang, Xin Zhou, Nan Du, Jia Li

Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization

Large language models (LLMs) demonstrate outstanding performance in various tasks in machine learning and have thus become one of the most important workloads in today's computing landscape. However, deploying LLM inference poses challenges…

Machine Learning · Computer Science 2024-06-21 Jungi Lee, Wonbeom Lee, Jaewoong Sim

R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning

Existing Large Reasoning Models (LRMs) have shown the potential of reinforcement learning (RL) to enhance the complex reasoning capabilities of Large Language Models (LLMs). While they achieve remarkable performance on challenging tasks…

Artificial Intelligence · Computer Science 2025-03-19 Huatong Song, Jinhao Jiang, Yingqian Min, Jie Chen, Zhipeng Chen, Wayne Xin Zhao, Lei Fang, Ji-Rong Wen

T1: Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling

Large language models (LLMs) have demonstrated remarkable capabilities in complex reasoning tasks. However, existing approaches mainly rely on imitation learning and struggle to achieve effective test-time scaling. While reinforcement…

Machine Learning · Computer Science 2025-06-16 Zhenyu Hou, Xin Lv, Rui Lu, Jiajie Zhang, Yujiang Li, Zijun Yao, Juanzi Li, Jie Tang, Yuxiao Dong

Learning to Self-Evolve

We introduce Learning to Self-Evolve (LSE), a reinforcement learning framework that trains large language models (LLMs) to improve their own contexts at test time. We situate LSE in the setting of test-time self-evolution, where a model…

Computation and Language · Computer Science 2026-03-20 Xiaoyin Chen, Canwen Xu, Yite Wang, Boyi Liu, Zhewei Yao, Yuxiong He

Test-time Recursive Thinking: Self-Improvement without External Feedback

Modern Large Language Models (LLMs) have shown rapid improvements in reasoning capabilities, driven largely by reinforcement learning (RL) with verifiable rewards. Here, we ask whether these LLMs can self-improve without the need for…

Computation and Language · Computer Science 2026-02-04 Yufan Zhuang, Chandan Singh, Liyuan Liu, Yelong Shen, Dinghuai Zhang, Jingbo Shang, Jianfeng Gao, Weizhu Chen

TTSR: Test-Time Self-Reflection for Continual Reasoning Improvement

Test-time Training enables model adaptation using only test questions and offers a promising paradigm for improving the reasoning ability of large language models (LLMs). However, it faces two major challenges: test questions are often…

Computation and Language · Computer Science 2026-03-05 Haoyang He, Zihua Rong, Liangjie Zhao, Yunjia Zhao, Lan Yang, Honggang Zhang