Related papers: Optimizing Language Models for Inference Time Obje…

200 papers

Using more test-time computation during language model inference, such as generating more intermediate thoughts or sampling multiple candidate answers, has proven effective in significantly improving model performance. This paper takes an…

Machine Learning · Computer Science 2025-08-20 Xingwu Chen, Miao Lu, Beining Wu, Difan Zou

Machine learning models are often used at test-time subject to constraints and trade-offs not present at training-time. For example, a computer vision model operating on an embedded device may need to perform real-time inference, or a…

Machine Learning · Statistics 2017-02-28 Augustus Odena, Dieterich Lawson, Christopher Olah

The performance of many machine learning models depends on their hyper-parameter settings. Bayesian Optimization has become a successful tool for hyper-parameter optimization of machine learning algorithms, which aims to identify optimal…

Machine Learning · Computer Science 2020-08-04 Lidan Wang, Franck Dernoncourt, Trung Bui

The purpose of this paper is to use reinforcement learning to model learning agents which can recognize formal languages. Agents are modeled as simple multi-head automaton, a new model of finite automaton that uses multiple heads, and six…

Machine Learning · Computer Science 2020-10-21 Alper Şekerci, Özlem Salehi

Test-time scaling (TTS) has enhanced the performance of Reasoning Models (RMs) on various tasks such as math and coding, yet its efficacy in machine translation (MT) remains underexplored. This paper investigates whether increased…

Computation and Language · Computer Science 2026-01-13 Zihao Li, Shaoxiong Ji, Jörg Tiedemann

Scaling model size and training data has led to great advances in the performance of Large Language Models (LLMs). However, the diminishing returns of this approach necessitate alternative methods to improve model capabilities, particularly…

Machine Learning · Computer Science 2025-11-05 Daman Arora, Andrea Zanette

Inference-time computation offers a powerful axis for scaling the performance of language models. However, naively increasing computation in techniques like Best-of-N sampling can lead to performance degradation due to reward hacking.…

Artificial Intelligence · Computer Science 2025-04-09 Audrey Huang, Adam Block, Qinghua Liu, Nan Jiang, Akshay Krishnamurthy, Dylan J. Foster

Leveraging inference-time search in large language models has proven effective in further enhancing a trained model's capability to solve complex mathematical and reasoning problems. However, this approach significantly increases…

Machine Learning · Computer Science 2025-10-29 Tianwei Ni, Allen Nie, Sapana Chaudhary, Yao Liu, Huzefa Rangwala, Rasool Fakoor

A common paradigm to improve the performance of large language models is optimizing for a reward model. Reward models assign a numerical score to an LLM's output that indicates, for example, how likely it is to align with user preferences…

Machine Learning · Computer Science 2025-11-06 Hadi Khalaf, Claudio Mayrink Verdun, Alex Oesterling, Himabindu Lakkaraju, Flavio du Pin Calmon

Large-scale language models achieved state-of-the-art performance over a number of language tasks. However, they fail on adversarial language examples, which are sentences optimized to fool the language models but with similar semantic…

Computation and Language · Computer Science 2023-10-31 Noah Thomas McDermott, Junfeng Yang, Chengzhi Mao

Language models are now prevalent in software engineering with many developers using them to automate tasks and accelerate their development. While language models have been tremendous at accomplishing complex software engineering tasks,…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-21 Daniel Nichols, Konstantinos Parasyris, Charles Jekel, Abhinav Bhatele, Harshitha Menon

We conduct experiments on the impact of increasing inference-time compute in reasoning models (specifically OpenAI o1-preview and o1-mini) on their robustness to adversarial attacks. We find that across a variety of attacks, increased…

Sentence compression reduces the length of text by removing non-essential content while preserving important facts and grammaticality. Unsupervised objective driven methods for sentence compression can be used to create customized models…

Computation and Language · Computer Science 2022-05-18 Demian Gholipour Ghalandari, Chris Hokamp, Georgiana Ifrim

Recent years have seen significant advancements in foundation models through generative pre-training, yet algorithmic innovation in this space has largely stagnated around autoregressive models for discrete signals and diffusion models for…

Machine Learning · Computer Science 2025-03-12 Jiaming Song, Linqi Zhou

The internal structure and operation mechanism of large-scale language models are analyzed theoretically, especially how Transformer and its derivative architectures can restrict computing efficiency while capturing long-term dependencies.…

Machine Learning · Computer Science 2024-05-21 Taiyuan Mei, Yun Zi, Xiaohan Cheng, Zijun Gao, Qi Wang, Haowei Yang

Scaling test-time compute has emerged as a powerful mechanism for enhancing Large Language Model (LLM) performance. However, standard post-training paradigms, Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL), optimize the…

Machine Learning · Computer Science 2026-05-21 Adam Ousherovitch, Ambuj Tewari

Inference-time computation methods enhance the performance of Large Language Models (LLMs) by leveraging additional computational resources to achieve superior results. Common techniques, such as Best-of-N sampling, Majority Voting, and…

Computation and Language · Computer Science 2024-11-27 Chia-Yu Hung, Navonil Majumder, Ambuj Mehrish, Soujanya Poria

Reinforcement learning can greatly benefit from the use of options as a way of encoding recurring behaviours and to foster exploration. An important open problem is how can an agent autonomously learn useful options when solving particular…

Machine Learning · Computer Science 2020-01-07 Manuel Del Verme, Bruno Castro da Silva, Gianluca Baldassarre

Efficiency in optimisation and search processes persists to be one of the challenges, which affects the performance and use of optimisation algorithms. Utilising a pool of operators instead of a single operator to handle move operations…

Artificial Intelligence · Computer Science 2025-12-12 Mehmet Emin Aydin

Parameter-shared pre-trained language models (PLMs) have emerged as a successful approach in resource-constrained environments, enabling substantial reductions in model storage and memory costs without significant performance compromise.…

Computation and Language · Computer Science 2023-10-20 Weize Chen, Xiaoyue Xu, Xu Han, Yankai Lin, Ruobing Xie, Zhiyuan Liu, Maosong Sun, Jie Zhou