Related papers: Multi-Method Self-Training: Improving Code Generat…

200 papers

While language models have shown remarkable performance across diverse tasks, they still encounter challenges in complex reasoning scenarios. Recent research suggests that language models trained on linearized search traces toward…

Artificial Intelligence · Computer Science 2025-10-28 Seungyong Moon, Bumsoo Park, Hyun Oh Song

Mechanisms for continued self-improvement of language models without external supervision remain an open challenge. We propose Peer-Predictive Self-Training (PST), a label-free fine-tuning framework in which multiple language models improve…

Computation and Language · Computer Science 2026-04-28 Shi Feng, Hanlin Zhang, Fan Nie, Sham Kakade, Yiling Chen

The advent of Large Language Models (LLMs) has significantly advanced the field of automated code generation. LLMs rely on large and diverse datasets to learn syntax, semantics, and usage patterns of programming languages. For low-resource…

Software Engineering · Computer Science 2025-02-03 Alessandro Giagnorio, Alberto Martin-Lopez, Gabriele Bavota

The self-improving ability of large language models (LLMs), enabled by prompting them to analyze and revise their own outputs, has garnered significant interest in recent research. However, this ability has been shown to be absent and…

Computation and Language · Computer Science 2024-04-02 Xiao Yu, Baolin Peng, Michel Galley, Jianfeng Gao, Zhou Yu

Recent advancements in large language models (LLMs) have demonstrated their remarkable capabilities across various language tasks. Inspired by the success of text-to-text translation refinement, this paper investigates how LLMs can improve…

Computation and Language · Computer Science 2025-01-28 Huaixia Dou, Xinyu Tian, Xinglin Lyu, Jie Zhu, Junhui Li, Lifan Guo

Multimodal pre-training models, such as LXMERT, have achieved excellent results in downstream tasks. However, current pre-trained models require large amounts of training data and have huge model sizes, which make them difficult to apply in…

Computation and Language · Computer Science 2021-08-02 Tongtong Liu, Fangxiang Feng, Xiaojie Wang

Reinforcement learning from human feedback (RLHF) can improve the quality of large language model (LLM) outputs by aligning them with human preferences. We propose a simple algorithm for aligning LLMs with human preferences inspired by…

Multi-modal Large Language Model (MLLM) refers to a model expanded from a Large Language Model (LLM) that possesses the capability to handle and infer multi-modal data. Current MLLMs typically begin by using LLMs to decompose tasks into…

Computation and Language · Computer Science 2023-09-01 Yongqiang Zhao, Zhenyu Li, Feng Zhang, Xinhai Xu, Donghong Liu

Recent state-of-the-art language models utilize a two-phase training procedure comprised of (i) unsupervised pre-training on unlabeled text, and (ii) fine-tuning for a specific supervised task. More recently, many studies have been focused…

Computation and Language · Computer Science 2019-11-15 Itzik Malkiel, Lior Wolf

Recent advancements in large language models (LLMs) have significantly improved code generation and program comprehension, accelerating the evolution of software engineering. Current methods primarily enhance model performance by leveraging…

Computation and Language · Computer Science 2025-07-04 Weijie Lyu, Sheng-Jun Huang, Xuan Xia

The self-training approach for large language models (LLMs) improves reasoning abilities by training the models on their self-generated rationales. Previous approaches have labeled rationales that produce correct answers for a given question as…

Machine Learning · Computer Science 2025-02-07 Jaehyeok Lee, Keisuke Sakaguchi, JinYeong Bak

In this paper, we propose a weakly supervised multilingual representation learning framework, called cross-lingual self-training (XLST). XLST is able to utilize a small amount of annotated data from high-resource languages to improve the…

Audio and Speech Processing · Electrical Eng. & Systems 2021-03-16 Zi-Qiang Zhang, Yan Song, Ming-Hui Wu, Xin Fang, Li-Rong Dai

Large language models such as GPT and Llama are trained with a next-token prediction loss. In this work, we suggest that training language models to predict multiple future tokens at once results in higher sample efficiency. More…

Computation and Language · Computer Science 2024-05-01 Fabian Gloeckle, Badr Youbi Idrissi, Baptiste Rozière, David Lopez-Paz, Gabriel Synnaeve

Current multimodal language model (MLM) training approaches overlook the influence of instruction templates. Previous research deals with this problem by leveraging hand-crafted or model-generated templates, failing to investigate the…

Computer Vision and Pattern Recognition · Computer Science 2025-04-09 Shijian Wang, Linxin Song, Jieyu Zhang, Ryotaro Shimizu, Jiarui Jin, Ao Luo, Yuan Lu, Li Yao, Cunjian Chen, Julian McAuley, Wentao Zhang, Hanqian Wu

When training multilingual machine translation (MT) models that can translate to/from multiple languages, we are faced with imbalanced training sets: some languages have much more training data than others. Standard practice is to up-sample…

Computation and Language · Computer Science 2020-09-08 Xinyi Wang, Yulia Tsvetkov, Graham Neubig

Recently, Language Models (LMs) instruction-tuned on multiple tasks, also known as multitask-prompted fine-tuning (MT), have shown the capability to generalize to unseen tasks. Previous work has shown that scaling the number of training…

Computation and Language · Computer Science 2023-02-10 Joel Jang, Seungone Kim, Seonghyeon Ye, Doyoung Kim, Lajanugen Logeswaran, Moontae Lee, Kyungjae Lee, Minjoon Seo

Language model approaches have recently been integrated into binary analysis tasks, such as function similarity detection and function signature recovery. These models typically employ a two-stage training process: pre-training via Masked…

Software Engineering · Computer Science 2024-12-24 Hanxiao Lu, Hongyu Cai, Yiming Liang, Antonio Bianchi, Z. Berkay Celik

Translate-test is a popular technique to improve the performance of multilingual language models. This approach works by translating the input into English using an external machine translation system, and running inference over the…

Computation and Language · Computer Science 2023-08-03 Julen Etxaniz, Gorka Azkune, Aitor Soroa, Oier Lopez de Lacalle, Mikel Artetxe

Pre-trained neural language models bring significant improvement for various NLP tasks, by fine-tuning the models on task-specific training sets. During fine-tuning, the parameters are initialized from pre-trained models directly, which…

Computation and Language · Computer Science 2020-09-17 Chengyu Wang, Minghui Qiu, Jun Huang, Xiaofeng He

Recent language models achieve impressive results in tasks involving complex multistep reasoning, but scaling these capabilities further traditionally requires expensive collection of more annotated data. In this work, we explore the…

Computation and Language · Computer Science 2024-10-25 Marek Kadlčík, Michal Štefánik