Related papers: Code Comparison Tuning for Code Large Language Mod…

200 papers

Large Language Models (LLMs) have shown remarkable capabilities in processing both natural and programming languages, which have enabled various applications in software engineering, such as requirement engineering, code generation, and…

Software Engineering · Computer Science 2024-01-12 Ziyu Li, Donghwan Shin

Large language models (LLMs) primarily rely on supervised fine-tuning (SFT) as a key method to adapt pre-trained models to domain-specific tasks such as mathematical reasoning. However, standard SFT uniformly penalizes all tokens,…

Computation and Language · Computer Science 2025-10-14 Zhiwen Ruan, Yixia Li, He Zhu, Yun Chen, Peng Li, Yang Liu, Guanhua Chen

Large Language Models (LLMs) have been widely adopted in commercial code completion engines, significantly enhancing coding efficiency and productivity. However, LLMs may generate code with quality issues that violate coding standards and…

Software Engineering · Computer Science 2025-03-20 Yuan Jiang, Yujian Zhang, Liang Lu, Christoph Treude, Xiaohong Su, Shan Huang, Tiantian Wang

Although LLMs are capable of generating functionally correct code, they also tend to produce less energy-efficient code in comparison to human-written solutions. As these inefficiencies lead to higher computational overhead, they are in…

Machine Learning · Computer Science 2026-04-06 Sophie Weidmann, Fernando Castor

Black-box tuning has attracted recent attention because the structure or inner parameters of advanced proprietary models are not accessible. Proxy-tuning provides a test-time output adjustment for tuning black-box language models. It…

Machine Learning · Computer Science 2024-07-02 Yuanyang He, Zitong Huang, Xinxing Xu, Rick Siow Mong Goh, Salman Khan, Wangmeng Zuo, Yong Liu, Chun-Mei Feng

Multilingual programming, which involves using multiple programming languages (PLs) in a single project, is increasingly common due to its benefits. However, it introduces cross-language bugs (CLBs), which arise from interactions between…

Software Engineering · Computer Science 2026-04-22 Zengyang Li, Yimeng Li, Binbin Huang, Peng Liang, Ran Mo, Hui Liu, Yutao Ma

Open-sourced large language models (LLMs) have demonstrated remarkable efficacy in various tasks with instruction tuning. However, these models can sometimes struggle with tasks that require more specialized knowledge such as translation.…

Computation and Language · Computer Science 2024-01-23 Jiali Zeng, Fandong Meng, Yongjing Yin, Jie Zhou

Large language models (LLMs) have shown great potential in code-related tasks, yet open-source models lag behind their closed-source counterparts. To bridge this performance gap, existing methods generate vast amounts of synthetic data for…

Computation and Language · Computer Science 2024-08-06 Weijie Lv, Xuan Xia, Sheng-Jun Huang

Real-world applications of large language models (LLMs) in computational social science (CSS) tasks primarily depend on the effectiveness of instruction tuning (IT) or in-context learning (ICL). While IT has been shown to be highly effective at…

Computation and Language · Computer Science 2024-09-24 Taihang Wang, Xiaoman Xu, Yimin Wang, Ye Jiang

Instruction tuning has been used as a promising approach to improve the performance of large language models (LLMs) on unseen tasks. However, current LLMs exhibit limited robustness to unseen instructions, generating inconsistent outputs…

Computation and Language · Computer Science 2024-06-07 Tianyi Lorena Yan, Fei Wang, James Y. Huang, Wenxuan Zhou, Fan Yin, Aram Galstyan, Wenpeng Yin, Muhao Chen

Fine-tuning Large Language Models (LLMs) typically involves updating at least a few billion parameters. A more parameter-efficient approach is Prompt Tuning (PT), which updates only a few learnable tokens, and differently, In-Context…

Computation and Language · Computer Science 2024-10-23 Tsachi Blau, Moshe Kimhi, Yonatan Belinkov, Alexander Bronstein, Chaim Baskin

A significant amount of research is focused on developing and evaluating large language models for a variety of code synthesis tasks. These include synthesizing code from natural language, synthesizing tests from code, and synthesizing…

The recent advancement of large language models (LLMs) has been achieved through a combination of instruction tuning and human alignment. However, building manually crafted instruction datasets and performing human alignment become the bottleneck…

Computation and Language · Computer Science 2023-10-05 Tao Feng, Zifeng Wang, Jimeng Sun

Continual learning (CL) is a paradigm that aims to replicate the human ability to learn and accumulate knowledge continually without forgetting previous knowledge and transferring it to new tasks. Recent instruction tuning (IT) involves…

Computation and Language · Computer Science 2023-10-24 Zihan Zhang, Meng Fang, Ling Chen, Mohammad-Reza Namazi-Rad

Large Language Models (LLMs) have transformed software development by enabling code generation, automated debugging, and complex reasoning. However, their continued advancement is constrained by the scarcity of high-quality, publicly…

Software Engineering · Computer Science 2025-08-11 Wasi Uddin Ahmad, Aleksander Ficek, Mehrzad Samadi, Jocelyn Huang, Vahid Noroozi, Somshubra Majumdar, Boris Ginsburg

Large Language Models (LLMs) exhibit significant disparities in performance across languages, primarily benefiting high-resource languages while marginalizing underrepresented ones. Continual Pretraining (CPT) has emerged as a promising…

Computation and Language · Computer Science 2025-10-09 Zihao Li, Shaoxiong Ji, Hengyu Luo, Jörg Tiedemann

The number of large language models (LLMs) with varying parameter scales and vocabularies is increasing. While they deliver powerful performance, they also face a set of common optimization needs to meet specific requirements or standards,…

Computation and Language · Computer Science 2024-10-24 Jiayi Wu, Hao Sun, Hengyi Cai, Lixin Su, Shuaiqiang Wang, Dawei Yin, Xiang Li, Ming Gao

Large Language Models (LLMs) like GPT-4o can help automate text classification tasks at low cost and scale. However, there are major concerns about the validity and reliability of LLM outputs. By contrast, human coding is generally more…

Computation and Language · Computer Science 2025-01-17 Conrad Borchers, Danielle R. Thomas, Jionghao Lin, Ralph Abboud, Kenneth R. Koedinger

Code editing encompasses a variety of pragmatic tasks that developers deal with daily. Despite its relevance and practical usefulness, automatic code editing remains an underexplored area in the evolution of deep learning models, partly due…

Computation and Language · Computer Science 2024-02-29 Kaixin Li, Qisheng Hu, Xu Zhao, Hui Chen, Yuxi Xie, Tiedong Liu, Qizhe Xie, Junxian He

With instruction tuning, Large Language Models (LLMs) can enhance their ability to adhere to commands. Diverging from most works focusing on data mixing, our study concentrates on enhancing the model's capabilities from the perspective of…

Computation and Language · Computer Science 2024-10-07 Jun Rao, Xuebo Liu, Lian Lian, Shengjun Cheng, Yunjie Liao, Min Zhang