Related papers: Technical Report: Towards a Universal Code Formatt…

A Tool for Model-Based Language Specification

Formal languages let us define the textual representation of data with precision. Formal grammars, typically in the form of BNF-like productions, describe the language syntax, which is then annotated for syntax-directed translation and…

Software Engineering · Computer Science 2015-03-19 Luis Quesada, Fernando Berzal, Juan-Carlos Cubero

CodeTF: One-stop Transformer Library for State-of-the-art Code LLMs

Code intelligence plays a key role in transforming modern software engineering. Recently, deep learning-based models, especially Transformer-based large language models (LLMs), have demonstrated remarkable potential in tackling these tasks…

Software Engineering · Computer Science 2025-12-23 Nghi D. Q. Bui, Hung Le, Yue Wang, Junnan Li, Akhilesh Deepak Gotmare, Steven C. H. Hoi

ToolCoder: A Systematic Code-Empowered Tool Learning Framework for Large Language Models

Tool learning has emerged as a crucial capability for large language models (LLMs) to solve complex real-world tasks through interaction with external tools. Existing approaches face significant challenges, including reliance on…

Computation and Language · Computer Science 2025-06-02 Hanxing Ding, Shuchang Tao, Liang Pang, Zihao Wei, Jinyang Gao, Bolin Ding, Huawei Shen, Xueqi Cheng

Towards Robust Blind Face Restoration with Codebook Lookup Transformer

Blind face restoration is a highly ill-posed problem that often requires auxiliary guidance to 1) improve the mapping from degraded inputs to desired outputs, or 2) complement high-quality details lost in the inputs. In this paper, we…

Computer Vision and Pattern Recognition · Computer Science 2022-11-02 Shangchen Zhou, Kelvin C. K. Chan, Chongyi Li, Chen Change Loy

Applying CodeBERT for Automated Program Repair of Java Simple Bugs

Software debugging and program repair are among the most time-consuming and labor-intensive tasks in software engineering and would benefit greatly from automation. In this paper, we propose a novel automated program repair approach based…

Software Engineering · Computer Science 2021-04-01 Ehsan Mashhadi, Hadi Hemmati

Format-Adapter: Improving Reasoning Capability of LLMs by Adapting Suitable Format

Generating and voting multiple answers is an effective method to mitigate reasoning inconsistencies of large language models (LLMs). Prior works have shown that multiple reasoning formats outperform a single format when generating multiple…

Computation and Language · Computer Science 2025-07-01 Dingzirui Wang, Xuanliang Zhang, Rongyu Cao, Longxu Dou, Xianzhen Luo, Yingwei Ma, Qingfu Zhu, Wanxiang Che, Binhua Li, Fei Huang, Yongbin Li

A Comprehensive Review of State-of-The-Art Methods for Java Code Generation from Natural Language Text

Java code generation consists of automatically generating Java code from natural language text. This NLP task helps increase programmers' productivity by providing them with immediate solutions to the simplest and most repetitive…

Computation and Language · Computer Science 2023-06-13 Jessica López Espejel, Mahaman Sanoussi Yahaya Alassan, El Mehdi Chouham, Walid Dahhane, El Hassane Ettifouri

TF-Coder: Program Synthesis for Tensor Manipulations

The success and popularity of deep learning is on the rise, partially due to powerful deep learning frameworks such as TensorFlow and PyTorch that make it easier to develop deep learning models. However, these libraries also come with steep…

Programming Languages · Computer Science 2022-04-11 Kensen Shi, David Bieber, Rishabh Singh

CodeLabeller: A Web-based Code Annotation Tool for Java Design Patterns and Summaries

When constructing supervised learning models, we require labelled examples to build a corpus and train a machine learning model. However, most studies have built the labelled dataset manually, which on many occasions is a daunting task. To…

Software Engineering · Computer Science 2023-03-14 Najam Nazar, Norman Chen, Chun Yong Chong

Towards Code Generation from BDD Test Case Specifications: A Vision

Automatic code generation has recently attracted large attention and is becoming more significant to the software development process. Solutions based on Machine Learning and Artificial Intelligence are being used to increase human and…

Software Engineering · Computer Science 2023-05-22 Leon Chemnitz, David Reichenbach, Hani Aldebes, Mariam Naveed, Krishna Narasimhan, Mira Mezini

Code Generation Tools (Almost) for Free? A Study of Few-Shot, Pre-Trained Language Models on Code

Few-shot learning with large-scale, pre-trained language models is a powerful way to answer questions about code, e.g., how to complete a given code example, or even generate code snippets from scratch. The success of these models raises…

Software Engineering · Computer Science 2022-06-14 Patrick Bareiß, Beatriz Souza, Marcelo d'Amorim, Michael Pradel

CodeFusion: A Pre-trained Diffusion Model for Code Generation

Imagine a developer who can only change their last line of code. How often would they have to start writing a function from scratch before it is correct? Auto-regressive models for code generation from natural language have a similar…

Software Engineering · Computer Science 2023-11-02 Mukul Singh, José Cambronero, Sumit Gulwani, Vu Le, Carina Negreanu, Gust Verbruggen

Large Language Models for Compiler Optimization

We explore the novel application of Large Language Models to code optimization. We present a 7B-parameter transformer model trained from scratch to optimize LLVM assembly for code size. The model takes as input unoptimized assembly and…

Programming Languages · Computer Science 2023-09-14 Chris Cummins, Volker Seeker, Dejan Grubisic, Mostafa Elhoushi, Youwei Liang, Baptiste Roziere, Jonas Gehring, Fabian Gloeckle, Kim Hazelwood, Gabriel Synnaeve, Hugh Leather

Formal Fields: A Framework to Automate Code Generation Across Domains

Code generation, defined as automatically writing a piece of code to solve a given problem for which an evaluation function exists, is a classic hard AI problem. Its general form, writing code using a general language used by human…

Artificial Intelligence · Computer Science 2020-07-29 Jacques Basaldúa

ComFormer: Code Comment Generation via Transformer and Fusion Method-based Hybrid Code Representation

Developers often write low-quality code comments due to a lack of programming experience, which can reduce the efficiency of developers' program comprehension. Therefore, developers hope that code comment generation tools can be developed…

Software Engineering · Computer Science 2021-07-09 Guang Yang, Xiang Chen, Jinxin Cao, Shuyuan Xu, Zhanqi Cui, Chi Yu, Ke Liu

Type-Constrained Code Generation with Language Models

Large language models (LLMs) have achieved notable success in code generation. However, they still frequently produce uncompilable output because their next-token inference procedure does not model formal aspects of code. Although…

Machine Learning · Computer Science 2025-05-09 Niels Mündler, Jingxuan He, Hao Wang, Koushik Sen, Dawn Song, Martin Vechev

Fault-Aware Neural Code Rankers

Large language models (LLMs) have demonstrated an impressive ability to generate code for various programming tasks. In many instances, LLMs can generate a correct program for a task when given numerous trials. Consequently, a recent trend…

Programming Languages · Computer Science 2022-12-13 Jeevana Priya Inala, Chenglong Wang, Mei Yang, Andres Codas, Mark Encarnación, Shuvendu K Lahiri, Madanlal Musuvathi, Jianfeng Gao

Machine learning approach of Japanese composition scoring and writing aided system's design

An automatic scoring system is extremely complex for any language, because natural language itself is a complex model. When we evaluate articles generated by natural language, we need to view the articles from many dimensions such as word…

Computation and Language · Computer Science 2020-08-27 Wanhong Huang

CodeMapper: A Language-Agnostic Approach to Mapping Code Regions Across Commits

During software evolution, developers commonly face the problem of mapping a specific code region from one commit to another. For example, they may want to determine how the condition of an if-statement, a specific line in a configuration…

Software Engineering · Computer Science 2025-11-10 Huimin Hu, Michael Pradel

The ModelCC Model-Based Parser Generator

Formal languages let us define the textual representation of data with precision. Formal grammars, typically in the form of BNF-like productions, describe the language syntax, which is then annotated for syntax-directed translation and…

Formal Languages and Automata Theory · Computer Science 2015-01-15 Luis Quesada, Fernando Berzal, Juan-Carlos Cubero