English
Related papers

Related papers: GramTrans: A Better Code Representation Approach i…

200 papers

Grammar serves as a cornerstone in programming languages and software engineering, providing frameworks to define the syntactic space and program structure. Existing research demonstrates the effectiveness of grammar-based code…

Programming Languages · Computer Science 2025-12-11 Qingyuan Liang , Zhao Zhang , Zeyu Sun , Zheng Lin , Qi Luo , Yueyi Xiao , Yizhou Chen , Yuqun Zhang , Haotian Zhang , Lu Zhang , Bin Chen , Yingfei Xiong

Representation learning of source code is essential for applying machine learning to software engineering tasks. Learning code representation from a multilingual source code dataset has been shown to be more effective than learning from…

Software Engineering · Computer Science 2022-12-07 Weiguo Pian , Hanyu Peng , Xunzhu Tang , Tiezhu Sun , Haoye Tian , Andrew Habib , Jacques Klein , Tegawendé F. Bissyandé

Code transformation is a foundational capability in the software development process, where its effectiveness relies on constructing a high-quality code representation to characterize the input code semantics and guide the transformation.…

Machine Learning · Computer Science 2026-03-17 Weichen Li , Jiamin Song , Bogdan Alexandru Stoica , Arav Dhoot , Gabriel Ryan , Shengyu Fu , Kexin Pei

Large language models (LLMs) have achieved notable success in code generation. However, they still frequently produce uncompilable output because their next-token inference procedure does not model formal aspects of code. Although…

Machine Learning · Computer Science 2025-05-09 Niels Mündler , Jingxuan He , Hao Wang , Koushik Sen , Dawn Song , Martin Vechev

Deep learning is widely used to uncover hidden patterns in large code corpora. To achieve this, constructing a format that captures the relevant characteristics and features of source code is essential. Graph-based representations have…

Software Engineering · Computer Science 2024-02-01 Mootez Saad , Tushar Sharma

Code summarization and generation empower conversion between programming language (PL) and natural language (NL), while code translation avails the migration of legacy code from one PL to another. This paper introduces PLBART, a…

Computation and Language · Computer Science 2021-04-13 Wasi Uddin Ahmad , Saikat Chakraborty , Baishakhi Ray , Kai-Wei Chang

Recent studies have shown that code language models at scale demonstrate significant performance gains on downstream tasks, i.e., code generation. However, most of the existing works on code representation learning train models at a hundred…

Computation and Language · Computer Science 2024-02-06 Dejiao Zhang , Wasi Ahmad , Ming Tan , Hantian Ding , Ramesh Nallapati , Dan Roth , Xiaofei Ma , Bing Xiang

Since the introduction of Large Language Models (LLMs), they have been widely adopted for various tasks such as text summarization, question answering, speech-to-text translation, and more. In recent times, the use of LLMs for code…

Software Engineering · Computer Science 2026-01-22 Krishna Vamshi Bodla , Haizhao Yang

Code search is a widely used technique by developers during software development. It provides semantically similar implementations from a large code corpus to developers based on their queries. Existing techniques leverage deep learning…

Software Engineering · Computer Science 2022-02-17 Weisong Sun , Chunrong Fang , Yuchen Chen , Guanhong Tao , Tingxu Han , Quanjun Zhang

Most large language models (LLMs) today excel at generating raw, sequential code with minimal abstractions and custom structures. However, there has been little work on graph-based abstract code generation, where significant logic is…

Computation and Language · Computer Science 2025-10-16 Nyx Iskandar , Hisham Bedri , Andy Tsen

Formal languages let us define the textual representation of data with precision. Formal grammars, typically in the form of BNF-like productions, describe the language syntax, which is then annotated for syntax-directed translation and…

Formal Languages and Automata Theory · Computer Science 2015-01-15 Luis Quesada , Fernando Berzal , Juan-Carlos Cubero

Programming is a powerful and ubiquitous problem-solving tool. Developing systems that can assist programmers or even generate programs independently could make programming more productive and accessible, yet so far incorporating…

Large pre-trained language models have recently been expanded and applied to programming language tasks with great success, often through further pre-training of a strictly-natural language model--where training sequences typically contain…

Computation and Language · Computer Science 2024-02-13 Fenia Christopoulou , Guchun Zhang , Gerasimos Lampouras

Large Language Models (LLMs) have demonstrated great promise in generating code, especially when used inside an evolutionary computation framework to iteratively optimize the generated algorithms. However, in some cases they fail to…

Neural and Evolutionary Computing · Computer Science 2025-03-24 Niki van Stein , Anna V. Kononova , Lars Kotthoff , Thomas Bäck

Large language models have gained significant traction and popularity in recent times, extending their usage to code-generation tasks. While this field has garnered considerable attention, the exploration of testing and evaluating the…

Software Engineering · Computer Science 2026-05-05 Fazle Rabbi , Zishuo Ding , Jinqiu Yang

Program synthesis or code generation aims to generate a program that satisfies a problem specification. Recent approaches using large-scale pretrained language models (LMs) have shown promising results, yet they have some critical…

Machine Learning · Computer Science 2022-11-04 Hung Le , Yue Wang , Akhilesh Deepak Gotmare , Silvio Savarese , Steven C. H. Hoi

Currently, a growing number of mature natural language processing applications make people's life more convenient. Such applications are built by source code - the language in software engineering. However, the applications for…

Software Engineering · Computer Science 2021-05-13 Ahmed Elnaggar , Wei Ding , Llion Jones , Tom Gibbs , Tamas Feher , Christoph Angerer , Silvia Severini , Florian Matthes , Burkhard Rost

Recent studies have adopted pre-trained language models, such as CodeT5 and CodeGPT, for automated program generation tasks like code generation, repair, and translation. Numerous language model-based approaches have been proposed and…

Software Engineering · Computer Science 2024-01-09 Yue Liu , Chakkrit Tantithamthavorn , Yonghui Liu , Li Li

Large Language Models (LLMs) have recently made significant advances in code generation through the 'Chain-of-Thought' prompting technique. This technique empowers the model to autonomously devise "solution plans" to tackle intricate…

Software Engineering · Computer Science 2024-03-21 Zhihong Sun , Chen Lyu , Bolun Li , Yao Wan , Hongyu Zhang , Ge Li , Zhi Jin

Deep learning methods, which have found successful applications in fields like image classification and natural language processing, have recently been applied to source code analysis too, due to the enormous amount of freely available…

Software Engineering · Computer Science 2021-11-18 Rocìo Cabrera Lozoya , Arnaud Baumann , Antonino Sabetta , Michele Bezzi
‹ Prev 1 2 3 10 Next ›