Related papers: DocCGen: Document-based Controlled Code Generation

200 papers

Large language models (LLMs) can be used to support software development tasks, e.g., through code completion or code generation. However, their effectiveness drops significantly when considering less popular programming languages such as…

Software Engineering · Computer Science 2026-03-06 David Delgado, Lola Burgueño, Robert Clarisó

Recent work shows Large Language Models (LLMs) struggle to understand natural language constraints for various text generation tasks in zero- and few-shot settings. While, in the code domain, there is wide usage of constraints in code…

Software Engineering · Computer Science 2025-03-25 Mehant Kammakomati, Sameer Pimparkhede, Srikanth Tamilselvam, Prince Kumar, Pushpak Bhattacharyya

Large Language Models (LLMs) have shown impressive capabilities in code generation for popular programming languages. However, their performance on Low-Resource Programming Languages (LRPLs) and Domain-Specific Languages (DSLs) remains a…

Software Engineering · Computer Science 2025-09-29 Sathvik Joel, Jie JW Wu, Fatemeh H. Fard

Large language models (LLMs) are changing the way researchers interact with code and data in scientific computing. While their ability to generate general-purpose code is well established, their effectiveness in producing scientifically…

Software Engineering · Computer Science 2026-05-25 Ethan Holbrook, Juan C. Verduzco, Alejandro Strachan

Large language models (LLMs) such as ChatGPT have shown remarkable capabilities in code generation. Despite significant achievements, they rely on enormous training data to acquire a broad spectrum of open-domain knowledge. Besides, their…

Software Engineering · Computer Science 2025-02-18 Xiaodong Gu, Meng Chen, Yalan Lin, Yuhan Hu, Hongyu Zhang, Chengcheng Wan, Zhao Wei, Yong Xu, Juhong Wang

Large language models (LLMs) have revolutionized code generation, automating programming with remarkable efficiency. However, these advancements challenge programming skills, ethics, and assessment integrity, making the detection of…

Computation and Language · Computer Science 2025-07-18 Daniil Orel, Dilshod Azizov, Preslav Nakov

We propose a method to guide Large Language Models (LLMs) in generating structured content adhering to specific conventions without fine-tuning. By utilizing coroutine-based content generation constraints through a pre-agreed context-free…

Software Engineering · Computer Science 2024-04-23 Jiaye Wang

Publicly available source-code libraries are continuously growing and changing. This makes it impossible for models of code to keep current with all available APIs by simply training these models on existing code repositories. Thus,…

Computation and Language · Computer Science 2023-02-21 Shuyan Zhou, Uri Alon, Frank F. Xu, Zhiruo Wang, Zhengbao Jiang, Graham Neubig

While large language models (LLMs) have been widely applied to code generation, they struggle with generating entire deep learning projects, which are characterized by complex structures, longer functions, and stronger reliance on domain…

Software Engineering · Computer Science 2025-04-22 Chen Xie, Mingsheng Jiao, Xiaodong Gu, Beijun Shen

Large Language Models (LLMs) have garnered remarkable advancements across diverse code-related tasks, known as Code LLMs, particularly in code generation that generates source code with LLM from natural language descriptions. This…

Computation and Language · Computer Science 2025-10-28 Juyong Jiang, Fan Wang, Jiasi Shen, Sungju Kim, Sunghun Kim

Large Language Models (LLMs) such as GPT-4 and Llama3 have significantly impacted various fields by enabling high-quality synthetic data generation and reducing dependence on expensive human-generated datasets. Despite this, challenges…

Computation and Language · Computer Science 2025-11-18 Yue Huang, Siyuan Wu, Chujie Gao, Dongping Chen, Qihui Zhang, Yao Wan, Tianyi Zhou, Jianfeng Gao, Chaowei Xiao, Lichao Sun, Xiangliang Zhang

Large language models (LLMs) are increasingly used to generate executable outputs, JSON objects, and API calls, where a single syntax error can make the output unusable. Constrained decoding enforces validity token-by-token via masking and…

Computation and Language · Computer Science 2026-03-05 Avinash Reddy, Thayne T. Walker, James S. Ide, Amrit Singh Bedi

The understanding of large-scale scientific software poses significant challenges due to its diverse codebase, extensive code length, and target computing architectures. The emergence of generative AI, specifically large language models…

Software Engineering · Computer Science 2024-03-19 Kareem Shaik, Dali Wang, Weijian Zheng, Qinglei Cao, Heng Fan, Peter Schwartz, Yunhe Feng

Large language models (LLMs) perform strongly on general-purpose code generation, yet their applicability to enterprise domain-specific languages (DSLs) remains underexplored, especially for repository-scale change generation spanning…

Software Engineering · Computer Science 2026-04-28 Sivajeet Chand, Kevin Nguyen, Peter Kuntz, Alexander Pretschner

Large language models (LLMs) have brought significant advancements to code generation, benefiting both novice and experienced developers. However, their training using unsanitized data from open-source repositories, like GitHub, introduces…

Software Engineering · Computer Science 2023-10-26 Jiexin Wang, Liuwen Cao, Xitong Luo, Zhiping Zhou, Jiayuan Xie, Adam Jatowt, Yi Cai

Containerization allows developers to define the execution environment in which their software needs to be installed. Docker is the leading platform in this field, and developers that use it are required to write a Dockerfile for their…

Software Engineering · Computer Science 2023-03-29 Giovanni Rosa, Antonio Mastropaolo, Simone Scalabrino, Gabriele Bavota, Rocco Oliveto

Large language models (LLMs) have achieved notable success in code generation. However, they still frequently produce uncompilable output because their next-token inference procedure does not model formal aspects of code. Although…

Machine Learning · Computer Science 2025-05-09 Niels Mündler, Jingxuan He, Hao Wang, Koushik Sen, Dawn Song, Martin Vechev

Domain-specific languages (DSLs) play an increasingly important role in the generation of high performing software. They allow the user to exploit specific knowledge encoded in the constructs for the generation of code adapted to a…

Mathematical Software · Computer Science 2019-05-08 Pramod Kumbhar, Omar Awile, Liam Keegan, Jorge Blanco Alonso, James King, Michael Hines, Felix Schürmann

Large Language Models (LLMs) have revolutionised the field of Natural Language Processing (NLP) and have achieved state-of-the-art performance in practically every task in this field. However, the prevalent approach used in text generation,…

Computation and Language · Computer Science 2024-08-12 Nicolo Micheletti, Samuel Belkadi, Lifeng Han, Goran Nenadic

Large Language Models (LLMs) have shown remarkable capabilities in code generation tasks, yet they face significant limitations in handling complex, long-context programming challenges and demonstrating complex compositional reasoning…

Artificial Intelligence · Computer Science 2025-01-14 Amr Almorsi, Mohanned Ahmed, Walid Gomaa