English
Related papers

Related papers: Structural Language Models of Code

200 papers

We address the problem of predicting edit completions based on a learned model that was trained on past edits. Given a code snippet that is partially edited, our goal is to predict a completion of the edit for the rest of the snippet. We…

Programming Languages · Computer Science 2020-10-13 Shaked Brody , Uri Alon , Eran Yahav

Language models such as RNN, LSTM or other variants have been widely used as generative models in natural language processing. In last few years, taking source code as natural languages, parsing source code into a token sequence and using a…

Software Engineering · Computer Science 2019-10-28 Yixiao Yang

Neural machine translation models are used to automatically generate a document from given source code since this can be regarded as a machine translation task. Source code summarization is one of the components for automatic document…

Machine Learning · Computer Science 2019-06-24 Yusuke Shido , Yasuaki Kobayashi , Akihiro Yamamoto , Atsushi Miyamoto , Tadayuki Matsumura

The recent advancements of Small Language Models (SLMs) have opened new possibilities for efficient code generation. SLMs offer lightweight and cost-effective alternatives to Large Language Models (LLMs), making them attractive for use in…

Software Engineering · Computer Science 2026-01-21 Md Mahade Hasan , Muhammad Waseem , Kai-Kristian Kemell , Jussi Rasku , Juha Ala-Rantala , Pekka Abrahamsson

Source code comes in different shapes and forms. Previous research has already shown code to be more predictable than natural language as well as highlighted its statistical predictability at the token level: source code can be natural.…

Software Engineering · Computer Science 2025-04-14 Profir-Petru Pârţachi , Mahito Sugiyama

Source code summarization -- creating natural language descriptions of source code behavior -- is a rapidly-growing research topic with applications to automatic documentation generation, program comprehension, and software maintenance.…

Software Engineering · Computer Science 2019-02-07 Alexander LeClair , Siyuan Jiang , Collin McMillan

Modern generative pre-trained language models excel at open-ended text generation, yet continue to underperform on structure-related tasks such as NER, relation extraction, and semantic role labeling, especially when compared to…

Computation and Language · Computer Science 2025-12-23 Minho Lee , Junghyun Min , Yerang Kim , Woochul Lee , Yeonsoo Lee

The ability to generate natural language sequences from source code snippets has a variety of applications such as code summarization, documentation, and retrieval. Sequence-to-sequence (seq2seq) models, adopted from neural machine…

Machine Learning · Computer Science 2019-02-22 Uri Alon , Shaked Brody , Omer Levy , Eran Yahav

Language models generate reasoning sequentially, preventing them from decoupling irrelevant exploration paths during search. We introduce Tree-Structured Language Modeling (TSLM), which uses special tokens to encode branching structure,…

Computation and Language · Computer Science 2026-02-02 Doyoung Kim , Jaehyeok Doo , Minjoon Seo

Proteins adopt multiple structural conformations to perform their diverse biological functions, and understanding these conformations is crucial for advancing drug discovery. Traditional physics-based simulation methods often struggle with…

Biomolecules · Quantitative Biology 2025-03-14 Jiarui Lu , Xiaoyin Chen , Stephen Zhewen Lu , Chence Shi , Hongyu Guo , Yoshua Bengio , Jian Tang

Program synthesis from natural language (NL) is practical for humans and, once technically feasible, would significantly facilitate software development and revolutionize end-user programming. We present SAPS, an end-to-end neural network…

Machine Learning · Computer Science 2019-02-19 Jakub Bednarek , Karol Piaskowski , Krzysztof Krawiec

In almost all text generation applications, word sequences are constructed in a left-to-right (L2R) or right-to-left (R2L) manner, as natural language sentences are written either L2R or R2L. However, we find that the natural language…

Computation and Language · Computer Science 2021-12-21 Yong Cao , Yukun Feng , Shaohui Kuang , Gu Xu

Large language models (LLMs) have shown remarkable ability to generate code, yet their outputs often violate syntactic or semantic constraints when guided only through natural language prompts. We introduce TreeCoder, the most general and…

Machine Learning · Computer Science 2026-04-27 Henrijs Princis , Arindam Sharma , Cristina David

Code translation migrates codebases across programming languages. Recently, large language models (LLMs) have achieved significant advancements in software mining. However, handling the syntactic structure of source code remains a…

Software Engineering · Computer Science 2025-10-14 Yali Du , Hui Sun , Ming Li

A code completion system suggests future code elements to developers given a partially-complete code snippet. Code completion is one of the most useful features in Integrated Development Environments (IDEs). Currently, most code completion…

Software Engineering · Computer Science 2020-09-21 Wenhan Wang , Sijie Shen , Ge Li , Zhi Jin

Assessing the stability of code generation from large language models (LLMs) is essential for judging their reliability in real-world development. We extend prior "structural-entropy concepts" to the program domain by pairing entropy with…

Software Engineering · Computer Science 2025-08-21 Yewei Song , Tiezhu Sun , Xunzhu Tang , Prateek Rajput , Tegawende F. Bissyande , Jacques Klein

Searching code is a common task that developers perform to understand APIs, learn common code patterns, and navigate code. Currently, developers most commonly search using keywords and regular expressions that are easy to use and widely…

Software Engineering · Computer Science 2025-07-04 Ben Limpanukorn , Yanjun Wang , Zach Patterson , Pranav Garg , Murali Krishna Ramanathan , Xiaofei Ma , Anoop Deoras , Miryung Kim

This work introduces (1) a technique that allows large language models (LLMs) to leverage user-provided code when solving programming tasks and (2) a method to iteratively generate modular sub-functions that can aid future code generation…

Machine Learning · Computer Science 2023-12-05 Patrick Hajali , Ignas Budvytis

Automated molecular structure elucidation remains challenging, as existing approaches often depend on pre-compiled databases or restrict themselves to single spectroscopic modalities. Here we introduce SpectraLLM, a large language model…

Quantitative Methods · Quantitative Biology 2026-05-12 Yunyue Su , Jiahui Chen , Zao Jiang , Zhenyi Zhong , Liang Wang , Qiang Liu , Zhaoxiang Zhang

In recent years, Large Language Models (LLMs) have achieved remarkable progress in automated code generation. In real-world software engineering, the growing demand for rapid iteration and continuous delivery underscores the importance of…

Software Engineering · Computer Science 2025-11-06 Qianhui Zhao , Li Zhang , Fang Liu , Junhang Cheng , Chengru Wu , Junchen Ai , Qiaoyuanhe Meng , Lichen Zhang , Xiaoli Lian , Shubin Song , Yuanping Guo
‹ Prev 1 2 3 10 Next ›