
Related papers: Learning Program Semantics with Code Representatio…

200 papers

Program representation learning is a fundamental task in software engineering applications. With the availability of "big code" and the development of deep learning techniques, various program representation learning models have been…

Software Engineering · Computer Science 2021-09-17 Siqi Han, DongXia Wang, Wanting Li, Xuesong Lu

Machine learning techniques for cybersecurity-related software engineering tasks are becoming increasingly popular. The representation of source code is a key portion of the technique that can impact the way the model is able to learn the…

Machine Learning · Computer Science 2025-04-10 Beatrice Casey, Joanna C. S. Santos, George Perry

Efficiently representing source code is crucial for various software engineering tasks such as code classification and clone detection. Existing approaches primarily use Abstract Syntax Tree (AST), and only a few focus on semantic graphs…

Software Engineering · Computer Science 2023-12-27 Karthik Chandra Swarna, Noble Saji Mathews, Dheeraj Vagavolu, Sridhar Chimalakonda

Software comprehension can be extremely time-consuming due to the ever-growing size of codebases. Consequently, there is an increasing need to accelerate the code comprehension process to facilitate maintenance and reduce associated costs.…

Software Engineering · Computer Science 2024-01-15 Krzysztof Borowski, Bartosz Baliś, Tomasz Orzechowski

Learning from source code usually requires a large amount of labeled data. Despite the possible scarcity of labeled data, the trained model is highly task-specific and lacks transferability to different tasks. In this work, we present…

Machine Learning · Computer Science 2021-03-05 Linfeng Liu, Hoan Nguyen, George Karypis, Srinivasan Sengamedu

Code clones are pairs of code snippets that implement similar functionality. Clone detection is a fundamental branch of automatic source code comprehension, having many applications in refactoring recommendation, plagiarism detection, and…

Software Engineering · Computer Science 2022-06-20 Maksim Zubkov, Egor Spirin, Egor Bogomolov, Timofey Bryksin

With the recent success of embeddings in natural language processing, research has been conducted into applying similar methods to code analysis. Most works attempt to process the code directly or use a syntactic tree representation,…

Machine Learning · Computer Science 2018-11-30 Tal Ben-Nun, Alice Shoshana Jakobovits, Torsten Hoefler

The dominant paradigm for semantic parsing in recent years is to formulate parsing as a sequence-to-sequence task, generating predictions with auto-regressive sequence decoders. In this work, we explore an alternative paradigm. We formulate…

Computation and Language · Computer Science 2023-03-24 Jeremy R. Cole, Nanjiang Jiang, Panupong Pasupat, Luheng He, Peter Shaw

Program comprehension is a fundamental task in software development and maintenance processes. Software developers often need to understand a large amount of existing code before they can develop new features or fix bugs in existing…

Machine Learning · Computer Science 2019-10-29 Vinoj Jayasundara, Nghi Duy Quoc Bui, Lingxiao Jiang, David Lo

Programming language understanding and representation (a.k.a. code representation learning) has always been a hot and challenging task in software engineering. It aims to apply deep learning techniques to produce numerical representations of…

Software Engineering · Computer Science 2023-12-04 Weisong Sun, Chunrong Fang, Yun Miao, Yudu You, Mengzhe Yuan, Yuchen Chen, Quanjun Zhang, An Guo, Xiang Chen, Yang Liu, Zhenyu Chen

Recent successes in training word embeddings for NLP tasks have encouraged a wave of research on representation learning for source code, which builds on similar NLP methods. The overall objective is then to produce code embeddings that…

Software Engineering · Computer Science 2020-02-10 Patrick Keller, Laura Plein, Tegawendé F. Bissyandé, Jacques Klein, Yves Le Traon

Neural program embeddings have shown much promise recently for a variety of program analysis tasks, including program synthesis, program repair, fault localization, etc. However, most existing program embeddings are based on syntactic…

Artificial Intelligence · Computer Science 2018-07-03 Ke Wang, Rishabh Singh, Zhendong Su

Neural approaches to program synthesis and understanding have proliferated widely in the last few years; at the same time graph based neural networks have become a promising new tool. This work aims to be the first empirical study comparing…

Software Engineering · Computer Science 2020-01-28 Austin P. Wright, Herbert Wiklicky

Program source code contains complex structural information, which can be represented in structured data forms like trees or graphs. To acquire the structural information in source code, most existing research uses abstract syntax trees…

Software Engineering · Computer Science 2022-04-13 Kechi Zhang, Wenhan Wang, Huangzhao Zhang, Ge Li, Zhi Jin

Predicting program properties such as names or expression types has a wide range of applications. It can ease the task of programming and increase programmer productivity. A major challenge when learning from programs is $\textit{how to…

Programming Languages · Computer Science 2018-04-24 Uri Alon, Meital Zilberstein, Omer Levy, Eran Yahav

Semantic clones are program components with similar behavior, but different textual representation. Semantic similarity is hard to detect, and semantic clone detection is still an open issue. We present semantic clone detection via…

Software Engineering · Computer Science 2020-01-22 Hannes Thaller, Lukas Linsbauer, Alexander Egyed

Modern software systems are developed in diverse programming languages and often harbor critical vulnerabilities that attackers can exploit to compromise security. These vulnerabilities have been actively targeted in real-world attacks,…

Cryptography and Security · Computer Science 2025-03-27 Zhuoyun Qian, Fangtian Zhong, Qin Hu, Yili Jiang, Jiaqi Huang, Mengfei Ren, Jiguo Yu

Recently, program learning techniques have been proposed to process source code based on syntactical structures (e.g., Abstract Syntax Trees) and/or semantic information (e.g., Dependency Graphs). Although graphs may be better at capturing…

Software Engineering · Computer Science 2020-12-15 Nghi D. Q. Bui, Yijun Yu, Lingxiao Jiang

We present evidence that language models (LMs) of code can learn to represent the formal semantics of programs, despite being trained only to perform next-token prediction. Specifically, we train a Transformer model on a synthetic corpus of…

Machine Learning · Computer Science 2024-08-06 Charles Jin, Martin Rinard

The remarkable growth and significant success of machine learning have expanded its applications into programming languages and program analysis. However, a key challenge in adopting the latest machine learning methods is the representation…

Programming Languages · Computer Science 2023-12-01 Ali TehraniJamsaz, Quazi Ishtiaque Mahmud, Le Chen, Nesreen K. Ahmed, Ali Jannesari