English
Related papers

Related papers: Code Representation Learning with Pr\"ufer Sequenc…

200 papers

Programming language understanding and representation (a.k.a code representation learning) has always been a hot and challenging task in software engineering. It aims to apply deep learning techniques to produce numerical representations of…

Software Engineering · Computer Science 2023-12-04 Weisong Sun , Chunrong Fang , Yun Miao , Yudu You , Mengzhe Yuan , Yuchen Chen , Quanjun Zhang , An Guo , Xiang Chen , Yang Liu , Zhenyu Chen

Learning vector representations for programs is a critical step in applying deep learning techniques for program understanding tasks. Various neural network models are proposed to learn from tree-structured program representations, e.g.,…

Software Engineering · Computer Science 2023-01-10 Wenhan Wang , Kechi Zhang , Ge Li , Shangqing Liu , Anran Li , Zhi Jin , Yang Liu

Code summarization aims to generate concise natural language descriptions of source code, which can help improve program comprehension and maintenance. Recent studies show that syntactic and structural information extracted from abstract…

Software Engineering · Computer Science 2021-12-01 Ensheng Shi , Yanlin Wang , Lun Du , Hongyu Zhang , Shi Han , Dongmei Zhang , Hongbin Sun

Program classification can be regarded as a high-level abstraction of code, laying a foundation for various tasks related to source code comprehension, and has a very wide range of applications in the field of software engineering, such as…

Software Engineering · Computer Science 2022-05-03 Kesu Wang , Meng Yan , He Zhang , Haibo Hu

Current language models tailored for code tasks often adopt the pre-training-then-fine-tuning paradigm from natural language processing, modeling source code as plain text. This approach, however, overlooks the unambiguous structures…

Computation and Language · Computer Science 2024-01-22 Mayank Agarwal , Yikang Shen , Bailin Wang , Yoon Kim , Jie Chen

Code summarization aims to generate brief natural language descriptions for source code. As source code is highly structured and follows strict programming language grammars, its Abstract Syntax Tree (AST) is often leveraged to inform the…

Computation and Language · Computer Science 2021-12-03 Ze Tang , Chuanyi Li , Jidong Ge , Xiaoyu Shen , Zheling Zhu , Bin Luo

The application of deep learning techniques in software engineering becomes increasingly popular. One key problem is developing high-quality and easy-to-use source code representations for code-related tasks. The research community has…

Software Engineering · Computer Science 2023-11-07 Zhiwei Xu , Min Zhou , Xibin Zhao , Yang Chen , Xi Cheng , Hongyu Zhang

Program representation learning is a fundamental task in software engineering applications. With the availability of "big code" and the development of deep learning techniques, various program representation learning models have been…

Software Engineering · Computer Science 2021-09-17 Siqi Han , DongXia Wang , Wanting Li , Xuesong Lu

Program source code contains complex structure information, which can be represented in structured data forms like trees or graphs. To acquire the structural information in source code, most existing researches use abstract syntax trees…

Software Engineering · Computer Science 2022-04-13 Kechi Zhang , Wenhan Wang , Huangzhao Zhang , Ge Li , Zhi Jin

Performance analysis has always been an afterthought during the application development process, focusing on application correctness first. The learning curve of the existing static and dynamic analysis tools are steep, which requires…

Machine Learning · Computer Science 2021-04-23 Nathan Pinnow , Tarek Ramadan , Tanzima Z. Islam , Chase Phelps , Jayaraman J. Thiagarajan

Predicting program properties such as names or expression types has a wide range of applications. It can ease the task of programming and increase programmer productivity. A major challenge when learning from programs is $\textit{how to…

Programming Languages · Computer Science 2018-04-24 Uri Alon , Meital Zilberstein , Omer Levy , Eran Yahav

The objective of pre-trained language models is to learn contextual representations of textual data. Pre-trained language models have become mainstream in natural language processing and code modeling. Using probes, a technique to study the…

Computation and Language · Computer Science 2022-09-13 José Antonio Hernández López , Martin Weyssow , Jesús Sánchez Cuadrado , Houari Sahraoui

Learning representation for source code is a foundation of many program analysis tasks. In recent years, neural networks have already shown success in this area, but most existing models did not make full use of the unique structural…

Software Engineering · Computer Science 2021-04-02 Wenhan Wang , Ge Li , Sijie Shen , Xin Xia , Zhi Jin

Transformer-based models have demonstrated significant success in various source code representation tasks. Nonetheless, traditional positional embeddings employed by these models inadequately capture the hierarchical structure intrinsic to…

Machine Learning · Computer Science 2025-07-08 Patryk Bartkowiak , Filip Graliński

Source code (Context) and its parsed abstract syntax tree (AST; Structure) are two complementary representations of the same computer program. Traditionally, designers of machine learning models have relied predominantly either on Structure…

Machine Learning · Computer Science 2021-03-23 Daniel Zügner , Tobias Kirschstein , Michele Catasta , Jure Leskovec , Stephan Günnemann

Pre-trained models for programming language have achieved dramatic empirical improvements on a variety of code-related tasks such as code search, code completion, code summarization, etc. However, existing pre-trained models regard a code…

Deep learning techniques applied to program analysis tasks such as code classification, summarization, and bug detection have seen widespread interest. Traditional approaches, however, treat programming source code as natural language text,…

Software Engineering · Computer Science 2024-02-16 Xueting Guan , Christoph Treude

Efficiently representing source code is crucial for various software engineering tasks such as code classification and clone detection. Existing approaches primarily use Abstract Syntax Tree (AST), and only a few focus on semantic graphs…

Software Engineering · Computer Science 2023-12-27 Karthik Chandra Swarna , Noble Saji Mathews , Dheeraj Vagavolu , Sridhar Chimalakonda

Code representation learning, which aims to encode the semantics of source code into distributed vectors, plays an important role in recent deep-learning-based models for code intelligence. Recently, many pre-trained language models for…

Computation and Language · Computer Science 2021-09-10 Xin Wang , Yasheng Wang , Fei Mi , Pingyi Zhou , Yao Wan , Xiao Liu , Li Li , Hao Wu , Jin Liu , Xin Jiang

Source code summarization aims at generating concise and clear natural language descriptions for programming languages. Well-written code summaries are beneficial for programmers to participate in the software development and maintenance…

Computation and Language · Computer Science 2022-02-15 Zi Gong , Cuiyun Gao , Yasheng Wang , Wenchao Gu , Yun Peng , Zenglin Xu
‹ Prev 1 2 3 10 Next ›