Related papers: Tree Notation: an antifragile program notation

Unified Abstract Syntax Tree Representation Learning for Cross-Language Program Classification

Program classification can be regarded as a high-level abstraction of code, laying a foundation for various tasks related to source code comprehension, and has a very wide range of applications in the field of software engineering, such as…

Software Engineering · Computer Science 2022-05-03 Kesu Wang, Meng Yan, He Zhang, Haibo Hu

Adding Interactive Visual Syntax to Textual Code

Many programming problems call for turning geometrical thoughts into code: tables, hierarchical structures, nests of objects, trees, forests, graphs, and so on. Linear text does not do justice to such thoughts. But, it has been the dominant…

Programming Languages · Computer Science 2020-10-27 Leif Andersen, Michael Ballantyne, Matthias Felleisen

Bringing Structure to Naturalness: On the Naturalness of ASTs

Source code comes in different shapes and forms. Previous research has already shown code to be more predictable than natural language as well as highlighted its statistical predictability at the token level: source code can be natural.…

Software Engineering · Computer Science 2025-04-14 Profir-Petru Pârţachi, Mahito Sugiyama

Learning to Represent Programs with Heterogeneous Graphs

Program source code contains complex structure information, which can be represented in structured data forms like trees or graphs. To acquire the structural information in source code, most existing researches use abstract syntax trees…

Software Engineering · Computer Science 2022-04-13 Kechi Zhang, Wenhan Wang, Huangzhao Zhang, Ge Li, Zhi Jin

Modular Tree Network for Source Code Representation Learning

Learning representation for source code is a foundation of many program analysis tasks. In recent years, neural networks have already shown success in this area, but most existing models did not make full use of the unique structural…

Software Engineering · Computer Science 2021-04-02 Wenhan Wang, Ge Li, Sijie Shen, Xin Xia, Zhi Jin

xASTNN: Improved Code Representations for Industrial Practice

The application of deep learning techniques in software engineering becomes increasingly popular. One key problem is developing high-quality and easy-to-use source code representations for code-related tasks. The research community has…

Software Engineering · Computer Science 2023-11-07 Zhiwei Xu, Min Zhou, Xibin Zhao, Yang Chen, Xi Cheng, Hongyu Zhang

AST-Transformer: Encoding Abstract Syntax Trees Efficiently for Code Summarization

Code summarization aims to generate brief natural language descriptions for source code. As source code is highly structured and follows strict programming language grammars, its Abstract Syntax Tree (AST) is often leveraged to inform the…

Computation and Language · Computer Science 2021-12-03 Ze Tang, Chuanyi Li, Jidong Ge, Xiaoyu Shen, Zheling Zhu, Bin Luo

The tree machine

A variant of Turing machines is introduced where the tape is replaced by a single tree which can be manipulated in a style akin to purely functional programming. This yields two benefits: first, the extra structure on the tape can be…

Logic in Computer Science · Computer Science 2015-07-17 Arnaud Spiwack

Exploring Software Naturalness through Neural Language Models

The Software Naturalness hypothesis argues that programming languages can be understood through the same techniques used in natural language processing. We explore this hypothesis through the use of a pre-trained transformer-based language…

Computation and Language · Computer Science 2020-06-25 Luca Buratti, Saurabh Pujar, Mihaela Bornea, Scott McCarley, Yunhui Zheng, Gaetano Rossiello, Alessandro Morari, Jim Laredo, Veronika Thost, Yufan Zhuang, Giacomo Domeniconi

Program structure

A program is usually represented as a word chain. It is exactly a word chain that appears as the lexical analyzer output and is parsed. The work shows that a program can be syntactically represented as an oriented word tree, that is a…

Programming Languages · Computer Science 2012-03-23 Alex Shkotin

Abstract Syntax Tree for Programming Language Understanding and Representation: How Far Are We?

Programming language understanding and representation (a.k.a. code representation learning) has always been a hot and challenging task in software engineering. It aims to apply deep learning techniques to produce numerical representations of…

Software Engineering · Computer Science 2023-12-04 Weisong Sun, Chunrong Fang, Yun Miao, Yudu You, Mengzhe Yuan, Yuchen Chen, Quanjun Zhang, An Guo, Xiang Chen, Yang Liu, Zhenyu Chen

Varieties of Unranked Tree Languages

We study varieties that contain unranked tree languages over all alphabets. Trees are labeled with symbols from two alphabets, an unranked operator alphabet and an alphabet used for leaves only. Syntactic algebras of unranked tree languages…

Formal Languages and Automata Theory · Computer Science 2015-10-27 Magnus Steinby, Eija Jurvanen, Antonio Cano

Convolutional Neural Networks over Tree Structures for Programming Language Processing

Programming language processing (similar to natural language processing) is a hot research topic in the field of software engineering; it has also aroused growing interest in the artificial intelligence community. However, different from a…

Machine Learning · Computer Science 2015-12-09 Lili Mou, Ge Li, Lu Zhang, Tao Wang, Zhi Jin

On Tree-Based Neural Sentence Modeling

Neural networks with tree-based sentence encoders have shown better results on many downstream tasks. Most of existing tree-based encoders adopt syntactic parsing trees as the explicit structure prior. To study the effectiveness of…

Computation and Language · Computer Science 2018-08-30 Haoyue Shi, Hao Zhou, Jiaze Chen, Lei Li

User-friendly explanations for constraint programming

In this paper, we introduce a set of tools for providing user-friendly explanations in an explanation-based constraint programming system. The idea is to represent the constraints of a problem as a hierarchy (a tree). Users are then…

Programming Languages · Computer Science 2007-05-23 Narendra Jussien, Samir Ouis

A General Path-Based Representation for Predicting Program Properties

Predicting program properties such as names or expression types has a wide range of applications. It can ease the task of programming and increase programmer productivity. A major challenge when learning from programs is how to…

Programming Languages · Computer Science 2018-04-24 Uri Alon, Meital Zilberstein, Omer Levy, Eran Yahav

TreeBERT: A Tree-Based Pre-Trained Model for Programming Language

Source code can be parsed into the abstract syntax tree (AST) based on defined syntax rules. However, in pre-training, little work has considered the incorporation of tree structure into the learning process. In this paper, we present…

Machine Learning · Computer Science 2021-07-16 Xue Jiang, Zhuoran Zheng, Chen Lyu, Liang Li, Lei Lyu

Towards an extension of Fault Trees in the Predictive Maintenance Scenario

One of the most appreciated features of Fault Trees (FTs) is their simplicity, making them fit into industrial processes. As such processes evolve in time, considering new aspects of large modern systems, modelling techniques based on FTs…

Machine Learning · Computer Science 2024-03-21 Roberta De Fazio, Stefano Marrone, Laura Verde, Vincenzo Reccia, Paolo Valletta

CAST: Enhancing Code Summarization with Hierarchical Splitting and Reconstruction of Abstract Syntax Trees

Code summarization aims to generate concise natural language descriptions of source code, which can help improve program comprehension and maintenance. Recent studies show that syntactic and structural information extracted from abstract…

Software Engineering · Computer Science 2021-12-01 Ensheng Shi, Yanlin Wang, Lun Du, Hongyu Zhang, Shi Han, Dongmei Zhang, Hongbin Sun

TreeCaps: Tree-Structured Capsule Networks for Program Source Code Processing

Program comprehension is a fundamental task in software development and maintenance processes. Software developers often need to understand a large amount of existing code before they can develop new features or fix bugs in existing…

Machine Learning · Computer Science 2019-10-29 Vinoj Jayasundara, Nghi Duy Quoc Bui, Lingxiao Jiang, David Lo