Related papers: StructCoder: Structure-Aware Transformer for Code …

ASTormer: An AST Structure-aware Transformer Decoder for Text-to-SQL

Text-to-SQL aims to generate an executable SQL program given the user utterance and the corresponding database schema. To ensure the well-formedness of output SQLs, one prominent approach adopts a grammar-based recurrent decoder to produce…

Computation and Language · Computer Science · 2023-10-31 · Ruisheng Cao, Hanchong Zhang, Hongshen Xu, Jieyu Li, Da Ma, Lu Chen, Kai Yu

SPT-Code: Sequence-to-Sequence Pre-Training for Learning Source Code Representations

Recent years have seen the successful application of large pre-trained models to code representation learning, resulting in substantial improvements on many code-related downstream tasks. But there are issues surrounding their application…

Software Engineering · Computer Science · 2022-05-26 · Changan Niu, Chuanyi Li, Vincent Ng, Jidong Ge, Liguo Huang, Bin Luo

AST-T5: Structure-Aware Pretraining for Code Generation and Understanding

Large language models (LLMs) have made significant advancements in code-related tasks, yet many LLMs treat code as simple sequences, neglecting its structured nature. We introduce AST-T5, a novel pretraining paradigm that leverages the…

Software Engineering · Computer Science · 2024-06-25 · Linyuan Gong, Mostafa Elhoushi, Alvin Cheung

TransformCode: A Contrastive Learning Framework for Code Embedding via Subtree Transformation

Artificial intelligence (AI) has revolutionized software engineering (SE) by enhancing software development efficiency. The advent of pre-trained models (PTMs) leveraging transfer learning has significantly advanced AI for SE. However,…

Software Engineering · Computer Science · 2024-04-25 · Zixiang Xian, Rubing Huang, Dave Towey, Chunrong Fang, Zhenyu Chen

TransCoder: A Neural-Enhancement Framework for Channel Codes

Reliable communication over noisy channels requires the design of specialized error-correcting codes (ECCs) tailored to specific system requirements. Recently, neural network-based decoders have emerged as promising tools for enhancing ECC…

Information Theory · Computer Science · 2025-12-01 · Anastasiia Kurmukova, Selim F. Yilmaz, Emre Ozfatura, Deniz Gunduz

Improving Tree-Structured Decoder Training for Code Generation via Mutual Learning

Code generation aims to automatically generate a piece of code given an input natural language utterance. Currently, among dominant models, it is treated as a sequence-to-tree task, where a decoder outputs a sequence of actions…

Artificial Intelligence · Computer Science · 2021-06-01 · Binbin Xie, Jinsong Su, Yubin Ge, Xiang Li, Jianwei Cui, Junfeng Yao, Bin Wang

TreeCoder: Systematic Exploration and Optimisation of Decoding and Constraints for LLM Code Generation

Large language models (LLMs) have shown remarkable ability to generate code, yet their outputs often violate syntactic or semantic constraints when guided only through natural language prompts. We introduce TreeCoder, the most general and…

Machine Learning · Computer Science · 2026-04-27 · Henrijs Princis, Arindam Sharma, Cristina David

A Survey of Deep Learning Models for Structural Code Understanding

In recent years, the rise of deep learning and automation requirements in the software industry has elevated Intelligent Software Engineering to new heights. The number of approaches and applications in code understanding is growing, with…

Software Engineering · Computer Science · 2022-05-04 · Ruoting Wu, Yuxin Zhang, Qibiao Peng, Liang Chen, Zibin Zheng

CodeTrans: Towards Cracking the Language of Silicon's Code Through Self-Supervised Deep Learning and High Performance Computing

Currently, a growing number of mature natural language processing applications make people's lives more convenient. Such applications are built from source code, the language of software engineering. However, the applications for…

Software Engineering · Computer Science · 2021-05-13 · Ahmed Elnaggar, Wei Ding, Llion Jones, Tom Gibbs, Tamas Feher, Christoph Angerer, Silvia Severini, Florian Matthes, Burkhard Rost

Fine-grained Pseudo-code Generation Method via Code Feature Extraction and Transformer

Pseudo-code written in natural language helps novice developers comprehend programs. However, writing such pseudo-code is time-consuming and laborious. Motivated by the research advancements of sequence-to-sequence learning and…

Software Engineering · Computer Science · 2021-09-22 · Guang Yang, Yanlin Zhou, Xiang Chen, Chi Yu

Transformer with Tree-order Encoding for Neural Program Generation

While a considerable number of semantic parsing approaches have employed RNN architectures for code generation tasks, there have been only a few attempts to investigate the applicability of Transformers for this task. Including hierarchical…

Computation and Language · Computer Science · 2022-06-28 · Klaudia-Doris Thellmann, Bernhard Stadler, Ricardo Usbeck, Jens Lehmann

EgoCoder: Intelligent Program Synthesis with Hierarchical Sequential Neural Network Model

Programming has been an important skill for researchers and practitioners in computer science and other related areas. To learn basic programming skills, beginners usually require long-term systematic training. According to a…

Artificial Intelligence · Computer Science · 2018-05-23 · Jiawei Zhang, Limeng Cui, Fisher B. Gouza

TreeGen: A Tree-Based Transformer Architecture for Code Generation

A code generation system generates programming language code based on an input natural language description. State-of-the-art approaches rely on neural networks for code generation. However, these code generators suffer from two problems.…

Machine Learning · Computer Science · 2019-12-02 · Zeyu Sun, Qihao Zhu, Yingfei Xiong, Yican Sun, Lili Mou, Lu Zhang

StrucTexT: Structured Text Understanding with Multi-Modal Transformers

Structured text understanding on Visually Rich Documents (VRDs) is a crucial part of Document Intelligence. Due to the complexity of content and layout in VRDs, structured text understanding has been a challenging task. Most existing…

Computer Vision and Pattern Recognition · Computer Science · 2021-11-09 · Yulin Li, Yuxi Qian, Yuchen Yu, Xiameng Qin, Chengquan Zhang, Yan Liu, Kun Yao, Junyu Han, Jingtuo Liu, Errui Ding

Tree-Transformer: A Transformer-Based Method for Correction of Tree-Structured Data

Many common sequential data sources, such as source code and natural language, have a natural tree-structured representation. These trees can be generated by fitting a sequence to a grammar, yielding a hierarchical ordering of the tokens in…

Machine Learning · Computer Science · 2019-08-02 · Jacob Harer, Chris Reale, Peter Chin

Planning with Large Language Models for Code Generation

Existing large language model-based code generation pipelines typically use beam search or sampling algorithms during the decoding process. Although the programs they generate achieve high token-matching-based scores, they often fail to…

Machine Learning · Computer Science · 2023-03-10 · Shun Zhang, Zhenfang Chen, Yikang Shen, Mingyu Ding, Joshua B. Tenenbaum, Chuang Gan

What Do They Capture? -- A Structural Analysis of Pre-Trained Language Models for Source Code

Recently, many pre-trained language models for source code have been proposed to model the context of code and serve as a basis for downstream code intelligence tasks such as code completion, code search, and code summarization. These…

Software Engineering · Computer Science · 2022-02-15 · Yao Wan, Wei Zhao, Hongyu Zhang, Yulei Sui, Guandong Xu, Hai Jin

AST-Transformer: Encoding Abstract Syntax Trees Efficiently for Code Summarization

Code summarization aims to generate brief natural language descriptions for source code. As source code is highly structured and follows strict programming language grammars, its Abstract Syntax Tree (AST) is often leveraged to inform the…

Computation and Language · Computer Science · 2021-12-03 · Ze Tang, Chuanyi Li, Jidong Ge, Xiaoyu Shen, Zheling Zhu, Bin Luo

Code Structure Guided Transformer for Source Code Summarization

Code summaries help developers comprehend programs and reduce their time to infer the program functionalities during software maintenance. Recent efforts resort to deep learning techniques such as sequence-to-sequence models for generating…

Computation and Language · Computer Science · 2023-02-08 · Shuzheng Gao, Cuiyun Gao, Yulan He, Jichuan Zeng, Lun Yiu Nie, Xin Xia, Michael R. Lyu

Deep Learning for Source Code Modeling and Generation: Models, Applications and Challenges

Deep Learning (DL) techniques for Natural Language Processing have been evolving remarkably fast. Recently, the DL advances in language modeling, machine translation and paragraph understanding are so prominent that the potential of DL in…

Software Engineering · Computer Science · 2020-06-16 · Triet H. M. Le, Hao Chen, M. Ali Babar