English
Related papers

Related papers: Seamlessly Integrating Tree-Based Positional Embed…

200 papers

Source code can be parsed into the abstract syntax tree (AST) based on defined syntax rules. However, in pre-training, little work has considered the incorporation of tree structure into the learning process. In this paper, we present…

Machine Learning · Computer Science 2021-07-16 Xue Jiang , Zhuoran Zheng , Chen Lyu , Liang Li , Lei Lyu

Source code representation with deep learning techniques is an important research field. There have been many studies that learn sequential or structural information for code representation. But sequence-based models and non-sequence-models…

Software Engineering · Computer Science 2023-03-15 Kechi Zhang , Zhuo Li , Zhi Jin , Ge Li

While a considerable amount of semantic parsing approaches have employed RNN architectures for code generation tasks, there have been only few attempts to investigate the applicability of Transformers for this task. Including hierarchical…

Computation and Language · Computer Science 2022-06-28 Klaudia-Doris Thellmann , Bernhard Stadler , Ricardo Usbeck , Jens Lehmann

Many common sequential data sources, such as source code and natural language, have a natural tree-structured representation. These trees can be generated by fitting a sequence to a grammar, yielding a hierarchical ordering of the tokens in…

Machine Learning · Computer Science 2019-08-02 Jacob Harer , Chris Reale , Peter Chin

Incorporating hierarchical structures like constituency trees has been shown to be effective for various natural language processing (NLP) tasks. However, it is evident that state-of-the-art (SOTA) sequence-based models like the Transformer…

Machine Learning · Computer Science 2020-02-20 Xuan-Phi Nguyen , Shafiq Joty , Steven C. H. Hoi , Richard Socher

Learning vector representations for programs is a critical step in applying deep learning techniques for program understanding tasks. Various neural network models are proposed to learn from tree-structured program representations, e.g.,…

Software Engineering · Computer Science 2023-01-10 Wenhan Wang , Kechi Zhang , Ge Li , Shangqing Liu , Anran Li , Zhi Jin , Yang Liu

An effective and efficient encoding of the source code of a computer program is critical to the success of sequence-to-sequence deep neural network models for tasks in computer program comprehension, such as automated code summarization and…

Artificial Intelligence · Computer Science 2021-11-16 Tenzin Jinpa , Yong Gao

Deep neural networks for machine comprehension typically utilizes only word or character embeddings without explicitly taking advantage of structured linguistic information such as constituency trees and dependency trees. In this paper, we…

Computation and Language · Computer Science 2017-09-04 Rui Liu , Junjie Hu , Wei Wei , Zi Yang , Eric Nyberg

Programming language understanding and representation (a.k.a code representation learning) has always been a hot and challenging task in software engineering. It aims to apply deep learning techniques to produce numerical representations of…

Software Engineering · Computer Science 2023-12-04 Weisong Sun , Chunrong Fang , Yun Miao , Yudu You , Mengzhe Yuan , Yuchen Chen , Quanjun Zhang , An Guo , Xiang Chen , Yang Liu , Zhenyu Chen

We propose a method to create document representations that reflect their internal structure. We modify Tree-LSTMs to hierarchically merge basic elements such as words and sentences into blocks of increasing complexity. Our Structure…

Computation and Language · Computer Science 2019-10-08 Khalil Mrini , Claudiu Musat , Michael Baeriswyl , Martin Jaggi

Current language models tailored for code tasks often adopt the pre-training-then-fine-tuning paradigm from natural language processing, modeling source code as plain text. This approach, however, overlooks the unambiguous structures…

Computation and Language · Computer Science 2024-01-22 Mayank Agarwal , Yikang Shen , Bailin Wang , Yoon Kim , Jie Chen

The current state-of-the-art task-oriented semantic parsing models use BERT or RoBERTa as pretrained encoders; these models have huge memory footprints. This poses a challenge to their deployment for voice assistants such as Amazon Alexa…

Computation and Language · Computer Science 2020-10-13 Prafull Prakash , Saurabh Kumar Shashidhar , Wenlong Zhao , Subendhu Rongali , Haidar Khan , Michael Kayser

Pre-trained transformer models shine in many natural language processing tasks and therefore are expected to bear the representation of the input sentence or text meaning. These sentence-level embeddings are also important in…

Computation and Language · Computer Science 2025-02-21 Lukas Stankevičius , Mantas Lukoševičius

Code summarization aims to generate concise natural language descriptions of source code, which can help improve program comprehension and maintenance. Recent studies show that syntactic and structural information extracted from abstract…

Software Engineering · Computer Science 2021-12-01 Ensheng Shi , Yanlin Wang , Lun Du , Hongyu Zhang , Shi Han , Dongmei Zhang , Hongbin Sun

Understanding the learning process and the embedded computation in transformers is becoming a central goal for the development of interpretable AI. In the present study, we introduce a hierarchical filtering procedure for data models of…

Machine Learning · Computer Science 2025-06-11 Jerome Garnier-Brun , Marc Mézard , Emanuele Moscato , Luca Saglietti

Code summarization aims to generate brief natural language descriptions for source code. As source code is highly structured and follows strict programming language grammars, its Abstract Syntax Tree (AST) is often leveraged to inform the…

Computation and Language · Computer Science 2021-12-03 Ze Tang , Chuanyi Li , Jidong Ge , Xiaoyu Shen , Zheling Zhu , Bin Luo

Performance analysis has always been an afterthought during the application development process, focusing on application correctness first. The learning curve of the existing static and dynamic analysis tools are steep, which requires…

Machine Learning · Computer Science 2021-04-23 Nathan Pinnow , Tarek Ramadan , Tanzima Z. Islam , Chase Phelps , Jayaraman J. Thiagarajan

Recently, many pre-trained language models for source code have been proposed to model the context of code and serve as a basis for downstream code intelligence tasks such as code completion, code search, and code summarization. These…

Software Engineering · Computer Science 2022-02-15 Yao Wan , Wei Zhao , Hongyu Zhang , Yulei Sui , Guandong Xu , Hai Jin

We present a modular approach to building cascade speech translation (AST) models that guarantees that the resulting model performs no worse than the 1-best cascade baseline while preserving state-of-the-art speech recognition (ASR) and…

Computation and Language · Computer Science 2024-07-26 Ciprian Chelba , Johan Schalkwyk

Artificial intelligence (AI) has revolutionized software engineering (SE) by enhancing software development efficiency. The advent of pre-trained models (PTMs) leveraging transfer learning has significantly advanced AI for SE. However,…

Software Engineering · Computer Science 2024-04-25 Zixiang Xian , Rubing Huang , Dave Towey , Chunrong Fang , Zhenyu Chen
‹ Prev 1 2 3 10 Next ›