English
Related papers

Related papers: Structured Code Representations Enable Data-Effici…

200 papers

Recently, many pre-trained language models for source code have been proposed to model the context of code and serve as a basis for downstream code intelligence tasks such as code completion, code search, and code summarization. These…

Software Engineering · Computer Science 2022-02-15 Yao Wan , Wei Zhao , Hongyu Zhang , Yulei Sui , Guandong Xu , Hai Jin

Pre-trained language models have demonstrated impressive performance in both natural language processing and program understanding, which represent the input as a token sequence without explicitly modeling its structure. Some prior works…

Computation and Language · Computer Science 2022-10-27 Da Shen , Xinyun Chen , Chenguang Wang , Koushik Sen , Dawn Song

With the great success of pre-trained models, the pretrain-then-finetune paradigm has been widely adopted on downstream tasks for source code understanding. However, compared to costly training a large-scale model from scratch, how to…

Software Engineering · Computer Science 2022-03-16 Deze Wang , Zhouyang Jia , Shanshan Li , Yue Yu , Yun Xiong , Wei Dong , Xiangke Liao

Pre-trained models for programming language have achieved dramatic empirical improvements on a variety of code-related tasks such as code search, code completion, code summarization, etc. However, existing pre-trained models regard a code…

An effective and efficient encoding of the source code of a computer program is critical to the success of sequence-to-sequence deep neural network models for tasks in computer program comprehension, such as automated code summarization and…

Artificial Intelligence · Computer Science 2021-11-16 Tenzin Jinpa , Yong Gao

Programming language understanding and representation (a.k.a code representation learning) has always been a hot and challenging task in software engineering. It aims to apply deep learning techniques to produce numerical representations of…

Software Engineering · Computer Science 2023-12-04 Weisong Sun , Chunrong Fang , Yun Miao , Yudu You , Mengzhe Yuan , Yuchen Chen , Quanjun Zhang , An Guo , Xiang Chen , Yang Liu , Zhenyu Chen

Recently, fine-tuning pre-trained code models such as CodeBERT on downstream tasks has achieved great success in many software testing and analysis tasks. While effective and prevalent, fine-tuning the pre-trained parameters incurs a large…

Software Engineering · Computer Science 2023-04-12 Ensheng Shi , Yanlin Wang , Hongyu Zhang , Lun Du , Shi Han , Dongmei Zhang , Hongbin Sun

Performance analysis has always been an afterthought during the application development process, focusing on application correctness first. The learning curve of the existing static and dynamic analysis tools are steep, which requires…

Machine Learning · Computer Science 2021-04-23 Nathan Pinnow , Tarek Ramadan , Tanzima Z. Islam , Chase Phelps , Jayaraman J. Thiagarajan

Large pre-trained language models have recently been expanded and applied to programming language tasks with great success, often through further pre-training of a strictly-natural language model--where training sequences typically contain…

Computation and Language · Computer Science 2024-02-13 Fenia Christopoulou , Guchun Zhang , Gerasimos Lampouras

Large language models (LLMs) for natural language processing have been grafted onto programming language modeling for advancing code intelligence. Although it can be represented in the text format, code is syntactically more rigorous in…

Software Engineering · Computer Science 2023-09-20 Jiabo Huang , Jianyu Zhao , Yuyang Rong , Yiwen Guo , Yifeng He , Hao Chen

Large pre-trained language models have been used to generate code,providing a flexible interface for synthesizing programs from natural language specifications. However, they often violate syntactic and semantic rules of their output…

Machine Learning · Computer Science 2022-01-28 Gabriel Poesia , Oleksandr Polozov , Vu Le , Ashish Tiwari , Gustavo Soares , Christopher Meek , Sumit Gulwani

Pre-trained language models are effective in a variety of natural language tasks, but it has been argued their capabilities fall short of fully learning meaning or understanding language. To understand the extent to which language models…

Software Engineering · Computer Science 2024-02-29 Toufique Ahmed , Dian Yu , Chengxuan Huang , Cathy Wang , Prem Devanbu , Kenji Sagae

The objective of pre-trained language models is to learn contextual representations of textual data. Pre-trained language models have become mainstream in natural language processing and code modeling. Using probes, a technique to study the…

Computation and Language · Computer Science 2022-09-13 José Antonio Hernández López , Martin Weyssow , Jesús Sánchez Cuadrado , Houari Sahraoui

Source code can be parsed into the abstract syntax tree (AST) based on defined syntax rules. However, in pre-training, little work has considered the incorporation of tree structure into the learning process. In this paper, we present…

Machine Learning · Computer Science 2021-07-16 Xue Jiang , Zhuoran Zheng , Chen Lyu , Liang Li , Lei Lyu

Source code (Context) and its parsed abstract syntax tree (AST; Structure) are two complementary representations of the same computer program. Traditionally, designers of machine learning models have relied predominantly either on Structure…

Machine Learning · Computer Science 2021-03-23 Daniel Zügner , Tobias Kirschstein , Michele Catasta , Jure Leskovec , Stephan Günnemann

Code execution is a fundamental aspect of programming language semantics that reflects the exact behavior of the code. However, most pre-trained models for code intelligence ignore the execution trace and only rely on source code and…

Programming Languages · Computer Science 2023-05-10 Chenxiao Liu , Shuai Lu , Weizhu Chen , Daxin Jiang , Alexey Svyatkovskiy , Shengyu Fu , Neel Sundaresan , Nan Duan

Software is constantly changing, requiring developers to perform several derived tasks in a timely manner, such as writing a description for the intention of the code change, or identifying the defect-prone code changes. Considering that…

Software Engineering · Computer Science 2023-05-19 Bo Lin , Shangwen Wang , Zhongxin Liu , Yepang Liu , Xin Xia , Xiaoguang Mao

Understanding the relationships between protein sequence, structure and function is a long-standing biological challenge with manifold implications from drug design to our understanding of evolution. Recently, protein language models have…

Quantitative Methods · Quantitative Biology 2024-01-29 Dexiong Chen , Philip Hartout , Paolo Pellizzoni , Carlos Oliver , Karsten Borgwardt

Recent years have seen the successful application of large pre-trained models to code representation learning, resulting in substantial improvements on many code-related downstream tasks. But there are issues surrounding their application…

Software Engineering · Computer Science 2022-05-26 Changan Niu , Chuanyi Li , Vincent Ng , Jidong Ge , Liguo Huang , Bin Luo

Over the past few years, we have witnessed remarkable advancements in Code Pre-trained Models (CodePTMs). These models achieved excellent representation capabilities by designing structure-based pre-training tasks for code. However, how to…

Software Engineering · Computer Science 2024-04-12 Jiayi Wu , Renyu Zhu , Nuo Chen , Qiushi Sun , Xiang Li , Ming Gao
‹ Prev 1 2 3 10 Next ›