Related papers: A Transformer-based Approach for Source Code Summa…

Length-controllable Abstractive Summarization by Guiding with Summary Prototype

We propose a new length-controllable abstractive summarization model. Recent state-of-the-art abstractive summarization models based on encoder-decoder models generate only one summary per source text. However, controllable summarization,…

Computation and Language · Computer Science 2020-01-22 Itsumi Saito, Kyosuke Nishida, Kosuke Nishida, Atsushi Otsuka, Hisako Asano, Junji Tomita, Hiroyuki Shindo, Yuji Matsumoto

Source code summarization involves creating brief descriptions of source code in natural language. These descriptions are a key component of software documentation such as JavaDocs. Automatic code summarization is a prized target of…

Software Engineering · Computer Science 2022-04-05 Sakib Haque, Zachary Eberhart, Aakash Bansal, Collin McMillan

Explicit Reordering for Neural Machine Translation

In Transformer-based neural machine translation (NMT), the positional encoding mechanism helps the self-attention networks to learn the source representation with order dependency, which makes the Transformer-based NMT achieve…

Computation and Language · Computer Science 2020-04-09 Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita

Transformer-Based Autonomous Driving Models and Deployment-Oriented Compression: A Survey

Transformer-based models are becoming a central paradigm in autonomous driving because they can capture long-range spatial dependencies, multi-agent interactions, and multimodal context across perception, prediction, and planning. At the…

Machine Learning · Computer Science 2026-05-13 Juan Zhong, Yuhang Shi, Zukang Xu, Xi Chen

Input Combination Strategies for Multi-Source Transformer Decoder

In multi-source sequence-to-sequence tasks, the attention mechanism can be modeled in several ways. This topic has been thoroughly studied on recurrent architectures. In this paper, we extend the previous work to the encoder-decoder…

Computation and Language · Computer Science 2018-11-13 Jindřich Libovický, Jindřich Helcl, David Mareček

Program Understanding: A Reengineering Case for the Transformation Tool Contest

In Software Reengineering, one of the central artifacts is the source code of the legacy system in question. In fact, in most cases it is the only definitive artifact, because over the time the code has diverged from the original…

Software Engineering · Computer Science 2011-11-22 Tassilo Horn

A Decoding Algorithm for Length-Control Summarization Based on Directed Acyclic Transformers

Length-control summarization aims to condense long texts into a short one within a certain length limit. Previous approaches often use autoregressive (AR) models and treat the length requirement as a soft constraint, which may not always be…

Computation and Language · Computer Science 2025-02-10 Chenyang Huang, Hao Zhou, Cameron Jen, Kangjie Zheng, Osmar R. Zaïane, Lili Mou

Transformer Dissection: A Unified Understanding of Transformer's Attention via the Lens of Kernel

Transformer is a powerful architecture that achieves superior performance on various sequence learning tasks, including neural machine translation, language understanding, and sequence prediction. At the core of the Transformer is the…

Machine Learning · Computer Science 2019-11-13 Yao-Hung Hubert Tsai, Shaojie Bai, Makoto Yamada, Louis-Philippe Morency, Ruslan Salakhutdinov

Code Representation Learning with Prüfer Sequences

An effective and efficient encoding of the source code of a computer program is critical to the success of sequence-to-sequence deep neural network models for tasks in computer program comprehension, such as automated code summarization and…

Artificial Intelligence · Computer Science 2021-11-16 Tenzin Jinpa, Yong Gao

Learning to Summarize Long Texts with Memory Compression and Transfer

We introduce Mem2Mem, a memory-to-memory mechanism for hierarchical recurrent neural network based encoder decoder architectures and we explore its use for abstractive document summarization. Mem2Mem transfers "memories" via…

Computation and Language · Computer Science 2020-10-23 Jaehong Park, Jonathan Pilault, Christopher Pal

Transformers predicting the future. Applying attention in next-frame and time series forecasting

Recurrent Neural Networks were, until recently, one of the best ways to capture the timely dependencies in sequences. However, with the introduction of the Transformer, it has been proven that an architecture with only attention-mechanisms…

Machine Learning · Computer Science 2021-08-19 Radostin Cholakov, Todor Kolev

Shortformer: Better Language Modeling using Shorter Inputs

Increasing the input length has been a driver of progress in language modeling with transformers. We identify conditions where shorter inputs are not harmful, and achieve perplexity and efficiency improvements through two new methods that…

Computation and Language · Computer Science 2021-06-04 Ofir Press, Noah A. Smith, Mike Lewis

Assessment of Transformer-Based Encoder-Decoder Model for Human-Like Summarization

In recent times, extracting valuable information from large text is making significant progress. Especially in the current era of social media, people expect quick bites of information. Automatic text summarization seeks to tackle this by…

Computation and Language · Computer Science 2024-10-23 Sindhu Nair, Y. S. Rao, Radha Shankarmani

MeetSum: Transforming Meeting Transcript Summarization using Transformers!

Creating abstractive summaries from meeting transcripts has proven to be challenging due to the limited amount of labeled data available for training neural network models. Moreover, Transformer-based architectures have proven to beat…

Computation and Language · Computer Science 2021-08-16 Nima Sadri, Bohan Zhang, Bihan Liu

TransfoRNN: Capturing the Sequential Information in Self-Attention Representations for Language Modeling

In this paper, we describe the use of recurrent neural networks to capture sequential information from the self-attention representations to improve the Transformers. Although self-attention mechanism provides a means to exploit long…

Computation and Language · Computer Science 2021-04-06 Tze Yuang Chong, Xuyang Wang, Lin Yang, Junjie Wang

Principled Understanding of Generalization for Generative Transformer Models in Arithmetic Reasoning Tasks

Transformer-based models excel in various tasks but their generalization capabilities, especially in arithmetic reasoning, remain incompletely understood. Arithmetic tasks provide a controlled framework to explore these capabilities, yet…

Machine Learning · Computer Science 2025-08-07 Xingcheng Xu, Zibo Zhao, Haipeng Zhang, Yanqing Yang

Sequence Complementor: Complementing Transformers For Time Series Forecasting with Learnable Sequences

Since its introduction, the transformer has shifted the development trajectory away from traditional models (e.g., RNN, MLP) in time series forecasting, which is attributed to its ability to capture global dependencies within temporal…

Machine Learning · Computer Science 2025-01-07 Xiwen Chen, Peijie Qiu, Wenhui Zhu, Huayu Li, Hao Wang, Aristeidis Sotiras, Yalin Wang, Abolfazl Razi

Horizontal and Vertical Attention in Transformers

Transformers are built upon multi-head scaled dot-product attention and positional encoding, which aim to learn the feature representations and token dependencies. In this work, we focus on enhancing the distinctive representation by…

Computer Vision and Pattern Recognition · Computer Science 2022-07-12 Litao Yu, Jian Zhang

Transformer-Based Source-Free Domain Adaptation

In this paper, we study the task of source-free domain adaptation (SFDA), where the source data are not available during target adaptation. Previous works on SFDA mainly focus on aligning the cross-domain distributions. However, they ignore…

Computer Vision and Pattern Recognition · Computer Science 2021-06-01 Guanglei Yang, Hao Tang, Zhun Zhong, Mingli Ding, Ling Shao, Nicu Sebe, Elisa Ricci

SPT-Code: Sequence-to-Sequence Pre-Training for Learning Source Code Representations

Recent years have seen the successful application of large pre-trained models to code representation learning, resulting in substantial improvements on many code-related downstream tasks. But there are issues surrounding their application…

Software Engineering · Computer Science 2022-05-26 Changan Niu, Chuanyi Li, Vincent Ng, Jidong Ge, Liguo Huang, Bin Luo