Related papers: Dynamically Relative Position Encoding-Based Trans…

Dynamic Position Encoding for Transformers

Recurrent models have been dominating the field of neural machine translation (NMT) for the past few years. Transformers \citep{vaswani2017attention} have radically changed it by proposing a novel architecture that relies on a feed-forward…

Computation and Language · Computer Science 2022-10-25 Joyce Zheng, Mehdi Rezagholizadeh, Peyman Passban

On Learning Meaningful Code Changes via Neural Machine Translation

Recent years have seen the rise of Deep Learning (DL) techniques applied to source code. Researchers have exploited DL to automate several development and maintenance tasks, such as writing commit messages, generating comments and detecting…

Software Engineering · Computer Science 2019-01-29 Michele Tufano, Jevgenija Pantiuchina, Cody Watson, Gabriele Bavota, Denys Poshyvanyk

Studying the Usage of Text-To-Text Transfer Transformer to Support Code-Related Tasks

Deep learning (DL) techniques are gaining more and more attention in the software engineering community. They have been used to support several code-related tasks, such as automatic bug fixing and code comments generation. Recent studies in…

Software Engineering · Computer Science 2021-02-04 Antonio Mastropaolo, Simone Scalabrino, Nathan Cooper, David Nader Palacio, Denys Poshyvanyk, Rocco Oliveto, Gabriele Bavota

Using Transfer Learning for Code-Related Tasks

Deep learning (DL) techniques have been used to support several code-related tasks such as code summarization and bug-fixing. In particular, pre-trained transformer models are on the rise, also thanks to the excellent results they achieved…

Software Engineering · Computer Science 2022-06-20 Antonio Mastropaolo, Nathan Cooper, David Nader Palacio, Simone Scalabrino, Denys Poshyvanyk, Rocco Oliveto, Gabriele Bavota

A Simple and Effective Positional Encoding for Transformers

Transformer models are permutation equivariant. To supply the order and type information of the input tokens, position and segment embeddings are usually added to the input. Recent works proposed variations of positional encodings with…

Computation and Language · Computer Science 2021-11-04 Pu-Chin Chen, Henry Tsai, Srinadh Bhojanapalli, Hyung Won Chung, Yin-Wen Chang, Chun-Sung Ferng

DTMT: A Novel Deep Transition Architecture for Neural Machine Translation

Past years have witnessed rapid developments in Neural Machine Translation (NMT). Most recently, with advanced modeling and training techniques, the RNN-based NMT (RNMT) has shown its potential strength, even compared with the well-known…

Computation and Language · Computer Science 2019-07-17 Fandong Meng, Jinchao Zhang

Supervised Pretraining Can Learn In-Context Reinforcement Learning

Large transformer models trained on diverse datasets have shown a remarkable ability to learn in-context, achieving high few-shot performance on tasks they were not explicitly trained to solve. In this paper, we study the in-context…

Machine Learning · Computer Science 2023-06-27 Jonathan N. Lee, Annie Xie, Aldo Pacchiano, Yash Chandak, Chelsea Finn, Ofir Nachum, Emma Brunskill

DPText-DETR: Towards Better Scene Text Detection with Dynamic Points in Transformer

Recently, Transformer-based methods, which predict polygon points or Bezier curve control points for localizing texts, are popular in scene text detection. However, these methods built upon detection transformer framework might achieve…

Computer Vision and Pattern Recognition · Computer Science 2022-11-29 Maoyuan Ye, Jing Zhang, Shanshan Zhao, Juhua Liu, Bo Du, Dacheng Tao

Dynamic Visual Prompt Tuning for Parameter Efficient Transfer Learning

Parameter efficient transfer learning (PETL) is an emerging research spot that aims to adapt large-scale pre-trained models to downstream tasks. Recent advances have achieved great success in saving storage and computation costs. However,…

Computer Vision and Pattern Recognition · Computer Science 2023-09-13 Chunqing Ruan, Hongjian Wang

Relative Positional Encoding for Speech Recognition and Direct Translation

Transformer models are powerful sequence-to-sequence architectures that are capable of directly mapping speech inputs to transcriptions or translations. However, the mechanism for modeling positions in this model was tailored for text…

Audio and Speech Processing · Electrical Eng. & Systems 2020-05-21 Ngoc-Quan Pham, Thanh-Le Ha, Tuan-Nam Nguyen, Thai-Son Nguyen, Elizabeth Salesky, Sebastian Stueker, Jan Niehues, Alexander Waibel

P-Transformer: Towards Better Document-to-Document Neural Machine Translation

Directly training a document-to-document (Doc2Doc) neural machine translation (NMT) via Transformer from scratch, especially on small datasets usually fails to converge. Our dedicated probing tasks show that 1) both the absolute position…

Computation and Language · Computer Science 2022-12-13 Yachao Li, Junhui Li, Jing Jiang, Shimin Tao, Hao Yang, Min Zhang

Deformable Graph Transformer

Transformer-based models have recently shown success in representation learning on graph-structured data beyond natural language processing and computer vision. However, the success is limited to small-scale graphs due to the drawbacks of…

Machine Learning · Computer Science 2022-10-05 Jinyoung Park, Seongjun Yun, Hyeonjin Park, Jaewoo Kang, Jisu Jeong, Kyung-Min Kim, Jung-woo Ha, Hyunwoo J. Kim

Stack Trace-Based Crash Deduplication with Transformer Adaptation

Automated crash reporting systems generate large volumes of duplicate reports, overwhelming issue-tracking systems and increasing developer workload. Traditional stack trace-based deduplication methods, relying on string similarity,…

Software Engineering · Computer Science 2025-08-28 Md Afif Al Mamun, Gias Uddin, Lan Xia, Longyu Zhang

A Dynamic Transformer Network for Vehicle Detection

Stable consumer electronic systems can assist traffic better. Good traffic consumer electronic systems require collaborative work between traffic algorithms and hardware. However, performance of popular traffic algorithms containing vehicle…

Computer Vision and Pattern Recognition · Computer Science 2025-06-04 Chunwei Tian, Kai Liu, Bob Zhang, Zhixiang Huang, Chia-Wen Lin, David Zhang

Explicit Reordering for Neural Machine Translation

In Transformer-based neural machine translation (NMT), the positional encoding mechanism helps the self-attention networks to learn the source representation with order dependency, which makes the Transformer-based NMT achieve…

Computation and Language · Computer Science 2020-04-09 Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita

Training Deeper Neural Machine Translation Models with Transparent Attention

While current state-of-the-art NMT models, such as RNN seq2seq and Transformers, possess a large number of parameters, they are still shallow in comparison to convolutional models used for both text and vision applications. In this work we…

Computation and Language · Computer Science 2018-09-06 Ankur Bapna, Mia Xu Chen, Orhan Firat, Yuan Cao, Yonghui Wu

Rethinking and Improving Relative Position Encoding for Vision Transformer

Relative position encoding (RPE) is important for transformer to capture sequence ordering of input tokens. General efficacy has been proven in natural language processing. However, in computer vision, its efficacy is not well studied and…

Computer Vision and Pattern Recognition · Computer Science 2021-07-30 Kan Wu, Houwen Peng, Minghao Chen, Jianlong Fu, Hongyang Chao

DeMT: Deformable Mixer Transformer for Multi-Task Learning of Dense Prediction

Convolution neural networks (CNNs) and Transformers have their own advantages and both have been widely used for dense prediction in multi-task learning (MTL). Most of the current studies on MTL solely rely on CNN or Transformer. In this…

Computer Vision and Pattern Recognition · Computer Science 2023-03-07 Yangyang Xu, Yibo Yang, Lefei Zhang

Exploring the Performance and Efficiency of Transformer Models for NLP on Mobile Devices

Deep learning (DL) is characterised by its dynamic nature, with new deep neural network (DNN) architectures and approaches emerging every few years, driving the field's advancement. At the same time, the ever-increasing use of mobile…

Machine Learning · Computer Science 2023-07-25 Ioannis Panopoulos, Sokratis Nikolaidis, Stylianos I. Venieris, Iakovos S. Venieris

Relation DETR: Exploring Explicit Position Relation Prior for Object Detection

This paper presents a general scheme for enhancing the convergence and performance of DETR (DEtection TRansformer). We investigate the slow convergence problem in transformers from a new perspective, suggesting that it arises from the…

Computer Vision and Pattern Recognition · Computer Science 2024-07-17 Xiuquan Hou, Meiqin Liu, Senlin Zhang, Ping Wei, Badong Chen, Xuguang Lan