Related papers: Formal Algorithms for Transformers

200 papers

Transformers have dominated empirical machine learning models of natural language processing. In this paper, we introduce basic concepts of Transformers and present key techniques that form the recent advances of these models. This includes…

Computation and Language · Computer Science 2023-11-30 Tong Xiao, Jingbo Zhu

The transformer is a neural network component that can be used to learn useful representations of sequences or sets of data-points. The transformer has driven recent advances in natural language processing, computer vision, and…

Machine Learning · Computer Science 2026-01-21 Richard E. Turner

Transformers are a neural network architecture originally developed for natural language processing, which have since become a foundational tool for solving a wide range of problems, including text, audio, image processing, reinforcement…

Computation and Language · Computer Science 2025-05-06 Jordi de la Torre

The introduction of Transformers architecture has brought about significant breakthroughs in Deep Learning (DL), particularly within Natural Language Processing (NLP). Since their inception, Transformers have outperformed many traditional…

Robotics · Computer Science 2024-12-17 Nikunj Sanghai, Nik Bear Brown

Transformer architecture has widespread applications, particularly in Natural Language Processing and computer vision. Recently Transformers have been employed in various aspects of time-series analysis. This tutorial provides an overview…

Machine Learning · Computer Science 2023-07-27 Sabeen Ahmed, Ian E. Nielsen, Aakash Tripathi, Shamoon Siddiqui, Ghulam Rasool, Ravi P. Ramachandran

Transformer has been considered the dominating neural architecture in NLP and CV, mostly under supervised settings. Recently, a similar surge of using Transformers has appeared in the domain of reinforcement learning (RL), but it is faced…

Machine Learning · Computer Science 2023-09-22 Wenzhe Li, Hao Luo, Zichuan Lin, Chongjie Zhang, Zongqing Lu, Deheng Ye

Understanding the transformer architecture and its workings is essential for machine learning (ML) engineers. However, truly understanding the transformer architecture can be demanding, even if you have a solid background in machine…

Machine Learning · Computer Science 2025-02-28 Joni-Kristian Kämäräinen

Much theoretical work has described the ability of transformers to represent formal languages. However, linking theoretical results to empirical performance is not straightforward due to the complex interplay between the architecture, the…

Computation and Language · Computer Science 2024-10-07 Anej Svete, Nadav Borenstein, Mike Zhou, Isabelle Augenstein, Ryan Cotterell

The fields of generative AI and transfer learning have experienced remarkable advancements in recent years especially in the domain of Natural Language Processing (NLP). Transformers have been at the heart of these advancements where the…

Computation and Language · Computer Science 2024-02-28 Majd Saleh, Stéphane Paquelet

We establish connections between the Transformer architecture, originally introduced for natural language processing, and Graph Neural Networks (GNNs) for representation learning on graphs. We show how Transformers can be viewed as message…

Machine Learning · Computer Science 2025-06-30 Chaitanya K. Joshi

With Artificial Intelligence (AI) increasingly permeating various aspects of society, including healthcare, the adoption of the Transformers neural network architecture is rapidly changing many applications. Transformer is a type of deep…

This document provides a brief introduction to the attention mechanism used in modern language models based on the Transformer architecture. We first illustrate how text is encoded as vectors and how the attention mechanism processes these…

Numerical Analysis · Mathematics 2026-04-02 Michel Fabrice Serret

Transformers have significantly impacted domains like natural language processing, computer vision, and robotics, where they improve performance compared to other neural networks. This survey explores how transformers are used in…

Machine Learning · Computer Science 2023-07-13 Pranav Agarwal, Aamer Abdul Rahman, Pierre-Luc St-Charles, Simon J. D. Prince, Samira Ebrahimi Kahou

Transformer model architectures have garnered immense interest lately due to their effectiveness across a range of domains like language, vision and reinforcement learning. In the field of natural language processing for example,…

Machine Learning · Computer Science 2022-03-15 Yi Tay, Mostafa Dehghani, Dara Bahri, Donald Metzler

Transformers have dominated the field of natural language processing, and recently impacted the computer vision area. In the field of medical image analysis, Transformers have also been successfully applied to full-stack clinical…

Computer Vision and Pattern Recognition · Computer Science 2022-08-22 Kelei He, Chen Gan, Zhuoyuan Li, Islem Rekik, Zihao Yin, Wen Ji, Yang Gao, Qian Wang, Junfeng Zhang, Dinggang Shen

In the rapidly evolving landscape of genomics, deep learning has emerged as a useful tool for tackling complex computational challenges. This review focuses on the transformative role of Large Language Models (LLMs), which are mostly based…

As transformers have gained prominence in natural language processing, some researchers have investigated theoretically what problems they can and cannot solve, by treating problems as formal languages. Exploring such questions can help…

Machine Learning · Computer Science 2024-09-05 Lena Strobl, William Merrill, Gail Weiss, David Chiang, Dana Angluin

Pretraining on large, semantically rich datasets is key for developing language models. Surprisingly, recent studies have shown that even synthetic data, generated procedurally through simple semantic-free algorithms, can yield some of the…

Machine Learning · Computer Science 2025-05-29 Zachary Shinnick, Liangze Jiang, Hemanth Saratchandran, Anton van den Hengel, Damien Teney

Many NLP applications require models to be interpretable. However, many successful neural architectures, including transformers, still lack effective interpretation methods. A possible solution could rely on building explanations from…

Computation and Language · Computer Science 2024-04-04 Federico Ruggeri, Marco Lippi, Paolo Torroni

Much algorithmic research in NLP aims to efficiently manipulate rich formal structures. An algorithm designer typically seeks to provide guarantees about their proposed algorithm -- for example, that its running time or space complexity is…

Programming Languages · Computer Science 2025-12-30 Tim Vieira, Ryan Cotterell, Jason Eisner