Related papers: Conformer LLMs -- Convolution Augmented Large Lang…

200 papers

Conformer has proven to be effective in many speech processing tasks. It combines the benefits of extracting local dependencies using convolutions and global dependencies using self-attention. Inspired by this, we propose a more flexible,…

Computation and Language · Computer Science 2022-07-08 Yifan Peng, Siddharth Dalmia, Ian Lane, Shinji Watanabe

Recently Transformer and Convolution neural network (CNN) based models have shown promising results in Automatic Speech Recognition (ASR), outperforming Recurrent neural networks (RNNs). Transformer models are good at capturing…

Audio and Speech Processing · Electrical Eng. & Systems 2020-05-19 Anmol Gulati, James Qin, Chung-Cheng Chiu, Niki Parmar, Yu Zhang, Jiahui Yu, Wei Han, Shibo Wang, Zhengdong Zhang, Yonghui Wu, Ruoming Pang

Convolutions have become essential in state-of-the-art end-to-end Automatic Speech Recognition (ASR) systems due to their efficient modelling of local context. Notably, its use in Conformers has led to superior performance compared to…

Computation and Language · Computer Science 2024-07-25 Darshan Prabhu, Yifan Peng, Preethi Jyothi, Shinji Watanabe

Transformer-based Large Language Models (LLMs) have been applied in diverse areas such as knowledge bases, human interfaces, and dynamic agents, marking a stride towards achieving Artificial General Intelligence (AGI). However, current…

Computation and Language · Computer Science 2024-02-27 Yunpeng Huang, Jingwei Xu, Junyu Lai, Zixu Jiang, Taolue Chen, Zenan Li, Yuan Yao, Xiaoxing Ma, Lijuan Yang, Hao Chen, Shupeng Li, Penghao Zhao

In the rapidly evolving landscape of genomics, deep learning has emerged as a useful tool for tackling complex computational challenges. This review focuses on the transformative role of Large Language Models (LLMs), which are mostly based…

Large Language Models (LLMs) have delivered impressive results in language understanding, generation, and reasoning, and have pushed the ability boundary of multimodal models. Transformer models, as the foundation of modern LLMs, offer a strong…

Computation and Language · Computer Science 2025-08-14 Weigao Sun, Jiaxi Hu, Yucheng Zhou, Jusen Du, Disen Lan, Kexin Wang, Tong Zhu, Xiaoye Qu, Yu Zhang, Xiaoyu Mo, Daizong Liu, Yuxuan Liang, Wenliang Chen, Guoqi Li, Yu Cheng

Large Language Models (LLMs), powered by Transformers, have demonstrated human-like intelligence capabilities, yet their underlying mechanisms remain poorly understood. This paper presents a novel framework for interpreting LLMs as…

Computation and Language · Computer Science 2025-04-16 Phill Kyu Rhee

The Transformer architecture has become prominent in developing large causal language models. However, mechanisms to explain its capabilities are not well understood. Focused on the training process, here we establish a meta-learning view…

Machine Learning · Computer Science 2024-03-26 Xinbo Wu, Lav R. Varshney

Recent advancements in Large Language Models (LLMs), particularly those built on Transformer architectures, have significantly broadened the scope of natural language processing (NLP) applications, transcending their initial use in chatbot…

Computation and Language · Computer Science 2024-05-29 Chen Wang, Jin Zhao, Jiaqi Gong

Large Language Models (LLMs) represent a class of deep learning models adept at understanding natural language and generating coherent responses to various prompts or queries. These models far exceed the complexity of conventional neural…

Machine Learning · Computer Science 2024-12-05 Minghao Shao, Abdul Basit, Ramesh Karri, Muhammad Shafique

Transformer architectures are the backbone of the modern AI revolution. However, they are based on simply stacking the same blocks in dozens of layers and processing information sequentially from one block to another. In this paper, we…

Computation and Language · Computer Science 2024-12-24 Prateek Verma, Mert Pilanci

Pre-trained Transformer language models (LM) have become go-to text representation encoders. Prior research fine-tunes deep LMs to encode text sequences such as sentences and passages into single dense vector representations for efficient…

Computation and Language · Computer Science 2021-09-22 Luyu Gao, Jamie Callan

Even though large language models (LLMs) have demonstrated remarkable capability in solving various natural language tasks, the capability of an LLM to follow human instructions is still a concern. Recent works have shown great improvements…

Computation and Language · Computer Science 2024-03-05 Xinbo Wu, Lav R. Varshney

Convolutional neural networks (CNN) have improved speech recognition performance greatly by exploiting localized time-frequency patterns. But these patterns are assumed to appear in symmetric and rigid kernels by the conventional CNN…

Audio and Speech Processing · Electrical Eng. & Systems 2025-06-19 Jiamin Xie, John H. L. Hansen

Large language models have demonstrated remarkable performance across various tasks, yet they face challenges such as low computational efficiency, gradient vanishing, and difficulties in capturing complex feature interactions. To address…

Computation and Language · Computer Science 2025-03-21 Cheng Li, Jiexiong Liu, Yixuan Chen, Yanqin Jia, Zhepeng Li

Transformer architectures contribute to managing long-term dependencies for Natural Language Processing, representing one of the most recent changes in the field. These architectures are the basis of the innovative, cutting-edge Large…

Computation and Language · Computer Science 2024-09-06 Silvia García-Méndez, Francisco de Arriba-Pérez, María del Carmen Somoza-López

Transformers have become the dominant architecture for sequence modeling tasks such as natural language processing or audio processing, and they are now even considered for tasks that are not naturally sequential such as image…

Machine Learning · Computer Science 2024-03-05 Jorg Bornschein, Yazhe Li, Amal Rannen-Triki

Modeling long-term dependencies for audio signals is a particularly challenging problem, as even small-time scales yield on the order of a hundred thousand samples. With the recent advent of Transformers, neural architectures became good at…

Sound · Computer Science 2024-12-24 Prateek Verma

Large language models (LLMs) based on Transformer have been widely applied in the field of natural language processing (NLP), demonstrating strong performance, particularly in handling short text tasks. However, when it comes to long…

Computation and Language · Computer Science 2025-07-09 Yijun Liu, Jinzheng Yu, Yang Xu, Zhongyang Li, Qingfu Zhu

Multilingual Large Language Models (LLMs) can process many languages, yet how they internally represent this diversity remains unclear. Do they form shared multilingual representations with language-specific decoding, and if so, why does…

Computation and Language · Computer Science 2026-02-10 Abir Harrasse, Florent Draye, Punya Syon Pandey, Zhijing Jin, Bernhard Schölkopf