Related papers: Encoder-Decoder or Decoder-Only? Revisiting Encode…

200 papers

Code search is essential for code reuse, allowing developers to efficiently locate relevant code snippets. The advent of powerful decoder-only Large Language Models (LLMs) has revolutionized many code intelligence tasks. However, their…

Software Engineering · Computer Science 2026-04-23 Yuxuan Chen, Mingwei Liu, Guangsheng Ou, Anji Li, Dekun Dai, Yanlin Wang, Zibin Zheng

Large language models have become extremely popular recently due to their ability to achieve strong performance on a variety of tasks, such as text generation and rewriting, but their size and computation cost make them difficult to access,…

Computation and Language · Computer Science 2026-01-08 Anthony Lamelas

Large language models (LLMs), known for their exceptional reasoning capabilities, generalizability, and fluency across diverse domains, present a promising avenue for enhancing speech-related tasks. In this paper, we focus on integrating…

Computation and Language · Computer Science 2024-07-04 Chao-Wei Huang, Hui Lu, Hongyu Gong, Hirofumi Inaguma, Ilia Kulikov, Ruslan Mavlyutov, Sravya Popuri

Pre-trained language models based on masked language modeling (MLM) excel in natural language understanding (NLU) tasks. While fine-tuned MLM-based encoders consistently outperform causal language modeling decoders of comparable size,…

Computation and Language · Computer Science 2024-06-07 David Dukić, Jan Šnajder

While decoder-only large language models (LLMs) have shown impressive results, encoder-decoder models are still widely adopted in real-world applications for their inference efficiency and richer encoder representation. In this paper, we…

Computation and Language · Computer Science 2025-04-09 Biao Zhang, Fedor Moiseev, Joshua Ainslie, Paul Suganthan, Min Ma, Surya Bhupatiraju, Fede Lebron, Orhan Firat, Armand Joulin, Zhe Dong

Groundbreaking advancements in text-to-image generation have recently been achieved with the emergence of diffusion models. These models exhibit a remarkable ability to generate highly artistic and intricately detailed images based on…

Computer Vision and Pattern Recognition · Computer Science 2025-02-10 Ziyi Dong, Yao Xiao, Pengxu Wei, Liang Lin

Large language models (LLMs) have achieved remarkable success in the field of natural language processing, enabling better human-computer interaction using natural language. However, the seamless integration of speech signals into LLMs has…

Audio and Speech Processing · Electrical Eng. & Systems 2023-10-03 Jian Wu, Yashesh Gaur, Zhuo Chen, Long Zhou, Yimeng Zhu, Tianrui Wang, Jinyu Li, Shujie Liu, Bo Ren, Linquan Liu, Yu Wu

Learning high-quality text representations is fundamental to a wide range of NLP tasks. While encoder pretraining has traditionally relied on Masked Language Modeling (MLM), recent evidence suggests that decoder models pretrained with…

The field of neural machine translation (NMT) has changed with the advent of large language models (LLMs). Much of the recent emphasis in natural language processing (NLP) has been on modeling machine translation and many other problems…

Computation and Language · Computer Science 2025-06-03 Yingfeng Luo, Tong Zheng, Yongyu Mu, Bei Li, Qinghong Zhang, Yongqi Gao, Ziqiang Xu, Peinan Feng, Xiaoqian Liu, Tong Xiao, Jingbo Zhu

The dominance of large decoder-only language models has overshadowed encoder-decoder architectures, despite their fundamental efficiency advantages in sequence processing. For small language models (SLMs) - those with 1 billion parameters…

Computation and Language · Computer Science 2025-01-31 Mohamed Elfeki, Rui Liu, Chad Voegele

State-of-the-art neural models typically encode document-query pairs using cross-attention for re-ranking. To this end, models generally utilize an encoder-only (like BERT) paradigm or an encoder-decoder (like T5) approach. These paradigms,…

Computation and Language · Computer Science 2022-04-26 Kai Hui, Honglei Zhuang, Tao Chen, Zhen Qin, Jing Lu, Dara Bahri, Ji Ma, Jai Prakash Gupta, Cicero Nogueira dos Santos, Yi Tay, Don Metzler

Large language models (LLMs) based on decoder-only transformers have demonstrated superior text understanding capabilities compared to CLIP and T5-series models. However, the paradigm for utilizing current advanced LLMs in text-to-image…

Computer Vision and Pattern Recognition · Computer Science 2024-12-06 Bingqi Ma, Zhuofan Zong, Guanglu Song, Hongsheng Li, Yu Liu

Recently, decoder-only pre-trained large language models (LLMs), with several tens of billion parameters, have significantly impacted a wide range of natural language processing (NLP) tasks. While encoder-only or encoder-decoder pre-trained…

Computation and Language · Computer Science 2024-03-11 Aru Maekawa, Tsutomu Hirao, Hidetaka Kamigaito, Manabu Okumura

Large Language Models (LLMs) prompted to generate chain-of-thought (CoT) exhibit impressive reasoning capabilities. Recent attempts at prompt decomposition toward solving complex, multi-step reasoning problems depend on the ability of the…

Computation and Language · Computer Science 2024-02-28 Gurusha Juneja, Subhabrata Dutta, Soumen Chakrabarti, Sunny Manchanda, Tanmoy Chakraborty

Using Large Language Models (LLMs) for complex reasoning is often hindered by high computational costs and latency, while resource-efficient Small Language Models (SLMs) typically lack the necessary reasoning capacity. Existing collaborative…

Computation and Language · Computer Science 2026-01-09 Chengsong Huang, Tong Zheng, Langlin Huang, Jinyuan Li, Haolin Liu, Jiaxin Huang

Conventional research on large language models (LLMs) has primarily focused on refining output distributions, while paying less attention to the decoding process that transforms these distributions into final responses. Recent advances,…

Computation and Language · Computer Science 2025-10-28 Chenheng Zhang, Tianqi Du, Jizhe Zhang, Mingqing Xiao, Yifei Wang, Yisen Wang, Zhouchen Lin

In-context learning (ICL) underpins recent advances in large language models (LLMs), although its role and performance in causal reasoning remain unclear. Causal reasoning demands multihop composition and strict conjunctive control, and…

Computation and Language · Computer Science 2025-12-12 Amartya Roy, Elamparithy M, Kripabandhu Ghosh, Ponnurangam Kumaraguru, Adrian de Wynter

Diffusion large language models (dLLMs) are compelling alternatives to autoregressive (AR) models because their denoising models operate over the entire sequence. The global planning and iterative refinement features of dLLMs are…

Computation and Language · Computer Science 2025-06-27 Shansan Gong, Ruixiang Zhang, Huangjie Zheng, Jiatao Gu, Navdeep Jaitly, Lingpeng Kong, Yizhe Zhang

Speech Large Language Models route speech encoder representations into an LLM decoder that typically accounts for over 90% of total parameters. We study how much of this decoder capacity is actually needed for speech tasks. Across two LLM…

Computation and Language · Computer Science 2026-03-06 Adel Moumen, Guangzhi Sun, Philip C Woodland

Decoder-based transformers, while revolutionizing language modeling and scaling to immense sizes, have not completely overtaken encoder-heavy architectures in natural language processing. Specifically, encoder-only models remain dominant in…

Computation and Language · Computer Science 2025-03-05 Paul Suganthan, Fedor Moiseev, Le Yan, Junru Wu, Jianmo Ni, Jay Han, Imed Zitouni, Enrique Alfonseca, Xuanhui Wang, Zhe Dong