
Related papers: Memory-Augmented Generative Adversarial Transforme…

200 papers

Transformer encoder-decoder models have achieved great performance in dialogue generation tasks; however, their inability to process long dialogue history often leads to truncation of the context. To address this problem, we propose a novel…

Computation and Language · Computer Science 2023-05-24 Qingyang Wu, Zhou Yu

Large Language Models face significant challenges in maintaining coherent interactions over extended dialogues due to their limited contextual memory. This limitation often leads to fragmented exchanges and reduced relevance in responses,…

Machine Learning · Computer Science 2025-06-24 Haseeb Ullah Khan Shinwari, Muhammad Usama

Memory is fundamental to intelligence, enabling learning, reasoning, and adaptability across biological and artificial systems. While Transformer architectures excel at sequence modeling, they face critical limitations in long-range context…

Machine Learning · Computer Science 2025-08-19 Parsa Omidi, Xingshuai Huang, Axel Laborieux, Bahareh Nikpour, Tianyu Shi, Armaghan Eshaghi

Conversational agents struggle to handle long conversations due to context window limitations. Therefore, memory systems are developed to leverage essential historical information. Existing memory systems typically follow a pipeline of…

Computation and Language · Computer Science 2026-01-30 Yimin Deng, Yuqing Fu, Derong Xu, Yejing Wang, Wei Ni, Jingtong Gao, Xiaopeng Li, Chengxu Liu, Xiao Han, Guoshuai Zhao, Xiangyu Zhao, Li Zhu, Xueming Qian

Transformer-based models have achieved state-of-the-art results in many natural language processing tasks. The self-attention architecture allows a transformer to combine information from all elements of a sequence into context-aware…

Computation and Language · Computer Science 2021-02-17 Mikhail S. Burtsev, Yuri Kuratov, Anton Peganov, Grigory V. Sapunov

Transformers have revolutionized deep learning in numerous fields, including natural language processing, computer vision, and audio processing. Their strength lies in their attention mechanism, which allows for the discovery of complex…

Machine Learning · Computer Science 2024-04-02 Uladzislau Yorsh, Martin Holeňa, Ondřej Bojar, David Herel

With the rapid development of large language models, AI assistants like ChatGPT have become increasingly integrated into people's work and lives but are limited in personalized services. In this paper, we present a plug-and-play framework…

Computation and Language · Computer Science 2024-10-15 Ruifeng Yuan, Shichao Sun, Yongqi Li, Zili Wang, Ziqiang Cao, Wenjie Li

Pre-trained language models demonstrate general intelligence and common sense, but long inputs quickly become a bottleneck for memorizing information at inference time. We resurface a simple method, Memorizing Transformers (Wu et al.,…

Machine Learning · Computer Science 2024-06-05 Phoebe Klett, Thomas Ahle

Large Transformer models have achieved impressive performance in many natural language tasks. In particular, Transformer based language models have been shown to have great capabilities in encoding factual knowledge in their vast amount of…

Computation and Language · Computer Science 2020-12-02 Chen Zhu, Ankit Singh Rawat, Manzil Zaheer, Srinadh Bhojanapalli, Daliang Li, Felix Yu, Sanjiv Kumar

The computational complexity of the self-attention mechanism in Transformer models significantly limits their ability to generalize over long temporal durations. Memory-augmentation, or the explicit storing of past information in external…

Computation and Language · Computer Science 2022-11-29 Omri Raccah, Phoebe Chen, Ted L. Willke, David Poeppel, Vy A. Vo

Current Conversational AI systems employ different machine learning pipelines, as well as external knowledge sources and business logic to predict the next action. Maintaining various components in dialogue managers' pipeline adds…

Computation and Language · Computer Science 2024-04-15 Amin Hosseiny Marani, Ulie Schnaithmann, Youngseo Son, Akil Iyer, Manas Paldhe, Arushi Raghuvanshi

Transformer-based language models have achieved impressive success in various natural language processing tasks due to their ability to capture complex dependencies and contextual information using self-attention mechanisms. However, they…

Computation and Language · Computer Science 2023-06-26 Kaushik Roy, Yuxin Zi, Vignesh Narayanan, Manas Gaur, Amit Sheth

This paper studies interpretable and fair artificial intelligence architectures for understanding English reading. It introduces transformer-based models that integrate advanced attention mechanisms and gradient-based feature attribution. The…

Computation and Language · Computer Science 2026-04-28 Ping Li

The Transformer architecture has led to significant gains in machine translation. However, most studies focus on only sentence-level translation without considering the context dependency within documents, leading to the inadequacy of…

Artificial Intelligence · Computer Science 2022-10-21 Yukun Feng, Feng Li, Ziang Song, Boyuan Zheng, Philipp Koehn

Transformers are unable to model long-term memories effectively, since the amount of computation they need to perform grows with the context length. While variations of efficient transformers have been proposed, they all have a finite…

Computation and Language · Computer Science 2022-03-28 Pedro Henrique Martins, Zita Marinho, André F. T. Martins

Neural machine translation systems tend to fail on less decent inputs despite their significant efficacy, which may significantly harm the credibility of these systems. Fathoming how and when neural-based systems fail in such cases is critical…

Computation and Language · Computer Science 2020-05-27 Wei Zou, Shujian Huang, Jun Xie, Xinyu Dai, Jiajun Chen

Recent advancements in large language models have demonstrated that extended inference through techniques can markedly improve performance, yet these gains come with increased computational costs and the propagation of inherent biases found…

Computation and Language · Computer Science 2025-02-10 Edward Hong Wang, Cynthia Xin Wen

Transformer-based models have become ubiquitous in natural language processing thanks to their large capacity, innate parallelism and high performance. The contextualizing component of a Transformer block is the $\textit{pairwise…

Machine Learning · Computer Science 2020-06-08 Ankit Gupta, Jonathan Berant

Large Language Models (LLMs) represent a landmark achievement in Artificial Intelligence (AI), demonstrating unprecedented proficiency in procedural tasks such as text generation, code completion, and conversational coherence. These…

Artificial Intelligence · Computer Science 2025-05-07 Schaun Wheeler, Olivier Jeunen

Transformer-based models have demonstrated their effectiveness in automatic speech recognition (ASR) tasks and even shown superior performance over the conventional hybrid framework. The main idea of Transformers is to capture the…

Sound · Computer Science 2022-07-05 Kun Wei, Pengcheng Guo, Ning Jiang