Related papers: Length-Aware Multi-Kernel Transformer for Long Doc…

200 papers

Transformer architectures are increasingly effective at processing and generating very long chunks of text, opening new perspectives for document-level machine translation (MT). In this work, we challenge the ability of MT systems to…

Computation and Language · Computer Science 2025-04-29 Ziqian Peng, Rachel Bawden, François Yvon

Text classification is an area of research which has been studied over the years in Natural Language Processing (NLP). Adapting NLP to multiple domains has introduced many new challenges for text classification and one of them is long…

Computation and Language · Computer Science 2023-07-20 Damith Premasiri, Tharindu Ranasinghe, Ruslan Mitkov

Many natural language processing tasks benefit from long inputs, but processing long documents with Transformers is expensive -- not only due to quadratic attention complexity but also from applying feedforward and projection layers to…

Several methods have been proposed for classifying long textual documents using Transformers. However, there is a lack of consensus on a benchmark to enable a fair comparison among different approaches. In this paper, we provide a…

Computation and Language · Computer Science 2022-03-23 Hyunji Hayley Park, Yogarshi Vyas, Kashif Shah

The most widely used large language models in the social sciences (such as BERT, and its derivatives, e.g. RoBERTa) have a limitation on the input text length that they can process to produce predictions. This is a particularly pressing…

Computation and Language · Computer Science 2025-09-30 Miklós Sebők, Viktor Kovács, Martin Bánóczy, Daniel Møller Eriksen, Nathalie Neptune, Philippe Roussille

Text classification algorithms investigate the intricate relationships between words or phrases and attempt to deduce the document's interpretation. In the last few years, these algorithms have progressed tremendously. Transformer…

Computation and Language · Computer Science 2022-06-28 Snehal Khandve, Vedangi Wagh, Apurva Wani, Isha Joshi, Raviraj Joshi

Transformer-based models, specifically BERT, have propelled research in various NLP tasks. However, these models are limited to a maximum of 512 tokens. Consequently, it is non-trivial to apply them in a practical setting…

Computation and Language · Computer Science 2023-11-01 Aman Jaiswal, Evangelos Milios

Recent advances in the area of long document matching have primarily focused on using transformer-based models for long document encoding and matching. There are two primary challenges associated with these models. Firstly, the performance…

Computation and Language · Computer Science 2023-02-09 Akshita Jha, Adithya Samavedhi, Vineeth Rakesh, Jaideep Chandrashekar, Chandan K. Reddy

Recent advancements in Large Language Models (LLMs) have pushed the boundaries of natural language processing, especially in long-context understanding. However, the evaluation of these models' long-context abilities remains a challenge due…

Computation and Language · Computer Science 2025-04-24 Cunxiang Wang, Ruoxi Ning, Boqi Pan, Tonghui Wu, Qipeng Guo, Cheng Deng, Guangsheng Bao, Xiangkun Hu, Zheng Zhang, Qian Wang, Yue Zhang

Despite several successes in document understanding, the practical task for long document understanding is largely under-explored due to several challenges in computation and how to efficiently absorb long multimodal input. Most current…

Computation and Language · Computer Science 2022-08-18 Hai Pham, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang

Long-context modeling is becoming a core capability of modern large vision-language models (LVLMs), enabling sustained context management across long-document understanding, video analysis, and multi-turn tool use in agentic workflows. Yet…

Computer Vision and Pattern Recognition · Computer Science 2026-05-14 Zhaowei Wang, Lishu Luo, Haodong Duan, Weiwei Liu, Sijin Wu, Ji Luo, Shen Yan, Shuai Peng, Sihang Yuan, Chaoyi Huang, Yi Lin, Yangqiu Song

Many natural language processing and information retrieval problems can be formalized as the task of semantic matching. Existing work in this area has been largely focused on matching between short texts (e.g., question answering), or…

Information Retrieval · Computer Science 2021-05-07 Liu Yang, Mingyang Zhang, Cheng Li, Michael Bendersky, Marc Najork

Although transformer architectures have achieved state-of-the-art performance across diverse domains, their quadratic computational complexity with respect to sequence length remains a significant bottleneck, particularly for…

Computation and Language · Computer Science 2025-11-05 Zeyu Liu, Souvik Kundu, Lianghao Jiang, Anni Li, Srikanth Ronanki, Sravan Bodapati, Gourav Datta, Peter A. Beerel

Multimodal Large Language Models (MLLMs) have achieved great success in Speech-to-Text Translation (S2TT) tasks. However, current research is constrained by two key challenges: language coverage and efficiency. Most of the popular S2TT…

Computation and Language · Computer Science 2026-04-14 Yexing Du, Kaiyuan Liu, Youcheng Pan, Bo Yang, Keqi Deng, Xie Chen, Yang Xiang, Ming Liu, Bing Qin, YaoWei Wang

Breaking down the structure of long texts into semantically coherent segments makes the texts more readable and supports downstream applications like summarization and retrieval. Starting from an apparent link between text coherence and…

Computation and Language · Computer Science 2020-01-06 Goran Glavaš, Swapna Somasundaran

Objective: Clinical knowledge enriched transformer models (e.g., ClinicalBERT) have state-of-the-art results on clinical NLP (natural language processing) tasks. One of the core limitations of these transformer models is the substantial…

Computation and Language · Computer Science 2023-01-30 Yikuan Li, Ramsey M. Wehbe, Faraz S. Ahmad, Hanyin Wang, Yuan Luo

Pre-trained Transformers currently dominate most NLP tasks. They impose, however, limits on the maximum input length (512 sub-words in BERT), which are too restrictive in the legal domain. Even sparse-attention models, such as Longformer…

Computation and Language · Computer Science 2022-11-11 Dimitris Mamakas, Petros Tsotsi, Ion Androutsopoulos, Ilias Chalkidis

Long document classification presents challenges in capturing both local and global dependencies due to their extensive content and complex structure. Existing methods often struggle with token limits and fail to adequately model…

Computation and Language · Computer Science 2024-10-07 Sudipta Singha Roy, Xindi Wang, Robert E. Mercer, Frank Rudzicz

Large, pre-trained transformer models like BERT have achieved state-of-the-art results on document understanding tasks, but most implementations can only consider 512 tokens at a time. For many real-world applications, documents can be much…

Computation and Language · Computer Science 2021-07-20 Allison Hegel, Marina Shah, Genevieve Peaslee, Brendan Roof, Emad Elwany

Non-hierarchical sparse attention Transformer-based models, such as Longformer and Big Bird, are popular approaches to working with long documents. There are clear benefits to these approaches compared to the original Transformer in terms…

Computation and Language · Computer Science 2022-10-12 Ilias Chalkidis, Xiang Dai, Manos Fergadiotis, Prodromos Malakasiotis, Desmond Elliott