Related papers: Query-driven Segment Selection for Ranking Long Do…

The Power of Selecting Key Blocks with Local Pre-ranking for Long Document Information Retrieval

On a wide range of natural language processing and information retrieval tasks, transformer-based models, particularly pre-trained language models like BERT, have demonstrated tremendous effectiveness. Due to the quadratic complexity of the…

Information Retrieval · Computer Science 2022-10-18 Minghan Li, Diana Nicoleta Popa, Johan Chagnon, Yagmur Gizem Cinar, Eric Gaussier

Hierarchical Transformers for Long Document Classification

BERT, which stands for Bidirectional Encoder Representations from Transformers, is a recently introduced language representation model based upon the transfer learning paradigm. We extend its fine-tuning procedure to address one of its…

Computation and Language · Computer Science 2019-10-25 Raghavendra Pappagari, Piotr Żelasko, Jesús Villalba, Yishay Carmiel, Najim Dehak

Long Document Ranking with Query-Directed Sparse Transformer

The computing cost of transformer self-attention often necessitates breaking long documents to fit in pretrained models in document ranking tasks. In this paper, we design Query-Directed Sparse attention that induces IR-axiomatic structures…

Artificial Intelligence · Computer Science 2020-10-27 Jyun-Yu Jiang, Chenyan Xiong, Chia-Jung Lee, Wei Wang

Pretrained Transformers for Text Ranking: BERT and Beyond

The goal of text ranking is to generate an ordered list of texts retrieved from a corpus in response to a query. Although the most common formulation of text ranking is search, instances of the task can also be found in many natural…

Information Retrieval · Computer Science 2021-08-20 Jimmy Lin, Rodrigo Nogueira, Andrew Yates

Document Ranking with a Pretrained Sequence-to-Sequence Model

This work proposes a novel adaptation of a pretrained sequence-to-sequence model to the task of document ranking. Our approach is fundamentally different from a commonly-adopted classification-based formulation of ranking, based on…

Information Retrieval · Computer Science 2020-03-17 Rodrigo Nogueira, Zhiying Jiang, Jimmy Lin

Input-length-shortening and text generation via attention values

Identifying words that impact a task's performance more than others is a challenge in natural language processing. Transformer models have recently addressed this issue by incorporating an attention mechanism that assigns greater attention…

Computation and Language · Computer Science 2023-03-15 Neşet Özkan Tan, Alex Yuxuan Peng, Joshua Bensemann, Qiming Bao, Tim Hartill, Mark Gahegan, Michael Witbrock

Transformer Based Language Models for Similar Text Retrieval and Ranking

Most approaches for similar text retrieval and ranking with long natural language queries rely at some level on queries and responses having words in common with each other. Recent applications of transformer-based neural language models to…

Information Retrieval · Computer Science 2020-05-22 Javed Qadrud-Din, Ashraf Bah Rabiou, Ryan Walker, Ravi Soni, Martin Gajek, Gabriel Pack, Akhil Rangaraj

The Surprising Effectiveness of Rankers Trained on Expanded Queries

An important problem in text-ranking systems is handling the hard queries that form the tail end of the query distribution. The difficulty may arise due to the presence of uncommon, underspecified, or incomplete queries. In this work, we…

Information Retrieval · Computer Science 2024-06-13 Abhijit Anand, Venktesh V, Vinay Setty, Avishek Anand

Improving Transformer Based Line Segment Detection with Matched Predicting and Re-ranking

Classical Transformer-based line segment detection methods have delivered impressive results. However, we observe that some accurately detected line segments are assigned low confidence scores during prediction, causing them to be ranked…

Computer Vision and Pattern Recognition · Computer Science 2025-02-26 Xin Tong, Shi Peng, Baojie Tian, Yufei Guo, Xuhui Huang, Zhe Ma

Efficient and Effective Query Context-Aware Learning-to-Rank Model for Sequential Recommendation

Modern sequential recommender systems commonly use transformer-based models for next-item prediction. While these models demonstrate a strong balance between efficiency and quality, integrating interleaving features - such as the query…

Information Retrieval · Computer Science 2025-08-13 Andrii Dzhoha, Alisa Mironenko, Evgeny Labzin, Vladimir Vlasov, Maarten Versteegh, Marjan Celikik

Query-Based Keyphrase Extraction from Long Documents

Transformer-based architectures in natural language processing force input size limits that can be problematic when long documents need to be processed. This paper overcomes this issue for keyphrase extraction by chunking the long documents…

Computation and Language · Computer Science 2022-05-12 Martin Docekal, Pavel Smrz

Efficient Classification of Long Documents Using Transformers

Several methods have been proposed for classifying long textual documents using Transformers. However, there is a lack of consensus on a benchmark to enable a fair comparison among different approaches. In this paper, we provide a…

Computation and Language · Computer Science 2022-03-23 Hyunji Hayley Park, Yogarshi Vyas, Kashif Shah

Improving the Efficiency of Long Document Classification using Sentence Ranking Approach

Long document classification poses challenges due to the computational limitations of transformer-based models, particularly BERT, which are constrained by fixed input lengths and quadratic attention complexity. Moreover, using the full…

Computation and Language · Computer Science 2025-06-24 Prathamesh Kokate, Mitali Sarnaik, Manavi Khopade, Raviraj Joshi

BERTSel: Answer Selection with Pre-trained Models

Recently, pre-trained models have been the dominant paradigm in natural language processing. They achieved remarkable state-of-the-art performance across a wide range of related tasks, such as textual entailment, natural language inference,…

Computation and Language · Computer Science 2019-05-21 Dongfang Li, Yifei Yu, Qingcai Chen, Xinyu Li

An In-depth Analysis of Passage-Level Label Transfer for Contextual Document Ranking

Pre-trained contextual language models such as BERT, GPT, and XLNet work quite well for document retrieval tasks. Such models are fine-tuned based on the query-document/query-passage level relevance labels to capture the ranking signals.…

Information Retrieval · Computer Science 2023-12-07 Koustav Rudra, Zeon Trevor Fernando, Avishek Anand

SPECTER: Document-level Representation Learning using Citation-informed Transformers

Representation learning is a critical ingredient for natural language processing systems. Recent Transformer language models like BERT learn powerful textual representations, but these models are targeted towards token- and sentence-level…

Computation and Language · Computer Science 2020-05-21 Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel S. Weld

Hierarchical Neural Network Approaches for Long Document Classification

Text classification algorithms investigate the intricate relationships between words or phrases and attempt to deduce the document's interpretation. In the last few years, these algorithms have progressed tremendously. Transformer…

Computation and Language · Computer Science 2022-06-28 Snehal Khandve, Vedangi Wagh, Apurva Wani, Isha Joshi, Raviraj Joshi

BERT Rankers are Brittle: a Study using Adversarial Document Perturbations

Contextual ranking models based on BERT are now well established for a wide range of passage and document ranking tasks. However, the robustness of BERT-based ranking models under adversarial inputs is under-explored. In this paper, we…

Information Retrieval · Computer Science 2022-06-24 Yumeng Wang, Lijun Lyu, Avishek Anand

How Different are Pre-trained Transformers for Text Ranking?

In recent years, large pre-trained transformers have led to substantial gains in performance over traditional retrieval models and feedback approaches. However, these results are primarily based on the MS MARCO/TREC Deep Learning Track…

Information Retrieval · Computer Science 2022-04-18 David Rau, Jaap Kamps

Neural Rankers for Effective Screening Prioritisation in Medical Systematic Review Literature Search

Medical systematic reviews typically require assessing all the documents retrieved by a search. The reason is two-fold: the task aims for "total recall"; and documents retrieved using Boolean search are an unordered set, and thus it is…

Information Retrieval · Computer Science 2022-12-20 Shuai Wang, Harrisen Scells, Bevan Koopman, Guido Zuccon