Related papers: The Curse of Dense Low-Dimensional Information Ret…

200 papers

Nearly all implementations of top-$k$ retrieval with dense vector representations today take advantage of hierarchical navigable small-world network (HNSW) indexes. However, the generation of vector representations and efficiently searching…

Information Retrieval · Computer Science 2023-12-05 Jimmy Lin, Tommaso Teofili

Dense retrieval models are commonly used in Information Retrieval (IR) applications, such as Retrieval-Augmented Generation (RAG). Since they often serve as the first step in these systems, their robustness is critical to avoid downstream…

Computation and Language · Computer Science 2025-06-04 Mohsen Fayyaz, Ali Modarressi, Hinrich Schuetze, Nanyun Peng

In recent years, dense retrieval has been the focus of information retrieval (IR) research. While effective, dense retrieval produces uninterpretable dense vectors, and suffers from the drawback of large index size. Learned sparse retrieval…

Information Retrieval · Computer Science 2025-11-10 Zhichao Xu, Aosong Feng, Yijun Tian, Haibo Ding, Lin Lee Cheong

The semantic matching capabilities of neural information retrieval can ameliorate synonymy and polysemy problems of symbolic approaches. However, neural models' dense representations are more suitable for re-ranking, due to their…

Computation and Language · Computer Science 2021-10-18 Kyoung-Rok Jang, Junmo Kang, Giwon Hong, Sung-Hyon Myaeng, Joohee Park, Taewon Yoon, Heecheol Seo

Industry-scale recommender systems face a core challenge: representing entities with high cardinality, such as users or items, using dense embeddings that must be accessible during both training and inference. However, as embedding sizes…

Information Retrieval · Computer Science 2025-05-19 Petr Kasalický, Martin Spišák, Vojtěch Vančura, Daniel Bohuněk, Rodrigo Alves, Pavel Kordík

Open-domain question answering relies on efficient passage retrieval to select candidate contexts, where traditional sparse vector space models, such as TF-IDF or BM25, are the de facto method. In this work, we show that retrieval can be…

Computation and Language · Computer Science 2020-10-02 Vladimir Karpukhin, Barlas Oğuz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, Wen-tau Yih

Despite the advantages of their low-resource settings, traditional sparse retrievers depend on exact matching approaches between high-dimensional bag-of-words (BoW) representations of both the queries and the collection. As a result,…

Information Retrieval · Computer Science 2024-04-16 Dahlia Shehata

Learned sparse and dense representations capture different successful approaches to text retrieval and the fusion of their results has proven to be more effective and robust. Prior work combines dense and sparse retrievers by fusing their…

Information Retrieval · Computer Science 2021-12-10 Sheng-Chieh Lin, Jimmy Lin

Multimodal representations that enable cross-modal retrieval are widely used. However, these often lack interpretability, making it difficult to explain the retrieved results. Solutions such as learning sparse disentangled representations…

Information Retrieval · Computer Science 2025-06-25 Prachi J, Sumit Bhatia, Srikanta Bedathur

High-dimensional dense embeddings have become central to modern Information Retrieval, but many dimensions are noisy or redundant. The recently proposed DIME (Dimension IMportance Estimation) provides query-dependent scores to identify…

Information Retrieval · Computer Science 2026-04-13 Giulio D'Erasmo, Cesare Campagnano, Antonio Mallia, Pierpaolo Brutti, Nicola Tonellotto, Fabrizio Silvestri

Recently, retrieval models based on dense representations have gradually been applied in the first stage of document retrieval tasks, showing better performance than traditional sparse vector space models. To obtain high efficiency,…

Information Retrieval · Computer Science 2021-08-20 Hongyin Tang, Xingwu Sun, Beihong Jin, Jingang Wang, Fuzheng Zhang, Wei Wu

Interpretability benefits the theoretical understanding of representations. Existing word embeddings are generally dense representations. Hence, the meaning of latent dimensions is difficult to interpret. This makes word embeddings like a…

Computation and Language · Computer Science 2023-06-27 Minxue Xia, Hao Zhu

Dense retrieval, which encodes queries and documents into a single dense vector, has become the dominant neural retrieval approach due to its simplicity and compatibility with fast approximate nearest neighbor algorithms. As the tasks dense…

Information Retrieval · Computer Science 2026-02-06 Julian Killingback, Mahta Rafiee, Madine Manas, Hamed Zamani

Neural information retrieval architectures based on transformers such as BERT are able to significantly improve system effectiveness over traditional sparse models such as BM25. Though highly effective, these neural approaches are very…

Information Retrieval · Computer Science 2022-04-26 Antonio Mallia, Joel Mackenzie, Torsten Suel, Nicola Tonellotto

Neural retrieval models (NRMs) have been shown to outperform their statistical counterparts owing to their ability to capture semantic meaning via dense document representations. These models, however, suffer from poor interpretability as…

Information Retrieval · Computer Science 2023-04-26 Michael Llordes, Debasis Ganguly, Sumit Bhatia, Chirag Agarwal

While dense retrieval models, which embed queries and documents into a shared low-dimensional space, have gained widespread popularity, they were shown to exhibit important theoretical limitations and considerably lag behind traditional…

Information Retrieval · Computer Science 2026-04-09 Adrian Bracher, Svitlana Vakulenko

This study investigates the position bias in information retrieval, where models tend to overemphasize content at the beginning of passages while neglecting semantically relevant information that appears later. To analyze the extent and…

Information Retrieval · Computer Science 2025-09-19 Ziyang Zeng, Dun Zhang, Jiacheng Li, Panxiang Zou, Yudong Zhou, Yuqing Yang

Dual encoders perform retrieval by encoding documents and queries into dense low-dimensional vectors, scoring each document by its inner product with the query. We investigate the capacity of this architecture relative to sparse bag-of-words…

Computation and Language · Computer Science 2021-02-18 Yi Luan, Jacob Eisenstein, Kristina Toutanova, Michael Collins

Recent advances in Information Retrieval have leveraged high-dimensional embedding spaces to improve the retrieval of relevant documents. Moreover, the Manifold Clustering Hypothesis suggests that despite these high-dimensional…

Information Retrieval · Computer Science 2024-12-20 Giulio D'Erasmo, Giovanni Trappolini, Nicola Tonellotto, Fabrizio Silvestri

Ranking has always been one of the top concerns in information retrieval research. For decades, lexical matching signal has dominated the ad-hoc retrieval process, but it also has inherent defects, such as the vocabulary mismatch problem.…

Information Retrieval · Computer Science 2020-10-21 Jingtao Zhan, Jiaxin Mao, Yiqun Liu, Min Zhang, Shaoping Ma