Related papers: Progressively Optimized Bi-Granular Document Repre…

200 papers

Large-scale embedding-based retrieval (EBR) is the cornerstone of search-related industrial applications. Given a user query, the system of EBR aims to identify relevant information from a large corpus of documents that may be tens or…

Information Retrieval · Computer Science 2023-02-20 Yukang Gan, Yixiao Ge, Chang Zhou, Shupeng Su, Zhouchuan Xu, Xuyuan Xu, Quanchao Hui, Xiang Chen, Yexin Wang, Ying Shan

Embedding-based retrieval aims to learn a shared semantic representation space for both queries and items, enabling efficient and effective item retrieval through approximate nearest neighbor (ANN) algorithms. In current industrial…

Information Retrieval · Computer Science 2025-10-14 Han Zhang, Yunjiang Jiang, Mingming Li, Haowei Yuan, Yiming Qiu, Wen-Yun Yang

Learned sparse representations form an attractive class of contextual embeddings for text retrieval. That is so because they are effective models of relevance and are interpretable by design. Despite their apparent compatibility with…

Information Retrieval · Computer Science 2024-07-15 Sebastian Bruch, Franco Maria Nardini, Cosimo Rulli, Rossano Venturini

Dense retrieval, which describes the use of contextualised language models such as BERT to identify documents from a collection by leveraging approximate nearest neighbour (ANN) techniques, has been increasing in popularity. Two families of…

Information Retrieval · Computer Science 2021-08-27 Craig Macdonald, Nicola Tonellotto

Embedding-based retrieval (EBR) methods are widely used in modern recommender systems thanks to their simplicity and effectiveness. However, along the journey of deploying and iterating on EBR in production, we still identify some fundamental…

Information Retrieval · Computer Science 2023-02-07 Yuan Zhang, Xue Dong, Weijie Ding, Biao Li, Peng Jiang, Kun Gai

Retrieval, the initial stage of a recommendation system, is tasked with down-selecting items from a pool of tens of millions of candidates to a few thousands. Embedding Based Retrieval (EBR) has been a typical choice for this problem,…

Generative Retrieval (GR), autoregressively decoding relevant document identifiers given a query, has been shown to perform well under the setting of small-scale corpora. By memorizing the document corpus with model parameters, GR…

Information Retrieval · Computer Science 2024-01-22 Peiwen Yuan, Xinglin Wang, Shaoxiong Feng, Boyuan Pan, Yiwei Li, Heda Wang, Xupeng Miao, Kan Li

Industry-scale recommender systems face a core challenge: representing entities with high cardinality, such as users or items, using dense embeddings that must be accessible during both training and inference. However, as embedding sizes…

Information Retrieval · Computer Science 2025-05-19 Petr Kasalický, Martin Spišák, Vojtěch Vančura, Daniel Bohuněk, Rodrigo Alves, Pavel Kordík

Embedding based retrieval (EBR) is a fundamental building block in many web applications. However, EBR in sponsored search is distinguished from other generic scenarios and technically challenging due to the need of serving multiple…

Dense embedding models are commonly deployed in commercial search engines, wherein all the document vectors are pre-computed, and near-neighbor search (NNS) is performed with the query vector to find relevant documents. However, the…

Machine Learning · Computer Science 2020-09-01 Tharun Medini, Beidi Chen, Anshumali Shrivastava

ANNS for embedded vector representations of texts is commonly used in information retrieval, with two important information representations being sparse and dense vectors. While it has been shown that combining these representations…

Information Retrieval · Computer Science 2024-10-29 Haoyu Zhang, Jun Liu, Zhenhua Zhu, Shulin Zeng, Maojia Sheng, Tao Yang, Guohao Dai, Yu Wang

Dense retrieval conducts text retrieval in the embedding space and has shown many advantages compared to sparse retrieval. Existing dense retrievers optimize representations of queries and documents with contrastive training and map them to…

Information Retrieval · Computer Science 2021-07-19 Yizhi Li, Zhenghao Liu, Chenyan Xiong, Zhiyuan Liu

Recently, the retrieval models based on dense representations have been gradually applied in the first stage of the document retrieval tasks, showing better performance than traditional sparse vector space models. To obtain high efficiency,…

Information Retrieval · Computer Science 2021-08-20 Hongyin Tang, Xingwu Sun, Beihong Jin, Jingang Wang, Fuzheng Zhang, Wei Wu

Document retrieval enables users to find their required documents accurately and quickly. To satisfy the requirement of retrieval efficiency, prevalent deep neural methods adopt a representation-based matching paradigm, which saves online…

Information Retrieval · Computer Science 2022-07-12 Mengxue Du, Shasha Li, Jie Yu, Jun Ma, Bin Ji, Huijun Liu, Wuhang Lin, Zibo Yi

Recent advances in dense retrieval techniques have offered the promise of being able not just to re-rank documents using contextualised language models such as BERT, but also to use such models to identify documents from the collection in…

Information Retrieval · Computer Science 2021-08-25 Nicola Tonellotto, Craig Macdonald

Embedding models have become essential tools in both natural language processing and computer vision, enabling efficient semantic search, recommendation, clustering, and more. However, the high memory and computational demands of…

Computation and Language · Computer Science 2024-11-26 Jiayi Chen, Chen Wu, Shaoqun Zhang, Nan Li, Liangjie Zhang, Qi Zhang

Sparse document representations have been widely used to retrieve relevant documents via exact lexical matching. Owing to the pre-computed inverted index, it supports fast ad-hoc search but incurs the vocabulary mismatch problem. Although…

Information Retrieval · Computer Science 2023-10-06 Eunseong Choi, Sunkyung Lee, Minjin Choi, Hyeseon Ko, Young-In Song, Jongwuk Lee

Learning vectorized embeddings is fundamental to many recommender systems for user-item matching. To enable efficient online inference, representation binarization, which embeds latent features into compact binary sequences, has recently…

Information Retrieval · Computer Science 2025-06-04 Yankai Chen, Yue Que, Xinni Zhang, Chen Ma, Irwin King

Interpretability in black-box dense retrievers remains a central challenge in Retrieval-Augmented Generation (RAG). Understanding how queries and documents semantically interact is critical for diagnosing retrieval behavior and improving…

Information Retrieval · Computer Science 2026-01-29 Yash Saxena, Ankur Padia, Kalpa Gunaratna, Manas Gaur

Information retrieval involves selecting artifacts from a corpus that are most relevant to a given search query. The flavor of retrieval typically used in classical applications can be termed as homogeneous and relaxed, where queries and…

Information Retrieval · Computer Science 2023-10-10 Anirudh Khatry, Yasharth Bajpai, Priyanshu Gupta, Sumit Gulwani, Ashish Tiwari