Related papers: Binary Embedding-based Retrieval at Tencent

pEBR: A Probabilistic Approach to Embedding Based Retrieval

Embedding-based retrieval aims to learn a shared semantic representation space for both queries and items, enabling efficient and effective item retrieval through approximate nearest neighbor (ANN) algorithms. In current industrial…

Information Retrieval · Computer Science 2025-10-14 Han Zhang, Yunjiang Jiang, Mingming Li, Haowei Yuan, Yiming Qiu, Wen-Yun Yang

Divide and Conquer: Towards Better Embedding-based Retrieval for Recommender Systems From a Multi-task Perspective

Embedding-based retrieval (EBR) methods are widely used in modern recommender systems thanks to their simplicity and effectiveness. However, along the journey of deploying and iterating on EBR in production, we still identify some fundamental…

Information Retrieval · Computer Science 2023-02-07 Yuan Zhang, Xue Dong, Weijie Ding, Biao Li, Peng Jiang, Kun Gai

Recurrent Binary Embedding for GPU-Enabled Exhaustive Retrieval from Billion-Scale Semantic Vectors

Rapid advances in GPU hardware and multiple areas of Deep Learning open up a new opportunity for billion-scale information retrieval with exhaustive search. Building on top of the powerful concept of semantic learning, this paper proposes a…

Information Retrieval · Computer Science 2018-02-20 Ying Shan, Jian Jiao, Jie Zhu, JC Mao

CPS-MEBR: Click Feedback-Aware Web Page Summarization for Multi-Embedding-Based Retrieval

Embedding-based retrieval (EBR) is a technique to use embeddings to represent query and document, and then convert the retrieval problem into a nearest neighbor search problem in the embedding space. Some previous works have mainly focused…

Information Retrieval · Computer Science 2023-05-09 Wenbiao Li, Pan Tang, Zhengfan Wu, Weixue Lu, Minghua Zhang, Zhenlei Tian, Daiting Shi, Yu Sun, Simiu Gu, Dawei Yin

Embedding-based Retrieval in Multimodal Content Moderation

Video understanding plays a fundamental role for content moderation on short video platforms, enabling the detection of inappropriate content. While classification remains the dominant approach for content moderation, it often struggles in…

Information Retrieval · Computer Science 2025-07-03 Hanzhong Liang, Jinghao Shi, Xiang Shen, Zixuan Wang, Vera Wen, Ardalan Mehrani, Zhiqian Chen, Yifan Wu, Zhixin Zhang

Progressively Optimized Bi-Granular Document Representation for Scalable Embedding Based Retrieval

Ad-hoc search calls for the selection of appropriate answers from a massive-scale corpus. Nowadays, the embedding-based retrieval (EBR) becomes a promising solution, where deep learning based document representation and ANN search…

Information Retrieval · Computer Science 2022-03-03 Shitao Xiao, Zheng Liu, Weihao Han, Jianjin Zhang, Yingxia Shao, Defu Lian, Chaozhuo Li, Hao Sun, Denvy Deng, Liangjie Zhang, Qi Zhang, Xing Xie

Efficient Document Retrieval by End-to-End Refining and Quantizing BERT Embedding with Contrastive Product Quantization

Efficient document retrieval heavily relies on the technique of semantic hashing, which learns a binary code for every document and employs Hamming distance to evaluate document distances. However, existing semantic hashing methods are…

Information Retrieval · Computer Science 2022-11-01 Zexuan Qiu, Qinliang Su, Jianxing Yu, Shijing Si

Uni-Retriever: Towards Learning The Unified Embedding Based Retriever in Bing Sponsored Search

Embedding based retrieval (EBR) is a fundamental building block in many web applications. However, EBR in sponsored search is distinguished from other generic scenarios and technically challenging due to the need of serving multiple…

Information Retrieval · Computer Science 2022-02-15 Jianjin Zhang, Zheng Liu, Weihao Han, Shitao Xiao, Ruicheng Zheng, Yingxia Shao, Hao Sun, Hanqing Zhu, Premkumar Srinivasan, Denvy Deng, Qi Zhang, Xing Xie

Learning Multi-Stage Multi-Grained Semantic Embeddings for E-Commerce Search

Retrieving relevant items that match users' queries from billion-scale corpus forms the core of industrial e-commerce search systems, in which embedding-based retrieval (EBR) methods are prevailing. These methods adopt a two-tower framework…

Information Retrieval · Computer Science 2023-03-21 Binbin Wang, Mingming Li, Zhixiong Zeng, Jingwei Zhuo, Songlin Wang, Sulong Xu, Bo Long, Weipeng Yan

Search Efficient Binary Network Embedding

Traditional network embedding primarily focuses on learning a continuous vector representation for each node, preserving network structure and/or node content information, such that off-the-shelf machine learning algorithms can be easily…

Social and Information Networks · Computer Science 2023-01-02 Daokun Zhang, Jie Yin, Xingquan Zhu, Chengqi Zhang

Embedding-based Retrieval in Facebook Search

Search in social networks such as Facebook poses different challenges than in classical web search: besides the query text, it is important to take into account the searcher's context to provide relevant results. Their social graph is an…

Information Retrieval · Computer Science 2020-07-31 Jui-Ting Huang, Ashish Sharma, Shuying Sun, Li Xia, David Zhang, Philip Pronin, Janani Padmanabhan, Giuseppe Ottaviano, Linjun Yang

An Efficient Embedding Based Ad Retrieval with GPU-Powered Feature Interaction

In large-scale advertising recommendation systems, retrieval serves as a critical component, aiming to efficiently select a subset of candidate ads relevant to user behaviors from a massive ad inventory for subsequent ranking and…

Machine Learning · Computer Science 2025-12-29 Yifan Lei, Jiahua Luo, Tingyu Jiang, Bo Zhang, Lifeng Wang, Dapeng Liu, Zhaoren Wu, Haijie Gu, Huan Yu, Jie Jiang

Binary Code based Hash Embedding for Web-scale Applications

Nowadays, deep learning models are widely adopted in web-scale applications such as recommender systems and online advertising. In these applications, embedding learning of categorical features is crucial to the success of deep learning…

Information Retrieval · Computer Science 2021-09-07 Bencheng Yan, Pengjie Wang, Jinquan Liu, Wei Lin, Kuang-Chih Lee, Jian Xu, Bo Zheng

Event-enhanced Retrieval in Real-time Search

The embedding-based retrieval (EBR) approach is widely used in mainstream search engine retrieval systems and is crucial in recent retrieval-augmented methods for eliminating LLM illusions. However, existing EBR models often face the…

Computation and Language · Computer Science 2024-04-10 Yanan Zhang, Xiaoling Bai, Tianhua Zhou

Enhancing Relevance of Embedding-based Retrieval at Walmart

Embedding-based neural retrieval (EBR) is an effective search retrieval method in product search for tackling the vocabulary gap between customer search queries and products. The initial launch of our EBR system at Walmart yielded…

Information Retrieval · Computer Science 2024-08-16 Juexin Lin, Sachin Yadav, Feng Liu, Nicholas Rossi, Praveen R. Suram, Satya Chembolu, Prijith Chandran, Hrushikesh Mohapatra, Tony Lee, Alessandro Magnani, Ciya Liao

Composite Re-Ranking for Efficient Document Search with BERT

Although considerable efforts have been devoted to transformer-based ranking models for document search, the relevance-efficiency tradeoff remains a critical problem for ad-hoc ranking. To overcome this challenge, this paper presents BECR…

Information Retrieval · Computer Science 2022-01-07 Yingrui Yang, Yifan Qiao, Jinjin Shao, Mayuresh Anand, Xifeng Yan, Tao Yang

Hierarchical Structured Neural Network: Efficient Retrieval Scaling for Large Scale Recommendation

Retrieval, the initial stage of a recommendation system, is tasked with down-selecting items from a pool of tens of millions of candidates to a few thousands. Embedding Based Retrieval (EBR) has been a typical choice for this problem,…

Information Retrieval · Computer Science 2025-01-10 Kaushik Rangadurai, Siyang Yuan, Minhui Huang, Yiqun Liu, Golnaz Ghasemiesfeh, Yunchen Pu, Haiyu Lu, Xingfeng He, Fangzhou Xu, Andrew Cui, Vidhoon Viswanathan, Lin Yang, Liang Wang, Jiyan Yang, Chonglin Sun

Near-lossless Binarization of Word Embeddings

Word embeddings are commonly used as a starting point in many NLP models to achieve state-of-the-art performances. However, with a large vocabulary and many dimensions, these floating-point representations are expensive both in terms of…

Computation and Language · Computer Science 2020-01-23 Julien Tissier, Christophe Gravier, Amaury Habrard

Memory and Computation-Efficient Kernel SVM via Binary Embedding and Ternary Model Coefficients

Kernel approximation is widely used to scale up kernel SVM training and prediction. However, the memory and computation costs of kernel approximation models are still too high if we want to deploy them on memory-limited devices such as…

Machine Learning · Computer Science 2020-10-07 Zijian Lei, Liang Lan

E2Rank: Your Text Embedding can Also be an Effective and Efficient Listwise Reranker

Text embedding models serve as a fundamental component in real-world search applications. By mapping queries and documents into a shared embedding space, they deliver competitive retrieval performance with high efficiency. However, their…

Computation and Language · Computer Science 2025-11-03 Qi Liu, Yanzhao Zhang, Mingxin Li, Dingkun Long, Pengjun Xie, Jiaxin Mao