Related papers: Align then Train: Efficient Retrieval Adapter Lear…

DREditor: An Time-efficient Approach for Building a Domain-specific Dense Retrieval Model

Deploying dense retrieval models efficiently is becoming increasingly important across various industries. This is especially true for enterprise search services, where customizing search engines to meet the time demands of different…

Information Retrieval · Computer Science · 2024-01-24 · Chen Huang, Duanyu Feng, Wenqiang Lei, Jiancheng Lv

Ensembles of Low-Rank Expert Adapters

The training and fine-tuning of large language models (LLMs) often involve diverse textual data from multiple sources, which poses challenges due to conflicting gradient directions, hindering optimization and specialization. These…

Computation and Language · Computer Science · 2025-02-04 · Yinghao Li, Vianne Gao, Chao Zhang, MohamadAli Torkamani

RE-AdaptIR: Improving Information Retrieval through Reverse Engineered Adaptation

Large language models (LLMs) fine-tuned for text-retrieval have demonstrated state-of-the-art results across several information retrieval (IR) benchmarks. However, supervised training for improving these models requires numerous labeled…

Information Retrieval · Computer Science · 2024-06-24 · William Fleshman, Benjamin Van Durme

Large Reasoning Embedding Models: Towards Next-Generation Dense Retrieval Paradigm

In modern e-commerce search systems, dense retrieval has become an indispensable component. By computing similarities between query and item (product) embeddings, it efficiently selects candidate products from large-scale repositories. With…

Information Retrieval · Computer Science · 2025-10-20 · Jianting Tang, Dongshuai Li, Tao Wen, Fuyu Lv, Dan Ou, Linli Xu

LoRACode: LoRA Adapters for Code Embeddings

Code embeddings are essential for semantic code search; however, current approaches often struggle to capture the precise syntactic and contextual nuances inherent in code. Open-source models such as CodeBERT and UniXcoder exhibit…

Machine Learning · Computer Science · 2025-06-03 · Saumya Chaturvedi, Aman Chadha, Laurent Bindschaedler

DERA: Dense Entity Retrieval for Entity Alignment in Knowledge Graphs

Entity Alignment (EA) aims to match equivalent entities in different Knowledge Graphs (KGs), which is essential for knowledge fusion and integration. Recently, embedding-based EA has attracted significant attention and many approaches have…

Computation and Language · Computer Science · 2024-08-05 · Zhichun Wang, Xuan Chen

Learning To Retrieve: How to Train a Dense Retrieval Model Effectively and Efficiently

Ranking has always been one of the top concerns in information retrieval research. For decades, lexical matching signal has dominated the ad-hoc retrieval process, but it also has inherent defects, such as the vocabulary mismatch problem.…

Information Retrieval · Computer Science · 2020-10-21 · Jingtao Zhan, Jiaxin Mao, Yiqun Liu, Min Zhang, Shaoping Ma

ALER: An Active Learning Hybrid System for Efficient Entity Resolution

Entity Resolution (ER) is a critical task for data integration, yet state-of-the-art supervised deep learning models remain impractical for many real-world applications due to their need for massive, expensive-to-obtain labeled datasets.…

Databases · Computer Science · 2026-01-29 · Dimitrios Karapiperis, Leonidas Akritidis, Panayiotis Bozanis, Vassilios Verykios

Embedding-Based Context-Aware Reranker

Retrieval-Augmented Generation (RAG) systems rely on retrieving relevant evidence from a corpus to support downstream generation. The common practice of splitting a long document into multiple shorter passages enables finer-grained and…

Computation and Language · Computer Science · 2026-02-26 · Ye Yuan, Mohammad Amin Shabani, Siqi Liu

Efficiently Aligning Draft Models via Parameter- and Data-Efficient Adaptation

Speculative decoding accelerates LLM inference but suffers from performance degradation when target models are fine-tuned for specific domains. A naive solution is to retrain draft models for every target model, which is costly and…

Machine Learning · Computer Science · 2026-03-11 · Luxi Lin, Zhihang Lin, Zhanpeng Zeng, Yuhao Chen, Qingyu Zhang, Jixiang Luo, Xuelong Li, Rongrong Ji

ERA: Expert Retrieval and Assembly for Early Action Prediction

Early action prediction aims to successfully predict the class label of an action before it is completely performed. This is a challenging task because the beginning stages of different actions can be very similar, with only minor subtle…

Computer Vision and Pattern Recognition · Computer Science · 2022-07-25 · Lin Geng Foo, Tianjiao Li, Hossein Rahmani, Qiuhong Ke, Jun Liu

ELITE: Embedding-Less retrieval with Iterative Text Exploration

Large Language Models (LLMs) have achieved impressive progress in natural language processing, but their limited ability to retain long-term context constrains performance on document-level or multi-turn tasks. Retrieval-Augmented…

Computation and Language · Computer Science · 2025-05-20 · Zhangyu Wang, Siyuan Gao, Rong Zhou, Hao Wang, Li Ning

CLEAR: Cross-Lingual Enhancement in Alignment via Reverse-training

Existing multilingual embedding models often encounter challenges in cross-lingual scenarios due to imbalanced linguistic resources and less consideration of cross-lingual alignment during training. Although standardized contrastive…

Computation and Language · Computer Science · 2026-04-15 · Seungyoon Lee, Minhyuk Kim, Seongtae Hong, Youngjoon Jang, Dongsuk Oh, Heuiseok Lim

More Than Efficiency: Embedding Compression Improves Domain Adaptation in Dense Retrieval

Dense retrievers powered by pretrained embeddings are widely used for document retrieval but struggle in specialized domains due to the mismatches between the training and target domain distributions. Domain adaptation typically requires…

Information Retrieval · Computer Science · 2026-01-21 · Chunsheng Zuo, Daniel Khashabi

PEFA: Parameter-Free Adapters for Large-scale Embedding-based Retrieval Models

Embedding-based Retrieval Models (ERMs) have emerged as a promising framework for large-scale text retrieval problems due to powerful large language models. Nevertheless, fine-tuning ERMs to reach state-of-the-art results can be expensive…

Information Retrieval · Computer Science · 2023-12-07 · Wei-Cheng Chang, Jyun-Yu Jiang, Jiong Zhang, Mutasem Al-Darabsah, Choon Hui Teo, Cho-Jui Hsieh, Hsiang-Fu Yu, S. V. N. Vishwanathan

MAIR: A Massive Benchmark for Evaluating Instructed Retrieval

Recent information retrieval (IR) models are pre-trained and instruction-tuned on massive datasets and tasks, enabling them to perform well on a wide range of tasks and potentially generalize to unseen tasks with instructions. However,…

Information Retrieval · Computer Science · 2024-10-15 · Weiwei Sun, Zhengliang Shi, Jiulong Wu, Lingyong Yan, Xinyu Ma, Yiding Liu, Min Cao, Dawei Yin, Zhaochun Ren

Task-Aware LoRA Adapter Composition via Similarity Retrieval in Vector Databases

Parameter efficient fine tuning methods like LoRA have enabled task specific adaptation of large language models, but efficiently composing multiple specialized adapters for unseen tasks remains challenging. We present a novel framework for…

Computation and Language · Computer Science · 2026-02-26 · Riya Adsul, Balachandra Devarangadi Sunil, Isha Nalawade, Sudharshan Govindan

Multi-hop Reasoning via Early Knowledge Alignment

Retrieval-Augmented Generation (RAG) has emerged as a powerful paradigm for Large Language Models (LLMs) to address knowledge-intensive queries requiring domain-specific or up-to-date information. To handle complex multi-hop questions that…

Computation and Language · Computer Science · 2026-01-05 · Yuxin Wang, Shicheng Fang, Bo Wang, Qi Luo, Xuanjing Huang, Yining Zheng, Xipeng Qiu

EnterpriseEM: Fine-tuned Embeddings for Enterprise Semantic Search

Enterprises grapple with the significant challenge of managing proprietary unstructured data, hindering efficient information retrieval. This has led to the emergence of AI-driven information retrieval solutions, designed to adeptly extract…

Information Retrieval · Computer Science · 2025-12-08 · Kamalkumar Rathinasamy, Jayarama Nettar, Amit Kumar, Vishal Manchanda, Arun Vijayakumar, Ayush Kataria, Venkateshprasanna Manjunath, Chidambaram GS, Jaskirat Singh Sodhi, Shoeb Shaikh, Wasim Akhtar Khan, Prashant Singh, Tanishq Dattatray Ige, Vipin Tiwari, Rajab Ali Mondal, Harshini K, S Reka, Chetana Amancharla, Faiz ur Rahman, Harikrishnan P A, Indraneel Saha, Bhavya Tiwary, Navin Shankar Patel, Pradeep T S, Balaji A J, Priyapravas, Mohammed Rafee Tarafdar

LIDER: An Efficient High-dimensional Learned Index for Large-scale Dense Passage Retrieval

Many recent approaches of passage retrieval are using dense embeddings generated from deep neural models, called "dense passage retrieval". The state-of-the-art end-to-end dense passage retrieval systems normally deploy a deep neural model…

Information Retrieval · Computer Science · 2022-10-11 · Yifan Wang, Haodi Ma, Daisy Zhe Wang