Related papers: Efficiency Optimizations for Superblock-based Spar…

Dynamic Superblock Pruning for Fast Learned Sparse Retrieval

This paper proposes superblock pruning (SP) during top-k online document retrieval for learned sparse representations. SP structures the sparse index as a set of superblocks on a sequence of document blocks and conducts a superblock-level…

Information Retrieval · Computer Science 2026-02-04 Parker Carlson, Wentai Xie, Shanxiu He, Tao Yang

A Unified Framework for Learned Sparse Retrieval

Learned sparse retrieval (LSR) is a family of first-stage retrieval methods that are trained to generate sparse lexical representations of queries and documents for use with an inverted index. Many LSR methods have been recently introduced,…

Information Retrieval · Computer Science 2023-03-28 Thong Nguyen, Sean MacAvaney, Andrew Yates

Effective Inference-Free Retrieval for Learned Sparse Representations

Learned Sparse Retrieval (LSR) is an effective IR approach that exploits pre-trained language models for encoding text into a learned bag of words. Several efforts in the literature have shown that sparsity is key to enabling a good…

Information Retrieval · Computer Science 2025-05-06 Franco Maria Nardini, Thong Nguyen, Cosimo Rulli, Rossano Venturini, Andrew Yates

CSPLADE: Learned Sparse Retrieval with Causal Language Models

In recent years, dense retrieval has been the focus of information retrieval (IR) research. While effective, dense retrieval produces uninterpretable dense vectors, and suffers from the drawback of large index size. Learned sparse retrieval…

Information Retrieval · Computer Science 2025-11-10 Zhichao Xu, Aosong Feng, Yijun Tian, Haibo Ding, Lin Lee Cheong

On the Challenges and Opportunities of Learned Sparse Retrieval for Code

Retrieval over large codebases is a key component of modern LLM-based software engineering systems. Existing approaches predominantly rely on dense embedding models, while learned sparse retrieval (LSR) remains largely unexplored for code.…

Information Retrieval · Computer Science 2026-03-24 Simon Lupart, Maxime Louis, Thibault Formal, Hervé Déjean, Stéphane Clinchant

Multimodal Learned Sparse Retrieval with Probabilistic Expansion Control

Learned sparse retrieval (LSR) is a family of neural methods that encode queries and documents into sparse lexical vectors that can be indexed and retrieved efficiently with an inverted index. We explore the application of LSR to the…

Information Retrieval · Computer Science 2024-02-28 Thong Nguyen, Mariya Hendriksen, Andrew Yates, Maarten de Rijke

Faster Learned Sparse Retrieval with Block-Max Pruning

Learned sparse retrieval systems aim to combine the effectiveness of contextualized language models with the scalability of conventional data structures such as inverted indexes. Nevertheless, the indexes generated by these systems exhibit…

Information Retrieval · Computer Science 2024-05-03 Antonio Mallia, Torsten Suel, Nicola Tonellotto

Approximate Cluster-Based Sparse Document Retrieval with Segmented Maximum Term Weights

This paper revisits cluster-based retrieval that partitions the inverted index into multiple groups and skips the index partially at cluster and document levels during online inference using a learned sparse representation. It proposes an…

Information Retrieval · Computer Science 2024-04-16 Yifan Qiao, Shanxiu He, Yingrui Yang, Parker Carlson, Tao Yang

The Role of Vocabularies in Learning Sparse Representations for Ranking

Learned Sparse Retrieval (LSR) such as SPLADE has growing interest for effective semantic 1st stage matching while enjoying the efficiency of inverted indices. A recent work on learning SPLADE models with expanded vocabularies (ESPLADE) was…

Information Retrieval · Computer Science 2026-04-21 Hiun Kim, Tae Kwan Lee, Taeryun Won

A Static Pruning Study on Sparse Neural Retrievers

Sparse neural retrievers, such as DeepImpact, uniCOIL and SPLADE, have been introduced recently as an efficient and effective way to perform retrieval with inverted indexes. They aim to learn term importance and, in some cases, document…

Information Retrieval · Computer Science 2023-04-26 Carlos Lassance, Simon Lupart, Hervé Déjean, Stéphane Clinchant, Nicola Tonellotto

No More K-means: Single-Stage Sparse Coding for Efficient Multi-Vector Retrieval

Multi-vector retrieval (MVR) models, exemplified by ColBERT, have established new benchmarks in retrieval accuracy by preserving fine-grained token-level interactions. However, this granularity imposes prohibitive storage and retrieval…

Information Retrieval · Computer Science 2026-05-29 Lixuan Guo, Yifei Wang, Tiansheng Wen, Aosong Feng, Stefanie Jegelka, Chenyu You

Selection of Supervised Learning-based Sparse Matrix Reordering Algorithms

Sparse matrix ordering is a vital optimization technique often employed for solving large-scale sparse matrices. Its goal is to minimize the matrix bandwidth by reorganizing its rows and columns, thus enhancing efficiency. Conventional…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-11-14 Tao Tang, Youfu Jiang, Yingbo Cui, Jianbin Fang, Peng Zhang, Lin Peng, Chun Huang

Improved Learned Sparse Retrieval with Corpus-Specific Vocabularies

We explore leveraging corpus-specific vocabularies that improve both efficiency and effectiveness of learned sparse retrieval systems. We find that pre-training the underlying BERT model on the target corpus, specifically targeting…

Information Retrieval · Computer Science 2024-01-15 Puxuan Yu, Antonio Mallia, Matthias Petri

A Block Decomposition Algorithm for Sparse Optimization

Sparse optimization is a central problem in machine learning and computer vision. However, this problem is inherently NP-hard and thus difficult to solve in general. Combinatorial search methods find the global optimal solution but are…

Optimization and Control · Mathematics 2020-06-30 Ganzhao Yuan, Li Shen, Wei-Shi Zheng

Sparse Linear Regression With Missing Data

This paper proposes a fast and accurate method for sparse regression in the presence of missing data. The underlying statistical model encapsulates the low-dimensional structure of the incomplete data matrix and the sparsity of the…

Machine Learning · Statistics 2015-03-31 Ravi Ganti, Rebecca M. Willett

Structured Sparsity Learning for Efficient Video Super-Resolution

The high computational costs of video super-resolution (VSR) models hinder their deployment on resource-limited devices, (e.g., smartphones and drones). Existing VSR models contain considerable redundant filters, which drag down the…

Computer Vision and Pattern Recognition · Computer Science 2023-03-28 Bin Xia, Jingwen He, Yulun Zhang, Yitong Wang, Yapeng Tian, Wenming Yang, Luc Van Gool

A Low-Power Sparse Deep Learning Accelerator with Optimized Data Reuse

Sparse deep learning has reduced computation significantly, but its irregular non-zero data distribution complicates the data flow and hinders data reuse, increasing on-chip SRAM access and thus power consumption of the chip. This paper…

Hardware Architecture · Computer Science 2025-03-26 Kai-Chieh Hsu, Tian-Sheuan Chang

SPLADE: Sparse Lexical and Expansion Model for First Stage Ranking

In neural Information Retrieval, ongoing research is directed towards improving the first retriever in ranking pipelines. Learning dense embeddings to conduct retrieval using efficient approximate nearest neighbors methods has proven to…

Information Retrieval · Computer Science 2021-07-14 Thibault Formal, Benjamin Piwowarski, Stéphane Clinchant

LSTM-based Selective Dense Text Retrieval Guided by Sparse Lexical Retrieval

This paper studies fast fusion of dense retrieval and sparse lexical retrieval, and proposes a cluster-based selective dense retrieval method called CluSD guided by sparse lexical retrieval. CluSD takes a lightweight cluster-based approach…

Information Retrieval · Computer Science 2025-02-18 Yingrui Yang, Parker Carlson, Yifan Qiao, Wentai Xie, Shanxiu He, Tao Yang

Predicting Efficiency/Effectiveness Trade-offs for Dense vs. Sparse Retrieval Strategy Selection

Over the last few years, contextualized pre-trained transformer models such as BERT have provided substantial improvements on information retrieval tasks. Recent approaches based on pre-trained transformer models such as BERT, fine-tune…

Information Retrieval · Computer Science 2021-09-23 Negar Arabzadeh, Xinyi Yan, Charles L. A. Clarke