Related papers: QVCache: A Query-Aware Vector Cache

200 papers

Vector search, the task of finding the k-nearest neighbors of a query vector against a database of high-dimensional vectors, underpins many machine learning applications, including retrieval-augmented generation, recommendation systems, and…

Modern deep learning models capture the semantics of complex data by transforming them into high-dimensional embedding vectors. Emerging applications, such as retrieval-augmented generation, use approximate nearest neighbor (ANN) search in…

Databases · Computer Science 2025-10-01 Guoyu Hu, Shaofeng Cai, Tien Tuan Anh Dinh, Zhongle Xie, Cong Yue, Gang Chen, Beng Chin Ooi

Graph-based high-dimensional vector indices have become a mainstream solution for large-scale approximate nearest neighbor search (ANNS). However, their substantial memory footprint often requires storage on secondary devices, where…

Databases · Computer Science 2025-08-22 Yijie Zhou, Shengyuan Lin, Shufeng Gong, Song Yu, Shuhao Fan, Yanfeng Zhang, Ge Yu

Large-scale approximate nearest neighbor (ANN) search has been gaining attention alongside recent machine learning research that employs it. If the data is too large to fit in memory, it is necessary to search for the most similar…

Machine Learning · Computer Science 2025-01-29 Taiga Ikeda, Daisuke Miyashita, Jun Deguchi

As the field of Large Language Models (LLMs) continues to evolve, the context length in inference is steadily growing. Key-Value Cache (KVCache), the intermediate representations of tokens within LLM inference, has now become the primary…

Computation and Language · Computer Science 2025-04-01 Hailin Zhang, Xiaodong Ji, Yilin Chen, Fangcheng Fu, Xupeng Miao, Xiaonan Nie, Weipeng Chen, Bin Cui

Vector search underpins modern AI applications by supporting approximate nearest neighbor (ANN) queries over high-dimensional embeddings in tasks like retrieval-augmented generation (RAG), recommendation systems, and multimodal search.…

Databases · Computer Science 2026-05-19 Shurui Zhong, Dingheng Mo, Siqiang Luo

Billion-scale high-dimensional approximate nearest neighbour (ANN) search has become an important problem for searching similar objects among the vast amount of images and videos available online. The existing ANN methods are usually…

Computer Vision and Pattern Recognition · Computer Science 2019-07-30 Wei Chen, Jincai Chen, Fuhao Zou, Yuan-Fang Li, Ping Lu, Qiang Wang, Wei Zhao

Vector similarity search presents significant challenges in terms of scalability for large and high-dimensional datasets, as well as in providing native support for hybrid queries. Serverless computing and cloud functions offer attractive…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-02-04 Joe Oakley, Hakan Ferhatosmanoglu

Semantic caches return cached responses for semantically similar prompts to reduce LLM inference latency and cost. They embed cached prompts and store them alongside their response in a vector database. Embedding similarity metrics assign a…

Vector search (VS) has become a fundamental component in multimodal data management, enabling core functionalities such as image, video, and code retrieval. As vector data scales rapidly, VS faces growing challenges in balancing search,…

Databases · Computer Science 2026-01-06 Yitong Song, Xuanhe Zhou, Christian S. Jensen, Jianliang Xu

Vector search and database systems have become a keystone component in many AI applications. While much prior research has investigated how to accelerate generic vector search, emerging AI applications require running…

Databases · Computer Science 2025-06-03 Jingyi Xi, Chenghao Mo, Benjamin Karsin, Artem Chirkin, Mingqin Li, Minjia Zhang

Vector search underpins modern information-retrieval systems, including retrieval-augmented generation (RAG) pipelines and search engines over unstructured text and images. As datasets scale to billions of vectors, disk-based vector search…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-04-07 Nam Anh Dang, Ben Landrum, Ken Birman

Vector approximate nearest neighbor search (ANNS) underpins search engines, recommendation systems, and advertising services. Recent advances in ANNS indexes make CPU a cost-effective choice for serving million-scale, in-memory vector…

Information Retrieval · Computer Science 2026-05-12 Yuchen Huang, Baiteng Ma, Yiping Sun, Yang Shi, Xiao Chen, Xiaocheng Zhong, Zhiyong Wang, Yao Hu, Chuliang Weng

Embedding-based vector search underpins many important applications, such as recommendation and retrieval-augmented generation (RAG). It relies on vector indices to enable efficient search. However, these indices require storing…

Approximate nearest neighbor search (ANNS) at billion scale is fundamentally an out-of-core problem: vectors and indexes live on SSD, so performance is dominated by I/O rather than compute. Under skewed semantic embeddings, existing…

Embedding models capture both semantic and syntactic structures of queries, often mapping different queries to similar regions in vector space. This results in non-uniform cluster access patterns in modern disk-based vector databases. While…

Databases · Computer Science 2025-09-24 Yeonwoo Jeong, Hyunji Cho, Kyuri Park, Youngjae Kim, Sungyong Park

Graph-based approximate nearest neighbor search (ANNS) methods (e.g., HNSW) have become the de facto state of the art for their high precision and low latency. To scale beyond main memory, recent out-of-memory ANNS systems leverage SSDs to…

Databases · Computer Science 2026-02-27 Weichen Zhao, Yuncheng Lu, Yao Tian, Hao Zhang, Jiehui Li, Minghao Zhao, Yakun Li, Weining Qian

Approximate Nearest Neighbor Search (ANNS) is now widely used in various applications, ranging from information retrieval, question answering, and recommendation, to search for similar high-dimensional vectors. As the amount of vector data…

Information Retrieval · Computer Science 2024-10-21 Yuming Xu, Hengyu Liang, Jin Li, Shuotao Xu, Qi Chen, Qianxi Zhang, Cheng Li, Ziyue Yang, Fan Yang, Yuqing Yang, Peng Cheng, Mao Yang

This paper presents VLCache, a cache reuse framework that exploits both Key-Value (KV) cache and encoder cache from prior multimodal inputs to eliminate costly recomputation when the same multimodal inputs recur. Unlike previous heuristic…

Computer Vision and Pattern Recognition · Computer Science 2025-12-19 Shengling Qin, Hao Yu, Chenxin Wu, Zheng Li, Yizhong Cao, Zhengyang Zhuge, Yuxin Zhou, Wentao Yao, Yi Zhang, Zhengheng Wang, Shuai Bai, Jianwei Zhang, Junyang Lin

Nearest neighbour search over dense vector collections has important applications in information retrieval, retrieval augmented generation (RAG), and content ranking. Performing efficient search over large vector collections is a well…
