Related papers: Manu: A Cloud Native Vector Database Management Sy…

Survey of Vector Database Management Systems

There are now over 20 commercial vector database management systems (VDBMSs), all produced within the past five years. But embedding-based retrieval has been studied for over ten years, and similarity search a staggering half century and…

Databases · Computer Science 2023-10-24 James Jie Pan, Jianguo Wang, Guoliang Li

Vector database management systems: Fundamental concepts, use-cases, and current challenges

Vector database management systems have emerged as an important component in modern data management, driven by the growing need to computationally describe rich data such as text, images, and video in various domains such…

Databases · Computer Science 2025-01-15 Toni Taipalus

Towards Reliable Vector Database Management Systems: A Software Testing Roadmap for 2030

The rapid growth of Large Language Models (LLMs) and AI-driven applications has propelled Vector Database Management Systems (VDBMSs) into the spotlight as a critical infrastructure component. VDBMSs specialize in storing, indexing, and…

Software Engineering · Computer Science 2025-03-03 Shenao Wang, Yanjie Zhao, Yinglin Xie, Zhao Liu, Xinyi Hou, Quanchen Zou, Haoyu Wang

A Comprehensive Survey on Vector Database: Storage and Retrieval Technique, Challenge

As high-dimensional vector data increasingly surpasses the processing capabilities of traditional database management systems, Vector Databases (VDBs) have emerged and become tightly integrated with large language models, being widely…

Databases · Computer Science 2026-03-27 Le Ma, Ran Zhang, Yikun Han, Shirui Yu, Zaitian Wang, Zhiyuan Ning, Jinghan Zhang, Ping Xu, Pengjiang Li, Ziyue Qiao, Wei Ju, Chong Chen, Dongjie Wang, Kunpeng Liu, Pengyang Wang, Pengfei Wang, Yanjie Fu, Chunjiang Liu, Yuanchun Zhou, Chang-Tien Lu

HAKES: Scalable Vector Database for Embedding Search Service

Modern deep learning models capture the semantics of complex data by transforming them into high-dimensional embedding vectors. Emerging applications, such as retrieval-augmented generation, use approximate nearest neighbor (ANN) search in…

Databases · Computer Science 2025-10-01 Guoyu Hu, Shaofeng Cai, Tien Tuan Anh Dinh, Zhongle Xie, Cong Yue, Gang Chen, Beng Chin Ooi

Toward Understanding Bugs in Vector Database Management Systems

Vector database management systems (VDBMSs) play a crucial role in facilitating semantic similarity searches over high-dimensional embeddings from diverse data sources. While VDBMSs are widely used in applications such as recommendation,…

Software Engineering · Computer Science 2025-06-04 Yinglin Xie, Xinyi Hou, Yanjie Zhao, Shenao Wang, Kai Chen, Haoyu Wang

Vector Search for the Future: From Memory-Resident, Static Heterogeneous Storage, to Cloud-Native Architectures

Vector search (VS) has become a fundamental component in multimodal data management, enabling core functionalities such as image, video, and code retrieval. As vector data scales rapidly, VS faces growing challenges in balancing search,…

Databases · Computer Science 2026-01-06 Yitong Song, Xuanhe Zhou, Christian S. Jensen, Jianliang Xu

Toward Efficient and Scalable Design of In-Memory Graph-Based Vector Search

Vector data is prevalent across business and scientific applications, and its popularity is growing with the proliferation of learned embeddings. Vector data collections often reach billions of vectors with thousands of dimensions, thus,…

Information Retrieval · Computer Science 2025-09-09 Ilias Azizi, Karima Echihab, Themis Palpanas, Vassilis Christophides

Building Scalable AI-Powered Applications with Cloud Databases: Architectures, Best Practices and Performance Considerations

The rapid adoption of AI-powered applications demands high-performance, scalable, and efficient cloud database solutions, as traditional architectures often struggle with AI-driven workloads requiring real-time data access, vector search,…

Databases · Computer Science 2025-05-06 Santosh Bhupathi

Cloud-Native Vector Search: A Comprehensive Performance Analysis

Vector search has been widely employed in recommender systems and retrieval-augmented generation pipelines, commonly performed with vector indexes to efficiently find similar items in large datasets. Recent growth in both data and task…

Databases · Computer Science 2025-12-08 Zhaoheng Li, Wei Ding, Silu Huang, Zikang Wang, Yuanjin Lin, Ke Wu, Yongjoo Park, Jianjun Chen

LEANN: A Low-Storage Vector Index

Embedding-based vector search underpins many important applications, such as recommendation and retrieval-augmented generation (RAG). It relies on vector indices to enable efficient search. However, these indices require storing…

Databases · Computer Science 2025-11-26 Yichuan Wang, Zhifei Li, Shu Liu, Yongji Wu, Ziming Mao, Yilong Zhao, Xiao Yan, Zhiying Xu, Yang Zhou, Ion Stoica, Sewon Min, Matei Zaharia, Joseph E. Gonzalez

Efficient Data Access Paths for Mixed Vector-Relational Search

The rapid growth of machine learning capabilities and the adoption of data processing methods using vector embeddings sparked a great interest in creating systems for vector data management. While the predominant approach of vector data…

Databases · Computer Science 2024-03-26 Viktor Sanca, Anastasia Ailamaki

QVCache: A Query-Aware Vector Cache

Vector databases have become a cornerstone of modern information retrieval, powering applications in recommendation, search, and retrieval-augmented generation (RAG) pipelines. However, scaling approximate nearest neighbor (ANN) search to…

Databases · Computer Science 2026-02-03 Anıl Eren Göçer, Ioanna Tsakalidou, Hamish Nicholson, Kyoungmin Kim, Anastasia Ailamaki

VecFlow: A High-Performance Vector Data Management System for Filtered-Search on GPUs

Vector search and database systems have become a keystone component in many AI applications. While much prior research has investigated how to accelerate the performance of generic vector search, emerging AI applications require running…

Databases · Computer Science 2025-06-03 Jingyi Xi, Chenghao Mo, Benjamin Karsin, Artem Chirkin, Mingqin Li, Minjia Zhang

The Collection Virtual Machine: An Abstraction for Multi-Frontend Multi-Backend Data Analysis

Getting the best performance from the ever-increasing number of hardware platforms has been a recurring challenge for data processing systems. In recent years, the advent of data science with its increasingly numerous and complex types of…

Databases · Computer Science 2020-04-10 Ingo Müller, Renato Marroquín, Dimitrios Koutsoukos, Mike Wawrzoniak, Sabir Akhadov, Gustavo Alonso

Mano Technical Report

Graphical user interfaces (GUIs) are the primary medium for human-computer interaction, yet automating GUI interactions remains challenging due to the complexity of visual elements, dynamic environments, and the need for multi-step…

Multimedia · Computer Science 2025-11-03 Tianyu Fu, Anyang Su, Chenxu Zhao, Hanning Wang, Minghui Wu, Zhe Yu, Fei Hu, Mingjia Shi, Wei Dong, Jiayao Wang, Yuyang Chen, Ruiyang Yu, Siran Peng, Menglin Li, Nan Huang, Haitian Wei, Jiawei Yu, Yi Xin, Xilin Zhao, Kai Gu, Ping Jiang, Sifan Zhou, Shuo Wang

MUVINE: Multi-stage Virtual Network Embedding in Cloud Data Centers using Reinforcement Learning based Predictions

The recent advances in virtualization technology have enabled the sharing of computing and networking resources of cloud data centers among multiple users. Virtual Network Embedding (VNE) is highly important and is an integral part of the…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-11-05 Hiren Kumar Thakkar, Chinmaya Kumar Dehury, Prasan Kumar Sahoo

Enabling Cognitive Intelligence Queries in Relational Databases using Low-dimensional Word Embeddings

We apply distributed language embedding methods from Natural Language Processing to assign a vector to each database entity associated token (for example, a token may be a word occurring in a table row, or the name of a column). These…

Computation and Language · Computer Science 2016-03-24 Rajesh Bordawekar, Oded Shmueli

Semantic Certainty Assessment in Vector Retrieval Systems: A Novel Framework for Embedding Quality Evaluation

Vector retrieval systems exhibit significant performance variance across queries due to heterogeneous embedding quality. We propose a lightweight framework for predicting retrieval performance at the query level by combining quantization…

Information Retrieval · Computer Science 2025-07-09 Y. Du

NaviX: A Native Vector Index Design for Graph DBMSs With Robust Predicate-Agnostic Search Performance

There is an increasing demand for extending existing DBMSs with vector indices so that they become unified systems capable of supporting modern predictive applications, which require joint querying of vector embeddings together with the…

Information Retrieval · Computer Science 2025-07-01 Gaurav Sehgal, Semih Salihoglu