Related papers: Manu: A Cloud Native Vector Database Management Sy…
There are now over 20 commercial vector database management systems (VDBMSs), all produced within the past five years. But embedding-based retrieval has been studied for over ten years, and similarity search a staggering half century and…
Vector database management systems have emerged as an important component in modern data management, driven by the growing importance for the need to computationally describe rich data such as texts, images and video in various domains such…
The rapid growth of Large Language Models (LLMs) and AI-driven applications has propelled Vector Database Management Systems (VDBMSs) into the spotlight as a critical infrastructure component. VDBMS specializes in storing, indexing, and…
As high-dimensional vector data increasingly surpasses the processing capabilities of traditional database management systems, Vector Databases (VDBs) have emerged and become tightly integrated with large language models, being widely…
Modern deep learning models capture the semantics of complex data by transforming them into high-dimensional embedding vectors. Emerging applications, such as retrieval-augmented generation, use approximate nearest neighbor (ANN) search in…
Vector database management systems (VDBMSs) play a crucial role in facilitating semantic similarity searches over high-dimensional embeddings from diverse data sources. While VDBMSs are widely used in applications such as recommendation,…
Vector search (VS) has become a fundamental component in multimodal data management, enabling core functionalities such as image, video, and code retrieval. As vector data scales rapidly, VS faces growing challenges in balancing search,…
Vector data is prevalent across business and scientific applications, and its popularity is growing with the proliferation of learned embeddings. Vector data collections often reach billions of vectors with thousands of dimensions, thus,…
The rapid adoption of AI-powered applications demands high-performance, scalable, and efficient cloud database solutions, as traditional architectures often struggle with AI-driven workloads requiring real-time data access, vector search,…
Vector search has been widely employed in recommender system and retrieval-augmented-generation pipelines, commonly performed with vector indexes to efficiently find similar items in large datasets. Recent growths in both data and task…
Embedding-based vector search underpins many important applications, such as recommendation and retrieval-augmented generation (RAG). It relies on vector indices to enable efficient search. However, these indices require storing…
The rapid growth of machine learning capabilities and the adoption of data processing methods using vector embeddings sparked a great interest in creating systems for vector data management. While the predominant approach of vector data…
Vector databases have become a cornerstone of modern information retrieval, powering applications in recommendation, search, and retrieval-augmented generation (RAG) pipelines. However, scaling approximate nearest neighbor (ANN) search to…
Vector search and database systems have become a keystone component in many AI applications. While many prior research has investigated how to accelerate the performance of generic vector search, emerging AI applications require running…
Getting the best performance from the ever-increasing number of hardware platforms has been a recurring challenge for data processing systems. In recent years, the advent of data science with its increasingly numerous and complex types of…
Graphical user interfaces (GUIs) are the primary medium for human-computer interaction, yet automating GUI interactions remains challenging due to the complexity of visual elements, dynamic environments, and the need for multi-step…
The recent advances in virtualization technology have enabled the sharing of computing and networking resources of cloud data centers among multiple users. Virtual Network Embedding (VNE) is highly important and is an integral part of the…
We apply distributed language embedding methods from Natural Language Processing to assign a vector to each database entity associated token (for example, a token may be a word occurring in a table row, or the name of a column). These…
Vector retrieval systems exhibit significant performance variance across queries due to heterogeneous embedding quality. We propose a lightweight framework for predicting retrieval performance at the query level by combining quantization…
There is an increasing demand for extending existing DBMSs with vector indices so that they become unified systems capable of supporting modern predictive applications, which require joint querying of vector embeddings together with the…