Related papers: Learning Multi-dimensional Indexes
A learned multi-dimensional index is a data structure that efficiently answers multi-dimensional orthogonal queries by understanding the data distribution using machine learning models. One of the existing problems is that the search…
Filtering data based on predicates is one of the most fundamental operations for any modern data warehouse. Techniques to accelerate the execution of filter expressions include clustered indexes, specialized sort orders (e.g., Z-order),…
Spatial data is ubiquitous. Massive amounts of data are generated every day from billions of GPS-enabled devices such as cell phones, cars, sensors, and various consumer-based applications such as Uber, Tinder, location-tagged posts in…
Indexes can significantly improve search performance in relational databases. However, if the query workload changes frequently or new data updates occur continuously, it may not be worthwhile to build a conventional index upfront for query…
Multidimensional data are becoming more prevalent, partly due to the rise of the Internet of Things (IoT), and with that the need to ingest and analyze data streams at rates higher than before. Some industrial IoT applications require…
The objective of this paper is to design novel multi-layer neural network architectures for multiscale simulations of flows taking into account the observed data and physical modeling concepts. Our approaches use deep learning concepts…
A recent research trend involves treating database index structures as Machine Learning (ML) models. In this domain, single or multiple ML models are trained to learn the mapping from keys to positions inside a data set. This class of…
Clustering algorithms partition a dataset into groups of similar points. The clustering problem is very general, and different partitions of the same dataset could be considered correct and useful. To fully understand such data, it must be…
The exponential growth of data storage demands has necessitated the evolution of hierarchical storage management strategies [1]. This study explores the application of streaming machine learning [3] to revolutionize data prefetching within…
Learning from the multidimensional data has been an interesting concept in the field of machine learning. However, such learning can be difficult, complex, expensive because of expensive data processing, manipulations as the number of…
Indexing is an effective way to support efficient query processing in large databases. Recently the concept of learned index, which replaces or complements traditional index structures with machine learning models, has been actively…
Datalog-based languages are regaining popularity as a powerful abstraction for expressing recursive computations in domains such as program analysis and graph processing. However, existing systems often face a trade-off between efficiency…
In this paper, we revisit the problem of indexing multi-dimensional data in memory for the efficient support of multi-dimensional range queries and nearest neighbor queries. This is a classic problem in main-memory databases, where there is…
Indexing large-scale databases in main memory is still challenging today. Learned index structures -- in which the core components of classical indexes are replaced with machine learning models -- have recently been suggested to…
Along with climate change, more frequent extreme events, such as flooding and tropical cyclones, threaten the livelihoods and wellbeing of poor and vulnerable populations. One of the most immediate needs of people affected by a disaster is…
Efficient indexing is fundamental for multi-dimensional data management and analytics. An emerging tendency is to directly learn the storage layout of multi-dimensional data by simple machine learning models, yielding the concept of Learned…
This paper presents a novel high speed clustering scheme for high dimensional data streams. Data stream clustering has gained importance in different applications, for example, in network monitoring, intrusion detection, and real-time…
We study the problem of self-supervised 3D scene flow estimation from real large-scale raw point cloud sequences, which is crucial to various tasks like trajectory prediction or instance segmentation. In the absence of ground truth scene…
Recent advances in query optimization have shifted from traditional rule-based and cost-based techniques towards machine learning-driven approaches. Among these, reinforcement learning (RL) has attracted significant attention due to its…
Database research can help machine learning performance in many ways. One way is to design better data structures. This paper combines the use of incremental computation and sequential and probabilistic filtering to enable "forgetful"…