Related papers: Qd-tree: Learning Data Layouts for Big Data Analyt…

200 papers

In this paper, we revisit the problem of indexing multi-dimensional data in memory for the efficient support of multi-dimensional range queries and nearest neighbor queries. This is a classic problem in main-memory databases, where there is…

Databases · Computer Science 2026-05-06 Achilleas Michalopoulos, Dimitrios Tsitsigkos, Nikos Mamoulis

Time-critical data aggregation in Internet of Things (IoT) networks demands efficient, collision-free scheduling to minimize latency for applications like smart cities and industrial automation. Traditional heuristic methods, with two-phase…

Networking and Internet Architecture · Computer Science 2025-11-25 Van-Vi Vo, Tien-Dung Nguyen, Duc-Tai Le, Hyunseung Choo

Index structures are one of the most important tools that DBAs leverage to improve the performance of analytics and transactional workloads. However, building several indexes over large datasets can often become prohibitive and consume…

Databases · Computer Science 2020-03-26 Alex Galakatos, Michael Markovitch, Carsten Binnig, Rodrigo Fonseca, Tim Kraska

Decision trees are one of the most useful and popular methods in the machine learning toolbox. In this paper, we consider the problem of learning optimal decision trees, a combinatorial optimization problem that is challenging to solve at…

Machine Learning · Computer Science 2022-07-01 Rahul Mazumder, Xiang Meng, Haoyue Wang

An optimal data partitioning in parallel & distributed implementation of clustering algorithms is a necessary computation as it ensures independent task completion, fair distribution, fewer affected points and better & faster…

Artificial Intelligence · Computer Science 2016-09-21 Saraswati Mishra, Avnish Chandra Suman

Modern data-driven applications require that databases support fast cross-model analytical queries. Achieving fast analytical queries in a database system is challenging since they are usually scan-intensive (i.e., they need to intensively…

Databases · Computer Science 2023-09-22 Jianfeng Huang, Dongjing Miao, Xin Liu

We consider the problem of laying out a tree with fixed parent/child structure in hierarchical memory. The goal is to minimize the expected number of block transfers performed during a search along a root-to-leaf path, subject to a given…

Data Structures and Algorithms · Computer Science 2007-05-23 Stephen Alstrup, Michael A. Bender, Erik D. Demaine, Martin Farach-Colton, Theis Rauhe, Mikkel Thorup

The Gradient Boosted Tree (GBT) algorithm is one of the most popular machine learning algorithms used in production, for tasks that include Click-Through Rate (CTR) prediction and learning-to-rank. To deal with the massive datasets…

Machine Learning · Computer Science 2019-05-30 Theodore Vasiloudis, Hyunsu Cho, Henrik Boström

This paper introduces new algorithms and data structures for quick counting for machine learning datasets. We focus on the counting task of constructing contingency tables, but our approach is also applicable to counting the number of…

Artificial Intelligence · Computer Science 2009-09-25 A. Moore, M. S. Lee

Quality Diversity (QD) has shown great success in discovering high-performing, diverse policies for robot skill learning. While current benchmarks have led to the development of powerful QD methods, we argue that new paradigms must be…

Robotics · Computer Science 2024-07-26 Sumeet Batra, Bryon Tjanaka, Stefanos Nikolaidis, Gaurav Sukhatme

Although deep learning (DL) has already become a state-of-the-art technology for various data processing tasks, data security and computational overload problems often arise due to their high data and computational power dependency. To…

Quantum Physics · Physics 2022-04-08 Yunseok Kwak, Won Joon Yun, Jae Pyoung Kim, Hyunhee Cho, Minseok Choi, Soyi Jung, Joongheon Kim

As an emerging field, MS-based proteomics still requires software tools for efficiently storing and accessing experimental data. In this work, we focus on the management of LC-MS data, which are typically made available in standard…

Computational Engineering, Finance, and Science · Computer Science 2010-04-27 Sara Nasso, Francesco Silvestri, Francesco Tisiot, Barbara Di Camillo, Andrea Pietracaprina, Gianna Maria Toffolo

We initiate a study of a query-driven approach to designing partition trees for range-searching problems. Our model assumes that a data structure is to be built for an unknown query distribution that we can access through a sampling oracle,…

Data Structures and Algorithms · Computer Science 2025-02-20 Dimitris Fotakis, Andreas Kalavas, Ioannis Psarros

Cloud data lakes provide a modern solution for managing large volumes of data. The fundamental principle behind these systems is the separation of compute and storage layers. In this architecture, inexpensive cloud storage is utilized for…

Databases · Computer Science 2025-10-20 Gregory Weintraub

Optimization tasks over relational data, such as clustering, often suffer from the prohibitive cost of join operations, which are necessary to access the full dataset. While geometric data structures like BBD trees yield fast approximation…

Databases · Computer Science 2026-03-13 Aryan Esmailpour, Stavros Sintos

Clustering is an important data mining technique that groups similar data records; recently, categorical transaction clustering has received more attention. In this research, we study the problem of categorical data clustering for…

Databases · Computer Science 2017-05-03 Mahmoud Mahdi, Samir Abdelrahman, Reem Bahgat, Ismail Ismail

Streaming algorithms are fundamental in the analysis of large and online datasets. A key component of many such analytic tasks is $q$-MAX, which finds the largest $q$ values in a number stream. Modern approaches attain a constant runtime by…

Data Structures and Algorithms · Computer Science 2024-07-11 Ran Ben-Basat, Gil Einziger, Wenchen Han, Bilal Tayh

With the popularity of mobile devices and the development of geo-positioning technology, location-based services (LBS) attract much attention and top-k spatial keyword queries become increasingly complex. It is common to see that clients…

Data Structures and Algorithms · Computer Science 2022-07-26 Xinshi Zang, Peiwen Hao, Xiaofeng Gao, Bin Yao, Guihai Chen

The last years have seen a steep rise in data generation worldwide, with the development and widespread adoption of several software projects targeting the Big Data paradigm. Many companies currently engage in Big Data analytics as part of…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-05-25 Michele Ciavotta, Eugenio Gianniti, Danilo Ardagna

Scanning and filtering over multi-dimensional tables are key operations in modern analytical database engines. To optimize the performance of these operations, databases often create clustered indexes over a single dimension or…

Databases · Computer Science 2020-06-25 Vikram Nathan, Jialin Ding, Mohammad Alizadeh, Tim Kraska