Related papers: BitE: Accelerating Learned Query Optimization in …

HIRE: A Hybrid Learned Index for Robust and Efficient Performance under Mixed Workloads

Indexes are critical for efficient data retrieval and updates in modern databases. Recent advances in machine learning have led to the development of learned indexes, which model the cumulative distribution function of data to predict…

Databases · Computer Science 2026-04-27 Xinyi Zhang, Liang Liang, Anastasia Ailamaki, Jianliang Xu

Learned Query Superoptimization

Traditional query optimizers are designed to be fast and stateless: each query is quickly optimized using approximate statistics, sent off to the execution engine, and promptly forgotten. Recent work on learned query optimization has shown…

Databases · Computer Science 2023-07-12 Ryan Marcus

Spectral Clustering via Ensemble Deep Autoencoder Learning (SC-EDAE)

Recently, a number of works have studied clustering strategies that combine classical clustering algorithms and deep learning methods. These approaches follow either a sequential way, where a deep representation is learned using a deep…

Machine Learning · Computer Science 2019-06-13 Severine Affeldt, Lazhar Labiod, Mohamed Nadif

Neo: A Learned Query Optimizer

Query optimization is one of the most challenging problems in database systems. Despite the progress made over the past decades, query optimizers remain extremely complex components that require a great deal of hand-tuning for specific…

Databases · Computer Science 2020-04-09 Ryan Marcus, Parimarjan Negi, Hongzi Mao, Chi Zhang, Mohammad Alizadeh, Tim Kraska, Olga Papaemmanouil, Nesime Tatbul

ADOPT: Adaptively Optimizing Attribute Orders for Worst-Case Optimal Join Algorithms via Reinforcement Learning

The performance of worst-case optimal join algorithms depends on the order in which the join attributes are processed. Selecting good orders before query execution is hard, due to the large space of possible orders and unreliable execution…

Databases · Computer Science 2023-08-01 Junxiong Wang, Immanuel Trummer, Ahmet Kara, Dan Olteanu

Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts

Mixture-of-Experts (MoE) models have shown remarkable capability in instruction tuning, especially when the number of tasks scales. However, previous methods simply merge all training tasks (e.g. creative writing, coding, and mathematics)…

Computation and Language · Computer Science 2024-06-18 Tong Zhu, Daize Dong, Xiaoye Qu, Jiacheng Ruan, Wenliang Chen, Yu Cheng

RELOAD: A Robust and Efficient Learned Query Optimizer for Database Systems

Recent advances in query optimization have shifted from traditional rule-based and cost-based techniques towards machine learning-driven approaches. Among these, reinforcement learning (RL) has attracted significant attention due to its…

Databases · Computer Science 2026-04-17 Seokwon Lee, Jaeyoung Sim, Sihyun Kim, Yuhsing Li, Yiwen Zhu, Kwanghyun Park

Towards a Hands-Free Query Optimizer through Deep Learning

Query optimization remains one of the most important and well-studied problems in database systems. However, traditional query optimizers are complex heuristically-driven systems, requiring large amounts of time to tune for a particular…

Databases · Computer Science 2018-12-19 Ryan Marcus, Olga Papaemmanouil

JOB-Complex: A Challenging Benchmark for Traditional & Learned Query Optimization

Query optimization is a fundamental task in database systems that is crucial to providing high performance. To evaluate learned and traditional optimizers' performance, several benchmarks, such as the widely used JOB benchmark, are used.…

Databases · Computer Science 2025-07-11 Johannes Wehrstein, Timo Eckmann, Roman Heinrich, Carsten Binnig

Learned Offline Query Planning via Bayesian Optimization

Analytics database workloads often contain queries that are executed repeatedly. Existing optimization techniques generally prioritize keeping optimization cost low, normally well below the time it takes to execute a single instance of a…

Databases · Computer Science 2025-02-11 Jeffrey Tao, Natalie Maus, Haydn Jones, Yimeng Zeng, Jacob R. Gardner, Ryan Marcus

Sample Efficient Reinforcement Learning via Model-Ensemble Exploration and Exploitation

Model-based deep reinforcement learning has achieved success in various domains that require high sample efficiencies, such as Go and robotics. However, there are some remaining issues, such as planning efficient explorations to learn more…

Machine Learning · Computer Science 2021-07-06 Yao Yao, Li Xiao, Zhicheng An, Wanpeng Zhang, Dijun Luo

LIME: Making LLM Data More Efficient with Linguistic Metadata Embeddings

Pre-training decoder-only language models relies on vast amounts of high-quality data, yet the availability of such data is increasingly reaching its limits. While metadata is commonly used to create and curate these datasets, its potential…

Computation and Language · Computer Science 2025-12-09 Sebastian Sztwiertnia, Felix Friedrich, Kristian Kersting, Patrick Schramowski, Björn Deiseroth

Buffer Pool Aware Query Scheduling via Deep Reinforcement Learning

In this extended abstract, we propose a new technique for query scheduling with the explicit goal of reducing disk reads and thus implicitly increasing query performance. We introduce SmartQueue, a learned scheduler that leverages…

Databases · Computer Science 2022-07-28 Chi Zhang, Ryan Marcus, Anat Kleiman, Olga Papaemmanouil

Context-Aware Ensemble Learning for Time Series

We investigate ensemble methods for prediction in an online setting. Unlike all the literature in ensembling, for the first time, we introduce a new approach using a meta learner that effectively combines the base model predictions via…

Machine Learning · Computer Science 2022-12-01 Arda Fazla, Mustafa Enes Aydin, Orhun Tamyigit, Suleyman Serdar Kozat

Beyond instruction-conditioning, MoTE: Mixture of Task Experts for Multi-task Embedding Models

Dense embeddings are fundamental to modern machine learning systems, powering Retrieval-Augmented Generation (RAG), information retrieval, and representation learning. While instruction-conditioning has become the dominant approach for…

Machine Learning · Computer Science 2025-06-24 Miguel Romero, Shuoyang Ding, Corey D. Barret, Georgiana Dinu, George Karypis

Assisted Learning for Organizations with Limited Imbalanced Data

In the era of big data, many big organizations are integrating machine learning into their work pipelines to facilitate data analysis. However, the performance of their trained models is often restricted by limited and imbalanced data…

Machine Learning · Computer Science 2024-03-05 Cheng Chen, Jiaying Zhou, Jie Ding, Yi Zhou

BIDER: Bridging Knowledge Inconsistency for Efficient Retrieval-Augmented LLMs via Key Supporting Evidence

Retrieval-augmented large language models (LLMs) have demonstrated efficacy in knowledge-intensive tasks such as open-domain QA, addressing inherent challenges in knowledge update and factual inadequacy. However, inconsistencies between…

Computation and Language · Computer Science 2024-05-31 Jiajie Jin, Yutao Zhu, Yujia Zhou, Zhicheng Dou

Reconciling meta-learning and continual learning with online mixtures of tasks

Learning-to-learn or meta-learning leverages data-driven inductive bias to increase the efficiency of learning on a novel task. This approach encounters difficulty when transfer is not advantageous, for instance, when tasks are considerably…

Machine Learning · Computer Science 2019-06-20 Ghassen Jerfel, Erin Grant, Thomas L. Griffiths, Katherine Heller

Quantum Ensemble for Classification

A powerful way to improve performance in machine learning is to construct an ensemble that combines the predictions of multiple models. Ensemble methods are often much more accurate and lower variance than the individual classifiers that…

Machine Learning · Computer Science 2024-12-03 Antonio Macaluso, Luca Clissa, Stefano Lodi, Claudio Sartori

Iterative Amortized Inference: Unifying In-Context Learning and Learned Optimizers

Modern learning systems increasingly rely on amortized learning - the idea of reusing computation or inductive biases shared across tasks to enable rapid generalization to novel problems. This principle spans a range of approaches,…

Machine Learning · Computer Science 2025-10-14 Sarthak Mittal, Divyat Mahajan, Guillaume Lajoie, Mohammad Pezeshki