Related papers: Optimizing Machine Learning Inference Queries with…

Exploiting Correlations for Expensive Predicate Evaluation

User-defined functions (UDFs) are increasingly used to augment query languages with extra, application-dependent functionality. Selection queries involving UDF predicates tend to be expensive, either in terms of monetary cost or latency. In…

Databases · Computer Science 2014-11-14 Manas Joglekar, Hector Garcia-Molina, Aditya Parameswaran, Christopher Re

Progressive Query Expansion for Retrieval Over Cost-constrained Data Sources

Query expansion has been employed for a long time to improve the accuracy of query retrievers. Earlier works relied on pseudo-relevance feedback (PRF) techniques, which augment a query with terms extracted from documents retrieved in a…

Information Retrieval · Computer Science 2024-06-12 Muhammad Shihab Rashid, Jannat Ara Meem, Yue Dong, Vagelis Hristidis

PROF: An LLM-based Reward Code Preference Optimization Framework for Offline Imitation Learning

Offline imitation learning (offline IL) enables training effective policies without requiring explicit reward annotations. Recent approaches attempt to estimate rewards for unlabeled datasets using a small set of expert demonstrations.…

Machine Learning · Computer Science 2025-11-19 Shengjie Sun, Jiafei Lyu, Runze Liu, Mengbei Yan, Bo Liu, Deheng Ye, Xiu Li

CORE: Automatic Molecule Optimization Using Copy & Refine Strategy

Molecule optimization is about generating molecule $Y$ with more desirable properties based on an input molecule $X$. The state-of-the-art approaches partition the molecules into a large set of substructures $S$ and grow the new molecule…

Machine Learning · Computer Science 2019-12-13 Tianfan Fu, Cao Xiao, Jimeng Sun

Controlling Output Rankings in Generative Engines for LLM-based Search

The way customers search for and choose products is changing with the rise of large language models (LLMs). LLM-based search, or generative engines, provides direct product recommendations to users, rather than traditional online search…

Computation and Language · Computer Science 2026-02-04 Haibo Jin, Ruoxi Chen, Peiyan Zhang, Yifeng Luo, Huimin Zeng, Man Luo, Haohan Wang

Improving Inference Performance of Machine Learning with the Divide-and-Conquer Principle

Many popular machine learning models scale poorly when deployed on CPUs. In this paper we explore the reasons why and propose a simple, yet effective approach based on the well-known Divide-and-Conquer Principle to tackle this problem of…

Machine Learning · Computer Science 2023-03-03 Alex Kogan

Towards Code-Oriented LM Embeddings for Surrogate-Assisted Neural Architecture Search

Developing effective surrogates (performance predictors) for Neural Architecture Search (NAS) typically requires expensive fine-tuning or the engineering of complex representations. We propose a low-cost embedding strategy that leverages…

Machine Learning · Computer Science 2026-05-18 Pranav Somu, Advay Balakrishnan, Stepan Kravtsov, Aaron McDaniel, Jason Zutty

QUEST: Query Optimization in Unstructured Document Analysis

Most recently, researchers have started building large language model (LLM)-powered data systems that allow users to analyze unstructured text documents as if working with a database, because LLMs are very effective in extracting attributes…

Databases · Computer Science 2025-07-14 Zhaoze Sun, Qiyan Deng, Chengliang Chai, Kaisen Jin, Xinyu Guo, Han Han, Ye Yuan, Guoren Wang, Lei Cao

GRACEFUL: A Learned Cost Estimator For UDFs

User-Defined-Functions (UDFs) are a pivotal feature in modern DBMS, enabling the extension of native DBMS functionality with custom logic. However, the integration of UDFs into query optimization processes poses significant challenges,…

Databases · Computer Science 2025-04-01 Johannes Wehrstein, Tiemo Bang, Roman Heinrich, Carsten Binnig

Effective Feature Learning with Unsupervised Learning for Improving the Predictive Models in Massive Open Online Courses

The effectiveness of learning in massive open online courses (MOOCs) can be significantly enhanced by introducing personalized intervention schemes which rely on building predictive models of student learning behaviors such as some…

Machine Learning · Computer Science 2018-12-20 Mucong Ding, Kai Yang, Dit-Yan Yeung, Ting-Chuen Pong

CORE: Concept-Oriented Reinforcement for Bridging the Definition-Application Gap in Mathematical Reasoning

Large language models (LLMs) often solve challenging math exercises yet fail to apply the concept right when the problem requires genuine understanding. Popular Reinforcement Learning with Verifiable Rewards (RLVR) pipelines reinforce final…

Artificial Intelligence · Computer Science 2026-05-08 Zijun Gao, Zhikun Xu, Xiao Ye, Ben Zhou

Econometric Inference with Machine-Learned Proxies: Partial Identification via Data Combination

Empirical researchers increasingly use upstream machine-learning (ML) methods to construct proxies for latent target variables from complex, unstructured data. A naive plug-in use of such proxies in downstream econometric models, however,…

Econometrics · Economics 2026-04-14 Lixiong Li

PrediPrune: Reducing Verification Overhead in Souper with Machine Learning Driven Pruning

Souper is a powerful enumerative superoptimizer that enhances the runtime performance of programs by optimizing LLVM intermediate representation (IR) code. However, its verification process, which relies on a computationally expensive SMT…

Emerging Technologies · Computer Science 2025-09-23 Ange-Thierry Ishimwe, Raghuveer Shivakumar, Heewoo Kim, Tamara Lehman, Joseph Izraelevitz

Hydro: Adaptive Query Processing of ML Queries

Query optimization in relational database management systems (DBMSs) is critical for fast query processing. The query optimizer relies on precise selectivity and cost estimates to effectively optimize queries prior to execution. While this…

Databases · Computer Science 2024-03-25 Gaurav Tarlok Kakkar, Jiashen Cao, Aubhro Sengupta, Joy Arulraj, Hyesoon Kim

LLM-Powered Preference Elicitation in Combinatorial Assignment

We study the potential of large language models (LLMs) as proxies for humans to simplify preference elicitation (PE) in combinatorial assignment. While traditional PE methods rely on iterative queries to capture preferences, LLMs offer a…

Artificial Intelligence · Computer Science 2025-02-17 Ermis Soumalias, Yanchen Jiang, Kehang Zhu, Michael Curry, Sven Seuken, David C. Parkes

Improved Off-policy Reinforcement Learning in Biological Sequence Design

Designing biological sequences with desired properties is challenging due to vast search spaces and limited evaluation budgets. Although reinforcement learning methods use proxy models for rapid reward evaluation, insufficient training data…

Machine Learning · Computer Science 2025-06-18 Hyeonah Kim, Minsu Kim, Taeyoung Yun, Sanghyeok Choi, Emmanuel Bengio, Alex Hernández-García, Jinkyoo Park

Sufficient Decision Proxies for Decision-Focused Learning

When solving optimization problems under uncertainty with contextual data, utilizing machine learning to predict the uncertain parameters' values is a popular and effective approach. Decision-focused learning (DFL) aims at learning a…

Machine Learning · Computer Science 2026-01-29 Noah Schutte, Grigorii Veviurko, Krzysztof Postek, Neil Yorke-Smith

Accelerated Preference Elicitation with LLM-Based Proxies

Bidders in combinatorial auctions face significant challenges when describing their preferences to an auctioneer. Classical work on preference elicitation focuses on query-based techniques inspired from proper learning--often via proxies…

Computer Science and Game Theory · Computer Science 2025-12-23 David Huang, Francisco Marmolejo-Cossío, Edwin Lock, David Parkes

Robust Calibrate Proxy Loss for Deep Metric Learning

Mainstream research in deep metric learning can be divided into two genres: proxy-based and pair-based methods. Proxy-based methods have attracted extensive attention due to their lower training complexity and fast network convergence.…

Information Retrieval · Computer Science 2023-04-19 Xinyue Li, Jian Wang, Wei Song, Yanling Du, Zhixiang Liu

CORE: Towards Scalable and Efficient Causal Discovery with Reinforcement Learning

Causal discovery is the challenging task of inferring causal structure from data. Motivated by Pearl's Causal Hierarchy (PCH), which tells us that passive observations alone are not enough to distinguish correlation from causation, there…

Machine Learning · Computer Science 2024-01-31 Andreas W. M. Sauter, Nicolò Botteghi, Erman Acar, Aske Plaat