Related papers: QuickSel: Quick Selectivity Learning with Mixture …

Towards Free Data Selection with General-Purpose Models

A desirable data selection algorithm can efficiently choose the most informative samples to maximize the utility of limited annotation budgets. However, current approaches, represented by active learning methods, typically follow a…

Computer Vision and Pattern Recognition · Computer Science 2023-10-17 Yichen Xie, Mingyu Ding, Masayoshi Tomizuka, Wei Zhan

Quick and Robust Feature Selection: the Strength of Energy-efficient Sparse Training for Autoencoders

Major complications arise from the recent increase in the amount of high-dimensional data, including high computational costs and memory requirements. Feature selection, which identifies the most relevant and informative attributes of a…

Machine Learning · Computer Science 2021-09-14 Zahra Atashgahi, Ghada Sokar, Tim van der Lee, Elena Mocanu, Decebal Constantin Mocanu, Raymond Veldhuis, Mykola Pechenizkiy

QUEST: An Efficient Query Evaluation Scheme Towards Scan-Intensive Cross-Model Analysis

Modern data-driven applications require that databases support fast cross-model analytical queries. Achieving fast analytical queries in a database system is challenging since they are usually scan-intensive (i.e., they need to intensively…

Databases · Computer Science 2023-09-22 Jianfeng Huang, Dongjing Miao, Xin Liu

QuickSilver -- Speeding up LLM Inference through Dynamic Token Halting, KV Skipping, Contextual Token Fusion, and Adaptive Matryoshka Quantization

Inference accounts for the majority of latency and energy consumption in large language model (LLM) deployments, often exceeding 90% of total cost. While training-time efficiency has seen extensive progress, runtime optimization remains a…

Computation and Language · Computer Science 2025-06-30 Danush Khanna, Aditya Kumar Guru, Srivarshinee Sridhar, Zidan Ahmed, Rubhav Bahirwani, Meetu Malhotra, Vinija Jain, Aman Chadha, Amitava Das, Kripabandhu Ghosh

On fine fluctuations of the complexity of the QuickSelect algorithm

The Quickselect algorithm (also called FIND) is a fundamental algorithm for selecting ranks or quantiles within a set of data. Grübel and Rösler showed that the number of key comparisons required by Quickselect considered as a process…

Probability · Mathematics 2024-12-31 Jasper Ischebeck, Ralph Neininger

QuickCent: a fast and frugal heuristic for harmonic centrality estimation on scale-free networks

We present a simple and quick method to approximate network centrality indexes. Our approach, called QuickCent, is inspired by so-called fast and frugal heuristics, which are heuristics initially proposed to model some human decision and…

Social and Information Networks · Computer Science 2024-06-11 Francisco Plana, Andrés Abeliuk, Jorge Pérez

Diversify and Conquer: Diversity-Centric Data Selection with Iterative Refinement

Finetuning large language models on instruction data is crucial for enhancing pre-trained knowledge and improving instruction-following capabilities. As instruction datasets proliferate, selecting optimal data for effective training becomes…

Computation and Language · Computer Science 2024-09-18 Simon Yu, Liangyu Chen, Sara Ahmadian, Marzieh Fadaee

Swift Sampler: Efficient Learning of Sampler by 10 Parameters

Data selection is essential for training deep learning models. An effective data sampler assigns proper sampling probability for training data and helps the model converge to a good local minimum with high performance. Previous studies in…

Machine Learning · Computer Science 2024-10-10 Jiawei Yao, Chuming Li, Canran Xiao

MOSEL: Inference Serving Using Dynamic Modality Selection

Rapid advancements over the years have helped machine learning models reach previously hard-to-achieve goals, sometimes even exceeding human capabilities. However, to attain the desired accuracy, the model sizes and in turn their…

Machine Learning · Computer Science 2023-10-31 Bodun Hu, Le Xu, Jeongyoon Moon, Neeraja J. Yadwadkar, Aditya Akella

A Learning Framework for Self-Tuning Histograms

In this paper, we consider the problem of estimating self-tuning histograms using query workloads. To this end, we propose a general learning theoretic formulation. Specifically, we use query feedback from a workload as training data to…

Databases · Computer Science 2011-12-05 Raajay Viswanathan, Prateek Jain, Srivatsan Laxman, Arvind Arasu

QuickQual: Lightweight, convenient retinal image quality scoring with off-the-shelf pretrained models

Image quality remains a key problem for both traditional and deep learning (DL)-based approaches to retinal image analysis, but identifying poor quality images can be time consuming and subjective. Thus, automated methods for retinal image…

Computer Vision and Pattern Recognition · Computer Science 2023-07-26 Justin Engelmann, Amos Storkey, Miguel O. Bernabeu

Efficient Data Subset Selection to Generalize Training Across Models: Transductive and Inductive Networks

Existing subset selection methods for efficient learning predominantly employ discrete combinatorial and model-specific approaches which lack generalizability. For an unseen architecture, one cannot use the subset chosen for a different…

Machine Learning · Computer Science 2024-09-20 Eeshaan Jain, Tushar Nandy, Gaurav Aggarwal, Ashish Tendulkar, Rishabh Iyer, Abir De

MILD: Modeling the Instance Learning Dynamics for Learning with Noisy Labels

Although deep learning has achieved great success, it often relies on a large amount of training data with accurate labels, which are expensive and time-consuming to collect. A prominent direction to reduce the cost is to learn with noisy…

Machine Learning · Computer Science 2024-01-31 Chuanyang Hu, Shipeng Yan, Zhitong Gao, Xuming He

Consistent and Flexible Selectivity Estimation for High-Dimensional Data

Selectivity estimation aims at estimating the number of database objects that satisfy a selection criterion. Answering this problem accurately and efficiently is essential to many applications, such as density estimation, outlier detection,…

Databases · Computer Science 2021-05-28 Yaoshu Wang, Chuan Xiao, Jianbin Qin, Rui Mao, Onizuka Makoto, Wei Wang, Rui Zhang, Yoshiharu Ishikawa

SwiftLearn: A Data-Efficient Training Method of Deep Learning Models using Importance Sampling

In this paper, we present SwiftLearn, a data-efficient approach to accelerate training of deep learning models using a subset of data samples selected during the warm-up stages of training. This subset is selected based on an importance…

Machine Learning · Computer Science 2023-11-28 Habib Hajimolahoseini, Omar Mohamed Awad, Walid Ahmed, Austin Wen, Saina Asani, Mohammad Hassanpour, Farnoosh Javadi, Mehdi Ahmadi, Foozhan Ataiefard, Kangling Liu, Yang Liu

A Two-Phase Recall-and-Select Framework for Fast Model Selection

As the ubiquity of deep learning in various machine learning applications has amplified, a proliferation of neural network models has been trained and shared on public model repositories. In the context of a targeted machine learning…

Machine Learning · Computer Science 2024-04-02 Jianwei Cui, Wenhang Shi, Honglin Tao, Wei Lu, Xiaoyong Du

Novel Selectivity Estimation Strategy for Modern DBMS

Selectivity estimation is important in query optimization; however, accurate estimation is difficult when predicates are complex. Because existing database synopses and statistics are not helpful in such cases, we introduce a new approach to…

Databases · Computer Science 2018-06-25 Jun Hyung Shin

Fast Utility Mining on Complex Sequences

High-utility sequential pattern mining is an emerging topic in the field of Knowledge Discovery in Databases. It consists of discovering subsequences having a high utility (importance) in sequences, referred to as high-utility sequential…

Databases · Computer Science 2019-04-30 Wensheng Gan, Jerry Chun-Wei Lin, Jiexiong Zhang, Philippe Fournier-Viger, Han-Chieh Chao, Philip S. Yu

CheckSel: Efficient and Accurate Data-valuation Through Online Checkpoint Selection

Data valuation and subset selection have emerged as valuable tools for application-specific selection of important training data. However, the efficiency-accuracy tradeoffs of state-of-the-art methods hinder their widespread application to…

Machine Learning · Computer Science 2022-03-15 Soumi Das, Manasvi Sagarkar, Suparna Bhattacharya, Sourangshu Bhattacharya

Selective Embedding for Deep Learning

Deep learning has revolutionized many industries by enabling models to automatically learn complex patterns from raw data, reducing dependence on manual feature engineering. However, deep learning algorithms are sensitive to input data, and…

Machine Learning · Computer Science 2025-07-21 Mert Sehri, Zehui Hua, Francisco de Assis Boldt, Patrick Dumond