Related papers: QuickSel: Quick Selectivity Learning with Mixture …

200 papers

A desirable data selection algorithm can efficiently choose the most informative samples to maximize the utility of limited annotation budgets. However, current approaches, represented by active learning methods, typically follow a…

Computer Vision and Pattern Recognition · Computer Science 2023-10-17 Yichen Xie, Mingyu Ding, Masayoshi Tomizuka, Wei Zhan

Major complications arise from the recent increase in the amount of high-dimensional data, including high computational costs and memory requirements. Feature selection, which identifies the most relevant and informative attributes of a…

Modern data-driven applications require that databases support fast cross-model analytical queries. Achieving fast analytical queries in a database system is challenging since they are usually scan-intensive (i.e., they need to intensively…

Databases · Computer Science 2023-09-22 Jianfeng Huang, Dongjing Miao, Xin Liu

Inference accounts for the majority of latency and energy consumption in large language model (LLM) deployments, often exceeding 90% of total cost. While training-time efficiency has seen extensive progress, runtime optimization remains a…

The Quickselect algorithm (also called FIND) is a fundamental algorithm for selecting ranks or quantiles within a set of data. Grübel and Rösler showed that the number of key comparisons required by Quickselect considered as a process…

Probability · Mathematics 2024-12-31 Jasper Ischebeck, Ralph Neininger

We present a simple and quick method to approximate network centrality indexes. Our approach, called QuickCent, is inspired by so-called fast and frugal heuristics, which are heuristics initially proposed to model some human decision and…

Social and Information Networks · Computer Science 2024-06-11 Francisco Plana, Andrés Abeliuk, Jorge Pérez

Finetuning large language models on instruction data is crucial for enhancing pre-trained knowledge and improving instruction-following capabilities. As instruction datasets proliferate, selecting optimal data for effective training becomes…

Computation and Language · Computer Science 2024-09-18 Simon Yu, Liangyu Chen, Sara Ahmadian, Marzieh Fadaee

Data selection is essential for training deep learning models. An effective data sampler assigns proper sampling probability for training data and helps the model converge to a good local minimum with high performance. Previous studies in…

Machine Learning · Computer Science 2024-10-10 Jiawei Yao, Chuming Li, Canran Xiao

Rapid advancements over the years have helped machine learning models reach previously hard-to-achieve goals, sometimes even exceeding human capabilities. However, to attain the desired accuracy, the model sizes and in turn their…

Machine Learning · Computer Science 2023-10-31 Bodun Hu, Le Xu, Jeongyoon Moon, Neeraja J. Yadwadkar, Aditya Akella

In this paper, we consider the problem of estimating self-tuning histograms using query workloads. To this end, we propose a general learning theoretic formulation. Specifically, we use query feedback from a workload as training data to…

Databases · Computer Science 2011-12-05 Raajay Viswanathan, Prateek Jain, Srivatsan Laxman, Arvind Arasu

Image quality remains a key problem for both traditional and deep learning (DL)-based approaches to retinal image analysis, but identifying poor quality images can be time consuming and subjective. Thus, automated methods for retinal image…

Computer Vision and Pattern Recognition · Computer Science 2023-07-26 Justin Engelmann, Amos Storkey, Miguel O. Bernabeu

Existing subset selection methods for efficient learning predominantly employ discrete combinatorial and model-specific approaches which lack generalizability. For an unseen architecture, one cannot use the subset chosen for a different…

Machine Learning · Computer Science 2024-09-20 Eeshaan Jain, Tushar Nandy, Gaurav Aggarwal, Ashish Tendulkar, Rishabh Iyer, Abir De

Although deep learning has achieved great success, it often relies on a large amount of training data with accurate labels, which are expensive and time-consuming to collect. A prominent direction to reduce the cost is to learn with noisy…

Machine Learning · Computer Science 2024-01-31 Chuanyang Hu, Shipeng Yan, Zhitong Gao, Xuming He

Selectivity estimation aims at estimating the number of database objects that satisfy a selection criterion. Answering this problem accurately and efficiently is essential to many applications, such as density estimation, outlier detection,…

Databases · Computer Science 2021-05-28 Yaoshu Wang, Chuan Xiao, Jianbin Qin, Rui Mao, Onizuka Makoto, Wei Wang, Rui Zhang, Yoshiharu Ishikawa

In this paper, we present SwiftLearn, a data-efficient approach to accelerate training of deep learning models using a subset of data samples selected during the warm-up stages of training. This subset is selected based on an importance…

As the ubiquity of deep learning in various machine learning applications has amplified, a proliferation of neural network models has been trained and shared on public model repositories. In the context of a targeted machine learning…

Machine Learning · Computer Science 2024-04-02 Jianwei Cui, Wenhang Shi, Honglin Tao, Wei Lu, Xiaoyong Du

Selectivity estimation is important in query optimization; however, accurate estimation is difficult when predicates are complex. Instead of relying on existing database synopses and statistics, which are not helpful in such cases, we introduce a new approach to…

Databases · Computer Science 2018-06-25 Jun Hyung Shin

High-utility sequential pattern mining is an emerging topic in the field of Knowledge Discovery in Databases. It consists of discovering subsequences having a high utility (importance) in sequences, referred to as high-utility sequential…

Data valuation and subset selection have emerged as valuable tools for application-specific selection of important training data. However, the efficiency-accuracy tradeoffs of state-of-the-art methods hinder their widespread application to…

Machine Learning · Computer Science 2022-03-15 Soumi Das, Manasvi Sagarkar, Suparna Bhattacharya, Sourangshu Bhattacharya

Deep learning has revolutionized many industries by enabling models to automatically learn complex patterns from raw data, reducing dependence on manual feature engineering. However, deep learning algorithms are sensitive to input data, and…

Machine Learning · Computer Science 2025-07-21 Mert Sehri, Zehui Hua, Francisco de Assis Boldt, Patrick Dumond