English
Related papers

Related papers: Sub-Setting Algorithm for Training Data Selection …

200 papers

Existing subset selection methods for efficient learning predominantly employ discrete combinatorial and model-specific approaches which lack generalizability. For an unseen architecture, one cannot use the subset chosen for a different…

Machine Learning · Computer Science 2024-09-20 Eeshaan Jain , Tushar Nandy , Gaurav Aggarwal , Ashish Tendulkar , Rishabh Iyer , Abir De

Decision-making in complex systems often relies on machine learning models, yet highly accurate models such as XGBoost and neural networks can obscure the reasoning behind their predictions. In operations research applications,…

Machine Learning · Computer Science 2025-02-28 Gaurav Arwade , Sigurdur Olafsson

A coreset is a subset of the training set, using which a machine learning algorithm obtains performances similar to what it would deliver if trained over the whole original data. Coreset discovery is an active and open line of research as…

Machine Learning · Computer Science 2020-02-21 Pietro Barbiero , Giovanni Squillero , Alberto Tonda

In order to speed-up classification models when facing a large number of categories, one usual approach consists in organizing the categories in a particular structure, this structure being then used as a way to speed-up the prediction…

Machine Learning · Computer Science 2015-11-26 Aurélia Léon , Ludovic Denoyer

Finding valuable training data points for deep neural networks has been a core research challenge with many applications. In recent years, various techniques for calculating the "value" of individual training datapoints have been proposed…

Machine Learning · Computer Science 2021-04-29 Soumi Das , Arshdeep Singh , Saptarshi Chatterjee , Suparna Bhattacharya , Sourangshu Bhattacharya

Top-down induction of decision trees has been observed to suffer from the inadequate functioning of the pruning phase. In particular, it is known that the size of the resulting tree grows linearly with the sample size, even though the…

Artificial Intelligence · Computer Science 2011-06-06 T. Elomaa , M. Kaariainen

Data classification techniques partition the data or feature space into smaller sub-spaces, each corresponding to a specific class. To classify into subspaces, physical features e.g., distance and distributions are utilized. This approach…

Machine Learning · Computer Science 2025-03-11 Josimar Chire , Khalid Mahmood , Zhao Liang

Decision trees are renowned for their ability to achieve high predictive performance while remaining interpretable, especially on tabular data. Traditionally, they are constructed through recursive algorithms, where they partition the data…

Machine Learning · Computer Science 2024-08-27 Yufan Zhuang , Liyuan Liu , Chandan Singh , Jingbo Shang , Jianfeng Gao

We present an algorithm, called the Offset Tree, for learning to make decisions in situations where the payoff of only one choice is observed, rather than all choices. The algorithm reduces this setting to binary classification, allowing…

Machine Learning · Computer Science 2016-04-05 Alina Beygelzimer , John Langford

The selection of algorithms is a crucial step in designing AI services for real-world time series classification use cases. Traditional methods such as neural architecture search, automated machine learning, combined algorithm selection,…

Machine Learning · Computer Science 2024-10-02 Lars Böcking , Leopold Müller , Niklas Kühl

Deep learning has revolutionized many industries by enabling models to automatically learn complex patterns from raw data, reducing dependence on manual feature engineering. However, deep learning algorithms are sensitive to input data, and…

Machine Learning · Computer Science 2025-07-21 Mert Sehri , Zehui Hua , Francisco de Assis Boldt , Patrick Dumond

Modern datasets span billions of samples, making training on all available data infeasible. Selecting a high quality subset helps in reducing training costs and enhancing model quality. Submodularity, a discrete analogue of convexity, is…

Machine Learning · Computer Science 2025-04-04 Maximilian Böther , Abraham Sebastian , Pranjal Awasthi , Ana Klimovic , Srikumar Ramalingam

As with any task, the process of building machine learning models can benefit from prior experience. Meta-learning for classifier selection leverages knowledge about the characteristics of different datasets and/or the past performance of…

Machine Learning · Computer Science 2025-08-26 Sebastian Maldonado , Carla Vairetti , Ignacio Figueroa

Data-driven algorithm design is a paradigm that uses statistical and machine learning techniques to select from a class of algorithms for a computational problem an algorithm that has the best expected performance with respect to some…

Machine Learning · Computer Science 2024-06-05 Hongyu Cheng , Sammy Khalife , Barbara Fiedorowicz , Amitabh Basu

Supervised machine learning based state-of-the-art computer vision techniques are in general data hungry and pose the challenges of not having adequate computing resources and of high costs involved in human labeling efforts. Training data…

Computer Vision and Pattern Recognition · Computer Science 2018-05-30 Vishal Kaushal , Anurag Sahoo , Khoshrav Doctor , Narasimha Raju , Suyash Shetty , Pankaj Singh , Rishabh Iyer , Ganesh Ramakrishnan

Decision trees and their ensembles are popular in machine learning as easy-to-understand models. Several techniques have been proposed in the literature for learning tree-based classifiers, with different techniques working well for data…

Machine Learning · Computer Science 2025-05-20 Maria-Florina Balcan , Dravyansh Sharma

Several real-world classification problems are example-dependent cost-sensitive in nature, where the costs due to misclassification vary between examples and not only within classes. However, standard classification methods do not take…

Machine Learning · Computer Science 2015-05-19 Alejandro Correa Bahnsen , Djamila Aouada , Bjorn Ottersten

Current machine learning has made great progress on computer vision and many other fields attributed to the large amount of high-quality training samples, while it does not work very well on genomic data analysis, since they are notoriously…

Machine Learning · Computer Science 2020-09-04 Ziyi Yang , Jun Shu , Yong Liang , Deyu Meng , Zongben Xu

In training neural networks, it is common practice to use partial gradients computed over batches, mostly very small subsets of the training set. This approach is motivated by the argument that such a partial gradient is close to the true…

Machine Learning · Computer Science 2024-11-25 Jan Spörer , Bernhard Bermeitinger , Tomas Hrycej , Niklas Limacher , Siegfried Handschuh

There has been recent interest in improving performance of simple models for multiple reasons such as interpretability, robust learning from small data, deployment in memory constrained settings as well as environmental considerations. In…

Machine Learning · Computer Science 2020-06-23 Amit Dhurandhar , Karthikeyan Shanmugam , Ronny Luss
‹ Prev 1 2 3 10 Next ›