English
Related papers

Related papers: Consistent and Flexible Selectivity Estimation for…

200 papers

Selectivity estimation - the problem of estimating the result size of queries - is a fundamental problem in databases. Accurate estimation of query selectivity involving multiple correlated attributes is especially challenging. Poor…

Databases · Computer Science 2019-06-19 Shohedul Hasan , Saravanan Thirumuruganathan , Jees Augustine , Nick Koudas , Gautam Das

Partition-wise models offer a flexible approach for modeling complex and multidimensional data that are capable of producing interpretable results. They are based on partitioning the observed data into regions, each of which is modeled with…

Methodology · Statistics 2017-06-07 Rex C. Y. Cheung , Alexander Aue , Thomas C. M. Lee

Machine learning models usually assume that a set of feature values used to obtain an output is fixed in advance. However, in many real-world problems, a cost is associated with measuring these features. To address the issue of reducing…

Machine Learning · Computer Science 2025-03-13 Katsumi Takahashi , Koh Takeuchi , Hisashi Kashima

Multiobjective feature selection seeks to determine the most discriminative feature subset by simultaneously optimizing two conflicting objectives: minimizing the number of selected features and the classification error rate. The goal is to…

Neural and Evolutionary Computing · Computer Science 2025-05-12 Zhenxing Zhang , Qianxiang An , Yilei Wang , Chenfeng Wu , Baoling Dong , Chunjie Zhou

Feature selection is a critical step in the analysis of high-dimensional data, where the number of features often vastly exceeds the number of samples. Effective feature selection not only improves model performance and interpretability but…

Machine Learning · Computer Science 2025-01-27 Raquel Espinosa , Gracia Sánchez , José Palma , Fernando Jiménez

Large-scale Hierarchical Classification (HC) involves datasets consisting of thousands of classes and millions of training instances with high-dimensional features posing several big data challenges. Feature selection that aims to select…

Machine Learning · Computer Science 2017-06-07 Azad Naik , Huzefa Rangwala

One of the fundamental problems in machine learning is the estimation of a probability distribution from data. Many techniques have been proposed to study the structure of data, most often building around the assumption that observations…

Machine Learning · Statistics 2013-02-22 Oren Rippel , Ryan Prescott Adams

The purpose of feature extraction on convolutional neural networks is to reuse deep representations learnt for a pre-trained model to solve a new, potentially unrelated problem. However, raw feature extraction from all layers is unfeasible…

Neural and Evolutionary Computing · Computer Science 2019-11-11 Victor Gimenez-Abalos , Armand Vilalta , Dario Garcia-Gasulla , Jesus Labarta , Eduard Ayguadé

The amount of information in the form of features and variables avail- able to machine learning algorithms is ever increasing. This can lead to classifiers that are prone to overfitting in high dimensions, high di- mensional models do not…

Machine Learning · Computer Science 2014-02-12 Aaron Karper

Many computer vision and medical imaging problems are faced with learning from large-scale datasets, with millions of observations and features. In this paper we propose a novel efficient learning scheme that tightens a sparsity constraint…

Machine Learning · Statistics 2017-02-07 Adrian Barbu , Yiyuan She , Liangjing Ding , Gary Gramajo

Variable selection for high-dimensional, highly correlated data has long been a challenging problem, often yielding unstable and unreliable models. We propose a resample-aggregate framework that exploits diffusion models' ability to…

Methodology · Statistics 2025-08-20 Minjie Wang , Xiaotong Shen , Wei Pan

Selecting relevant features is an important and necessary step for intelligent machines to maximize their chances of success. However, intelligent machines generally have no enough computing resources when faced with huge volume of data.…

Machine Learning · Computer Science 2025-07-04 Hexiang Bai , Deyu Li , Jiye Liang , Yanhui Zhai

Discovering valuable insights from data through meaningful associations is a crucial task. However, it becomes challenging when trying to identify representative patterns in quantitative databases, especially with large datasets, as…

Databases · Computer Science 2024-10-31 Lamine Diop , Marc Plantevit

In recent years, deep learning has been at the center of analytics due to its impressive empirical success in analyzing complex data objects. Despite this success, most of the existing tools behave like black-box machines, thus the…

Machine Learning · Statistics 2022-11-02 Arkaprabha Ganguli , David Todem , Tapabrata Maiti

The goal of Feature Selection - comprising filter, wrapper, and embedded approaches - is to find the optimal feature subset for designated downstream tasks. Nevertheless, current feature selection methods are limited by: 1) the selection…

Machine Learning · Computer Science 2023-09-18 Meng Xiao , Dongjie Wang , Min Wu , Pengfei Wang , Yuanchun Zhou , Yanjie Fu

We consider the high-dimensional discriminant analysis problem. For this problem, different methods have been proposed and justified by establishing exact convergence rates for the classification risk, as well as the l2 convergence results…

Machine Learning · Statistics 2013-06-28 Mladen Kolar , Han Liu

As data sets continue to grow in size and complexity, effective and efficient techniques are needed to target important features in the variable space. Many of the variable selection techniques that are commonly used alongside clustering…

Computation · Statistics 2013-03-22 Jeffrey L. Andrews , Paul D. McNicholas

A key obstacle in automated analytics and meta-learning is the inability to recognize when different datasets contain measurements of the same variable. Because provided attribute labels are often uninformative in practice, this task may be…

Machine Learning · Computer Science 2019-09-12 Jonas Mueller , Alex Smola

Dealing with distribution shifts is one of the central challenges for modern machine learning. One fundamental situation is the covariate shift, where the input distributions of data change from training to testing stages while the…

Machine Learning · Computer Science 2024-05-28 Yu-Jie Zhang , Zhen-Yu Zhang , Peng Zhao , Masashi Sugiyama

When selecting data for training large-scale models, standard practice is to filter for examples that match human notions of data quality. Such filtering yields qualitatively clean datapoints that intuitively should improve model behavior.…

Machine Learning · Computer Science 2024-01-24 Logan Engstrom , Axel Feldmann , Aleksander Madry
‹ Prev 1 2 3 10 Next ›