Related papers: Multi-Attribute Selectivity Estimation Using Deep …
Selectivity estimation aims at estimating the number of database objects that satisfy a selection criterion. Answering this problem accurately and efficiently is essential to many applications, such as density estimation, outlier detection,…
Relational query optimisers rely on cost models to choose between different query execution plans. Selectivity estimates are known to be a crucial input to the cost model. In practice, standard selectivity estimation procedures are prone to…
Due to the outstanding capability of capturing underlying data distributions, deep learning techniques have been recently utilized for a series of traditional database problems. In this paper, we investigate the possibilities of utilizing…
The amount of information in the form of features and variables avail- able to machine learning algorithms is ever increasing. This can lead to classifiers that are prone to overfitting in high dimensions, high di- mensional models do not…
Feature selection is an important problem in high-dimensional data analysis and classification. Conventional feature selection approaches focus on detecting the features based on a redundancy criterion using learning and feature searching…
Recent research in feature learning has been extended to sequence data, where each instance consists of a sequence of heterogeneous items with a variable length. However, in many real-world applications, the data exists in the form of…
This paper addresses the task of set prediction using deep learning. This is important because the output of many computer vision tasks, including image tagging and object detection, are naturally expressed as sets of entities rather than…
Accurate probabilistic predictions can be characterized by two properties -- calibration and sharpness. However, standard maximum likelihood training yields models that are poorly calibrated and thus inaccurate -- a 90% confidence interval…
Density ratio estimation serves as an important technique in the unsupervised machine learning toolbox. However, such ratios are difficult to estimate for complex, high-dimensional data, particularly when the densities of interest are…
One of the fundamental problems in machine learning is the estimation of a probability distribution from data. Many techniques have been proposed to study the structure of data, most often building around the assumption that observations…
Deep learning has revolutionized many industries by enabling models to automatically learn complex patterns from raw data, reducing dependence on manual feature engineering. However, deep learning algorithms are sensitive to input data, and…
Dynamic feature selection, where we sequentially query features to make accurate predictions with a minimal budget, is a promising paradigm to reduce feature acquisition costs and provide transparency into a model's predictions. The problem…
View materialization, index selection, and plan caching are well-known techniques for optimization of query processing in database systems. The essence of these tasks is to select and save a subset of the most useful candidates…
The growth and success of deep learning approaches can be attributed to two major factors: availability of hardware resources and availability of large number of training samples. For problems with large training databases, deep learning…
A general formulation of optimization problems in which various candidate solutions may use different feature-sets is presented, encompassing supervised classification, automated program learning and other cases. A novel characterization of…
Mixed-integer optimisation problems can be computationally challenging. Here, we introduce and analyse two efficient algorithms with a specific sequential design that are aimed at dealing with sampled problems within this class. At each…
Bin Packing problems have been widely studied because of their broad applications in different domains. Known as a set of NP-hard problems, they have different vari- ations and many heuristics have been proposed for obtaining approximate…
Deep neural network models have demonstrated their effectiveness in classifying multi-label data from various domains. Typically, they employ a training mode that combines mini-batches with optimizers, where each sample is randomly selected…
Supervised machine learning based state-of-the-art computer vision techniques are in general data hungry and pose the challenges of not having adequate computing resources and of high costs involved in human labeling efforts. Training data…
As data sets continue to grow in size and complexity, effective and efficient techniques are needed to target important features in the variable space. Many of the variable selection techniques that are commonly used alongside clustering…