Related papers: Random Maxout Features
Many machine learning applications such as in vision, biology and social networking deal with data in high dimensions. Feature selection is typically employed to select a subset of features which improves generalization accuracy as well…
The generalization gap of a classifier is related to the complexity of the set of functions among which the classifier is chosen. We study a family of low-complexity classifiers consisting of thresholding a random one-dimensional feature.…
Supervised linear feature extraction can be achieved by fitting a reduced rank multivariate model. This paper studies rank penalized and rank constrained vector generalized linear models. From the perspective of thresholding rules, we build…
We provide a simulation algorithm to simulate from a (multivariate) characteristic function, which is only accessible in a black-box format. The method is based on a generative neural network, whose loss function exploits a specific…
We show how random feature maps can be used to forecast dynamical systems with excellent forecasting skill. We consider the tanh activation function and judiciously choose the internal weights in a data-driven manner such that the resulting…
Evaluation of generative models is mostly based on the comparison between the estimated distribution and the ground truth distribution in a certain feature space. To embed samples into informative features, previous works often use…
This paper studies a recent proposal to use randomized value functions to drive exploration in reinforcement learning. These randomized value functions are generated by injecting random noise into the training data, making the approach…
We consider the problem of designing models to leverage a recently introduced approximate model averaging technique called dropout. We define a simple new model called maxout (so named because its output is the max of a set of inputs, and…
The performance of visual quality prediction models is commonly assumed to be closely tied to their ability to capture perceptually relevant image aspects. Models are thus either based on sophisticated feature extractors carefully designed…
We consider distributed estimation of the inverse covariance matrix, also called the concentration or precision matrix, in Gaussian graphical models. Traditional centralized estimation often requires global inference of the covariance…
Generalized linear mixed models are useful in studying hierarchical data with possibly non-Gaussian responses. However, the intractability of likelihood functions poses challenges for estimation. We develop a new method suitable for this…
This work studies an experimental design problem where the values of a predictor variable, denoted by $x$, are to be determined with the goal of estimating a function $m(x)$, which is observed with noise. A linear model is fitted to…
A fundamental drawback of kernel-based statistical models is their limited scalability to large data sets, which requires resorting to approximations. In this work, we focus on the popular Gaussian kernel and on techniques to linearize…
Subset selection in multiple linear regression aims to choose a subset of candidate explanatory variables that trade off fitting error (explanatory power) and model complexity (number of variables selected). We build mathematical programming…
When processing high-dimensional datasets, a common pre-processing step is feature selection. Filter-based feature selection algorithms are not tailored to a specific classification method, but rather rank the relevance of each feature with…
We present a probabilistic variant of the recently introduced maxout unit. The success of deep neural networks utilizing maxout can partly be attributed to favorable performance under dropout, when compared to rectified linear units. It…
Fitting linear regression models can be computationally very expensive in large-scale data analysis tasks if the sample size and the number of variables are very large. Random projections are extensively used as a dimension reduction tool…
We describe an approach to learning rich representations for images, that enables simple and effective predictors in a range of vision tasks involving spatially structured maps. Our key idea is to map small image elements to feature…
Clustering functional data is a challenging task due to intrinsic infinite-dimensionality and the need for stable, data-adaptive partitioning. In this work, we propose a clustering framework based on Random Projections, which simultaneously…
In this paper, we propose a new max-margin based discriminative feature learning method. Specifically, we aim at learning a low-dimensional feature representation, so as to maximize the global margin of the data and make the samples from…