English
Related papers

Related papers: Classification with High-Dimensional Sparse Sample…

200 papers

We consider a high dimensional binary classification problem and construct a classification procedure by minimizing the empirical misclassification risk with a penalty on the number of selected features. We derive non-asymptotic probability…

Methodology · Statistics 2018-11-26 Le-Yu Chen , Sokbae Lee

Motivated by real-world machine learning applications, we analyze approximations to the non-asymptotic fundamental limits of statistical classification. In the binary version of this problem, given two training sequences generated according…

Information Theory · Computer Science 2018-12-07 Lin Zhou , Vincent Y. F. Tan , Mehul Motani

We study high-dimensional asymptotic performance limits of binary supervised classification problems where the class conditional densities are Gaussian with unknown means and covariances and the number of signal dimensions scales faster…

Machine Learning · Statistics 2016-11-17 Mohammad Hossein Rohban , Prakash Ishwar , Birant Orten , William C. Karl , Venkatesh Saligrama

In multiple classification, one aims to determine whether a testing sequence is generated from the same distribution as one of the M training sequences or not. Unlike most of existing studies that focus on discrete-valued sequences with…

Machine Learning · Statistics 2024-10-30 Lina Zhu , Lin Zhou

We formulate the sparse classification problem of $n$ samples with $p$ features as a binary convex optimization problem and propose a cutting-plane algorithm to solve it exactly. For sparse logistic regression and sparse SVM, our algorithm…

Optimization and Control · Mathematics 2025-01-08 Dimitris Bertsimas , Jean Pauphilet , Bart Van Parys

Motivated by problems of anomaly detection, this paper implements the Neyman-Pearson paradigm to deal with asymmetric errors in binary classification with a convex loss. Given a finite collection of classifiers, we combine them and obtain a…

Machine Learning · Statistics 2011-03-01 Philippe Rigollet , Xin Tong

Let $(Y,X_1,...,X_m)$ be a random vector. It is desired to predict $Y$ based on $(X_1,...,X_m)$. Examples of prediction methods are regression, classification using logistic regression or separating hyperplanes, and so on. We consider the…

Statistics Theory · Mathematics 2007-06-13 Eitan Greenshtein

We consider high-dimensional binary classification by sparse logistic regression. We propose a model/feature selection procedure based on penalized maximum likelihood with a complexity penalty on the model size and derive the non-asymptotic…

Statistics Theory · Mathematics 2018-11-20 Felix Abramovich , Vadim Grinshtein

Classification is a fundamental problem in machine learning and data mining. During the past decades, numerous classification methods have been presented based on different principles. However, most existing classifiers cast the…

Machine Learning · Computer Science 2019-04-23 Zengyou He , Chaohua Sheng , Yan Liu , Quan Zou

We consider a learning problem of identifying a dictionary matrix D (M times N dimension) from a sample set of M dimensional vectors Y = N^{-1/2} DX, where X is a sparse matrix (N times P dimension) in which the density of non-zero entries…

Machine Learning · Computer Science 2014-02-06 Ayaka Sakata , Yoshiyuki Kabashima

The small sample universal hypothesis testing problem is investigated in this paper, in which the number of samples $n$ is smaller than the number of possible outcomes $m$. The goal of this work is to find an appropriate criterion to…

Statistics Theory · Mathematics 2014-12-30 Dayu Huang , Sean Meyn

The Neyman-Pearson (NP) binary classification paradigm constrains the more severe type of error (e.g., the type I error) under a preferred level while minimizing the other (e.g., the type II error). This paradigm is suitable for…

Methodology · Statistics 2022-06-07 Jingming Wang , Lucy Xia , Zhigang Bao , Xin Tong

In this paper we consider high-dimensional multiclass classification by sparse multinomial logistic regression. We propose first a feature selection procedure based on penalized maximum likelihood with a complexity penalty on the model size…

Statistics Theory · Mathematics 2020-11-20 Felix Abramovich , Vadim Grinshtein , Tomer Levy

In this paper we consider the uniformity testing problem for high-dimensional discrete distributions (multinomials) under sparse alternatives. More precisely, we derive sharp detection thresholds for testing, based on $n$ samples, whether a…

Statistics Theory · Mathematics 2022-02-17 Bhaswar B. Bhattacharya , Rajarshi Mukherjee

In this work we consider a problem of multi-label classification, where each instance is associated with some binary vector. Our focus is to find a classifier which minimizes false negative discoveries under constraints. Depending on the…

Statistics Theory · Mathematics 2019-03-29 Evgenii Chzhen

In the binary hypothesis testing problem, it is well known that sequentiality in taking samples eradicates the trade-off between two error exponents, yet implementing the optimal test requires the knowledge of the underlying distributions,…

Information Theory · Computer Science 2025-01-07 Ching-Fang Li , I-Hsiang Wang

Let n denote the number of variables and m the number of equations in a sparse polynomial system over the binary field. We study the inconsistency probability of randomly generated sparse polynomial systems over the binary field, where each…

Probability · Mathematics 2026-03-27 P. Horak , I. Semaev

A Poisson Binomial distribution over $n$ variables is the distribution of the sum of $n$ independent Bernoullis. We provide a sample near-optimal algorithm for testing whether a distribution $P$ supported on $\{0,...,n\}$ to which we have…

Data Structures and Algorithms · Computer Science 2014-10-15 Jayadev Acharya , Constantinos Daskalakis

Given $n$ observations from two balanced classes, consider the task of labeling an additional $m$ inputs that are known to all belong to \emph{one} of the two classes. Special cases of this problem are well-known: with complete knowledge of…

Machine Learning · Statistics 2023-11-27 Patrik Róbert Gerber , Tianze Jiang , Yury Polyanskiy , Rui Sun

Given a training sample of size $m$ from a $d$-dimensional population, we wish to allocate a new observation $Z\in \R^d$ to this population or to the noise. We suppose that the difference between the distribution of the population and that…

Statistics Theory · Mathematics 2009-03-30 Yuri I. Ingster , Christophe Pouet , Alexandre B. Tsybakov
‹ Prev 1 2 3 10 Next ›