Related papers: A Framework for Private Matrix Analysis
Privacy preservation has become a critical concern in high-dimensional data analysis due to the growing prevalence of data-driven applications. Since its proposal, sliced inverse regression has emerged as a widely utilized statistical…
Principal component analysis (PCA) requires the computation of a low-rank approximation to a matrix containing the data being analyzed. In many applications of PCA, the best possible accuracy of any rank-deficient approximation is at most a…
In this paper, we describe a new algorithm to build a few sparse principal components from a given data matrix. Our approach does not explicitly create the covariance matrix of the data and can be viewed as an extension of the Kogbetliantz…
Numerical linear algebra plays an important role in computer science. In this paper, we initiate the study of performing linear algebraic tasks while preserving privacy when the data is streamed online. Our main focus is the space…
We introduce a novel algorithm that computes the $k$-sparse principal component of a positive semidefinite matrix $A$. Our algorithm is combinatorial and operates by examining a discrete set of special vectors lying in a low-dimensional…
Principal components analysis (PCA) is a standard tool for identifying good low-dimensional approximations to data in high dimension. Many data sets of interest contain private or sensitive information about individuals. Algorithms which…
We introduce a new method for sparse principal component analysis, based on the aggregation of eigenvector information from carefully-selected axis-aligned random projections of the sample covariance matrix. Unlike most alternative…
Sparse principal component analysis addresses the problem of finding a linear combination of the variables in a given data set with a sparse coefficients vector that maximizes the variability of the data. This model enhances the ability to…
Pseudospectral analysis is fundamental for quantifying the sensitivity and transient behavior of nonnormal matrices, yet its computational cost scales cubically with dimension, rendering it prohibitive for large-scale systems. While…
Private data analysis suffers a costly curse of dimensionality. However, the data often has an underlying low-dimensional structure. For example, when optimizing via gradient descent, the gradients often lie in or near a low-dimensional…
We develop a stochastic algorithm for independent component analysis that incorporates multi-trial supervision, which is available in many scientific contexts. The method blends a proximal gradient-type algorithm in the space of invertible…
We study low-rank matrix regression in settings where matrix-valued predictors and scalar responses are observed across multiple individuals. Rather than assuming a fully homogeneous coefficient matrices across individuals, we accommodate…
We provide computationally efficient, differentially private algorithms for the classical regression settings of Least Squares Fitting, Binary Regression and Linear Regression with unbounded covariates. Prior to our work, privacy…
Economics and social science research often require analyzing datasets of sensitive personal information at fine granularity, with models fit to small subsets of the data. Unfortunately, such fine-grained analysis can easily reveal…
Low-rank factorization is used in many areas of computer science where one performs spectral analysis on large sensitive data stored in the form of matrices. In this paper, we study differentially private low-rank factorization of a matrix…
We present new differentially private algorithms for learning a large-margin halfspace. In contrast to previous algorithms, which are based on either differentially private simulations of the statistical query model or on private convex…
Computing accurate low rank approximations of large matrices is a fundamental data mining task. In many applications however the matrix contains sensitive information about individuals. In such case we would like to release a low rank…
Low-rank approximation of a matrix by means of random sampling has been consistently efficient in its empirical studies by many scientists who applied it with various sparse and structured multipliers, but adequate formal support for this…
This paper is about a curious phenomenon. Suppose we have a data matrix, which is the superposition of a low-rank component and a sparse component. Can we recover each component individually? We prove that under some suitable assumptions,…
The sliding window model of computation captures scenarios in which data are continually arriving in the form of a stream, and only the most recent $w$ items are used for analysis. In this setting, an algorithm needs to accurately track…