Related papers: Sparsifying generalized linear models
For any norms $N_1,\ldots,N_m$ on $\mathbb{R}^n$ and $N(x) := N_1(x)+\cdots+N_m(x)$, we show there is a sparsified norm $\tilde{N}(x) = w_1 N_1(x) + \cdots + w_m N_m(x)$ such that $|N(x) - \tilde{N}(x)| \leq \epsilon N(x)$ for all $x \in…
We introduce a notion of code sparsification that generalizes the notion of cut sparsification in graphs. For a (linear) code $\mathcal{C} \subseteq \mathbb{F}_q^n$ of dimension $k$, a $(1 \pm \epsilon)$-sparsification of size $s$ is given…
This paper considers sparsity in linear regression under the restriction that the regression weights sum to one. We propose an approach that combines $\ell_0$- and $\ell_1$-regularization. We compute its solution by adapting a recent…
Let $x_1,x_2,\ldots,x_m$ be elements of a convex cone $K$ such that their sum, $e$, is in the relative interior of $K$. An $\epsilon$-sparsification of the sum involves taking a subset of the $x_i$ and reweighting them by positive scalars,…
Two approximation algorithms are proposed for $\ell_1$-regularized sparse rank-1 approximation to higher-order tensors. The algorithms are based on multilinear relaxation and sparsification, are easy to implement, and scale well.…
Recently there has been much interest in "sparsifying" sums of rank one matrices: modifying the coefficients such that only a few are nonzero, while approximately preserving the matrix that results from the sum. Results of this sort have…
We give almost-linear-time algorithms for constructing sparsifiers with $n\,\mathrm{poly}(\log n)$ edges that approximately preserve weighted $(\ell_2^2 + \ell_p^p)$ flow or voltage objectives on graphs. For flow objectives, this is the first…
A simple sparse coding mechanism appears in the sensory systems of several organisms: to a coarse approximation, an input $x \in \mathbb{R}^d$ is mapped to much higher dimension $m \gg d$ by a random linear transformation, and is then sparsified by…
Many regression and classification procedures fit a parameterized function $f(x;w)$ of predictor variables $x$ to data $\{x_{i},y_{i}\}_1^N$ based on some loss criterion $L(y,f)$. Often, regularization is applied to improve accuracy by…
In this paper, we revisit spectral sparsification for sums of arbitrary positive semidefinite (PSD) matrices. Concretely, for any collection of PSD matrices $\mathcal{A} = \{A_1, A_2, \ldots, A_r\} \subset \mathbb{R}^{n \times n}$, given…
A $(1 \pm \epsilon)$-sparsifier of a hypergraph $G(V,E)$ is a (weighted) subgraph that preserves the value of every cut to within a $(1 \pm \epsilon)$-factor. It is known that every hypergraph with $n$ vertices admits a $(1 \pm…
We introduce a new notion of sparsification, called \emph{strong sparsification}, in which constraints are not removed but variables can be merged. As our main result, we present a strong sparsification algorithm for 1-in-3-SAT. The…
Discrepancy theory provides powerful tools for producing higher-quality objects which "beat the union bound" in fundamental settings throughout combinatorics and computer science. However, this quality has often come at the price of more…
Given a CNF formula $F$ on $n$ variables, the problem of model counting or #SAT is to compute the number of satisfying assignments of $F$. Model counting is a fundamental but hard problem in computer science with varied applications. Recent…
To improve federated training of neural networks, we develop FedSparsify, a sparsification strategy based on progressive weight magnitude pruning. Our method has several benefits. First, since the size of the network becomes increasingly…
In the context of sparse recovery, it is known that most existing regularizers such as $\ell_1$ suffer from some bias incurred by some leading entries (in magnitude) of the associated vector. To neutralize this bias, we propose a class…
A function $f: \mathbb{R}^d \rightarrow \mathbb{R}$ is a Sparse Additive Model (SPAM) if it is of the form $f(\mathbf{x}) = \sum_{l \in \mathcal{S}}\phi_{l}(x_l)$ where $\mathcal{S} \subset [d]$, $|\mathcal{S}| \ll d$. Assuming $\phi$'s,…
We obtain bounds on estimation error rates for regularization procedures of the form $\hat f \in \operatorname{argmin}_{f\in F}\left(\frac{1}{N}\sum_{i=1}^N\left(Y_i-f(X_i)\right)^2+\lambda \Psi(f)\right)$ when $\Psi$…
A cut sparsifier is a reweighted subgraph that maintains the weights of the cuts of the original graph up to a multiplicative factor of $(1\pm\epsilon)$. This paper considers computing cut sparsifiers of weighted graphs of size $O(n\log…
This is the second of two papers to describe a matrix sparsification algorithm that takes a general real or complex matrix as input and produces a sparse output matrix of the same size. The first paper presented the original algorithm, its…