Related papers: Nearly Linear Row Sampling Algorithm for Quantile …

200 papers

In this work we provide a new technique to design fast approximation algorithms for graph problems where the points of the graph lie in a metric space. Specifically, we present a sampling approach for such metric graphs that, using a…

Data Structures and Algorithms · Computer Science 2018-07-26 Hossein Esfandiari, Michael Mitzenmacher

To accelerate kernel methods, we propose a near input sparsity time algorithm for sampling the high-dimensional feature space implicitly defined by a kernel transformation. Our main contribution is an importance sampling method for…

Data Structures and Algorithms · Computer Science 2020-07-15 David P. Woodruff, Amir Zandieh

We give a simple algorithm to efficiently sample the rows of a matrix while preserving the p-norms of its product with vectors. Given an $n$-by-$d$ matrix $\boldsymbol{\mathit{A}}$, we find with high probability and in input sparsity time…

Data Structures and Algorithms · Computer Science 2014-12-02 Michael B. Cohen, Richard Peng

The paper analyzes theoretically and empirically the performance of likelihood weighting (LW) on a subset of nodes in Bayesian networks. The proposed scheme requires fewer samples to converge due to reduction in sampling variance. The…

Artificial Intelligence · Computer Science 2012-07-02 Bozhena Bidyuk, Rina Dechter

We studied linear weighted sampling algorithms and their optimality for approximate recovery of functions with mixed smoothness on $\mathbb{R}^d$ from a set of $n$ of their sampled values. Functions to be recovered are in weighted Sobolev…

Numerical Analysis · Mathematics 2025-11-11 Dinh Dũng

A significant hurdle for analyzing large sample data is the lack of effective statistical computing and inference methods. An emerging powerful approach for analyzing large sample data is subsampling, by which one takes a random subsample…

Methodology · Statistics 2015-11-24 Rong Zhu, Ping Ma, Michael W. Mahoney, Bin Yu

In statistics and machine learning, logistic regression is a widely-used supervised learning technique primarily employed for binary classification tasks. When the number of observations greatly exceeds the number of predictor variables, we…

Machine Learning · Statistics 2024-04-02 Agniva Chowdhury, Pradeep Ramuhalli

Random sampling has become a critical tool in solving massive matrix problems. For linear regression, a small, manageable set of data rows can be randomly selected to approximate a tall, skinny data matrix, improving processing time…

Data Structures and Algorithms · Computer Science 2014-08-22 Michael B. Cohen, Yin Tat Lee, Cameron Musco, Christopher Musco, Richard Peng, Aaron Sidford

Subsampling is an efficient method to deal with massive data. In this paper, we investigate the optimal subsampling for linear quantile regression when the covariates are functions. The asymptotic distribution of the subsampling estimator…

Numerical Analysis · Mathematics 2022-05-06 Qian Yan, Hanyu Li, Chengmei Niu

We solve the analysis sparse coding problem considering a combination of convex and non-convex sparsity promoting penalties. The multi-penalty formulation results in an iterative algorithm involving proximal-averaging. We then unfold the…

We initiate the study of numerical linear algebra in the sliding window model, where only the most recent $W$ updates in a stream form the underlying data set. We first introduce a unified row-sampling based framework that gives randomized…

Data Structures and Algorithms · Computer Science 2023-04-12 Vladimir Braverman, Petros Drineas, Cameron Musco, Christopher Musco, Jalaj Upadhyay, David P. Woodruff, Samson Zhou

In this paper, we focus on distributed estimation and support recovery for high-dimensional linear quantile regression. Quantile regression is a popular alternative tool to the least squares regression for robustness against outliers and…

Machine Learning · Statistics 2024-06-04 Caixing Wang, Ziliang Shen

Linear Regression is a seminal technique in statistics and machine learning, where the objective is to build linear predictive models between a response (i.e., dependent) variable and one or more predictor (i.e., independent) variables. In…

Computational Geometry · Computer Science 2023-07-19 Suraj Shetiya, Shohedul Hasan, Abolfazl Asudeh, Gautam Das

The goal of this paper is to propose novel strategies for adaptive learning of signals defined over graphs, which are observed over a (randomly time-varying) subset of vertices. We recast two classical adaptive algorithms in the graph…

Machine Learning · Computer Science 2018-08-01 Paolo Di Lorenzo, Paolo Banelli, Elvin Isufi, Sergio Barbarossa, Geert Leus

The goal of model compression is to reduce the size of a large neural network while retaining a comparable performance. As a result, computation and memory costs in resource-limited applications may be significantly reduced by dropping…

Machine Learning · Statistics 2022-11-10 Wenjing Yang, Ganghua Wang, Jie Ding, Yuhong Yang

Least-squares approximation is one of the most important methods for recovering an unknown function from data. While in many applications the data is fixed, in many others there is substantial freedom to choose where to sample. In this…

Machine Learning · Statistics 2025-08-11 Ben Adcock

Quantile regression is a powerful statistical methodology that complements the classical linear regression by examining how covariates influence the location, scale, and shape of the entire response distribution and offering a global view…

Applications · Statistics 2013-09-11 Lu Xiaoming, Fan Zhaozhi

The seminal work of Cohen and Peng introduced Lewis weight sampling to the theoretical computer science community, yielding fast row sampling algorithms for approximating $d$-dimensional subspaces of $\ell_p$ up to $(1+\epsilon)$ error.…

Data Structures and Algorithms · Computer Science 2022-12-20 David P. Woodruff, Taisuke Yasuda

We give the first polynomial-time algorithm for performing linear or polynomial regression resilient to adversarial corruptions in both examples and labels. Given a sufficiently large (polynomial-size) training set drawn i.i.d. from…

Machine Learning · Computer Science 2020-06-05 Adam Klivans, Pravesh K. Kothari, Raghu Meka

For massive data, the family of subsampling algorithms is popular to downsize the data volume and reduce computational burden. Existing studies focus on approximating the ordinary least squares estimate in linear regression, where…

Computation · Statistics 2019-06-27 HaiYing Wang, Rong Zhu, Ping Ma