Related papers: Computationally Efficient Estimators for Dimension…

200 papers

In recent years, large high-dimensional data sets have become commonplace in a wide range of applications in science and commerce. Techniques for dimension reduction are of primary concern in statistical analysis. Projection methods play an…

Computation · Statistics 2008-01-24 Peter Clifford, Ioana A. Cosma

Compressed Counting (CC) was recently proposed for very efficiently computing the (approximate) $\alpha$th frequency moments of data streams, where $0<\alpha\leq 2$. Several estimators were reported including the geometric mean estimator,…

Data Structures and Algorithms · Computer Science 2008-08-14 Ping Li

We develop efficient binary (i.e., 1-bit) and multi-bit coding schemes for estimating the scale parameter of $\alpha$-stable distributions. The work is motivated by recent work on one-scan 1-bit compressed sensing (sparse signal…

Methodology · Statistics 2016-02-02 Ping Li

This paper will focus on three different aspects in improving the current practice of stable random projections. Firstly, we propose {\em very sparse stable random projections} to significantly reduce the processing and storage cost, by…

Data Structures and Algorithms · Computer Science 2007-07-13 Ping Li

Analyzing high-dimensional data with manifold learning algorithms often requires searching for the nearest neighbors of all observations. This presents a computational bottleneck in statistical manifold learning when observations of…

Machine Learning · Computer Science 2022-03-11 Fan Cheng, Anastasios Panagiotelis, Rob J Hyndman

Dimension reduction is often an important step in the analysis of high-dimensional data. PCA is a popular technique to find the best low-dimensional approximation of high-dimensional data. However, classical PCA is very sensitive to…

Computation · Statistics 2019-01-14 Holger Cevallos-Valdiviezo, Stefan Van Aelst

Designing scalable estimation algorithms is a core challenge in modern statistics. Here we introduce a framework to address this challenge based on parallel approximants, which yields estimators with provable properties that operate on the…

Methodology · Statistics 2023-08-04 Aritra Chakravorty, William S. Cleveland, Patrick J. Wolfe

We provide a simple method and relevant theoretical analysis for efficiently estimating higher-order $l_p$ distances. While the analysis mainly focuses on $l_4$, our methodology extends naturally to $p = 6, 8, 10, \ldots$ (i.e., when $p$ is even).…

Machine Learning · Computer Science 2012-03-19 Ping Li, Michael W. Mahoney, Yiyuan She

It is preferred that feature selectors be \textit{stable} for better interpretability and robust prediction. Ensembling is known to be effective for improving the stability of feature selectors. Since ensembling is time-consuming, it is…

Machine Learning · Computer Science 2021-08-04 Rina Onda, Zhengyan Gao, Masaaki Kotera, Kenta Oono

We consider the question of learning the natural parameters of a $k$-parameter minimal exponential family from i.i.d. samples in a computationally and statistically efficient manner. We focus on the setting where the support as well as the…

Machine Learning · Computer Science 2021-11-01 Abhin Shah, Devavrat Shah, Gregory W. Wornell

We study the use of "sign $\alpha$-stable random projections" (where $0<\alpha\leq 2$) for building basic data processing tools in the context of large-scale machine learning applications (e.g., classification, regression, clustering, and…

Machine Learning · Statistics 2015-04-29 Ping Li

We propose a new estimator based on a linear programming method for smooth frontiers of sample points. The derivative of the frontier function is assumed to be Hölder continuous. The estimator is defined as a linear combination of kernel…

Statistics Theory · Mathematics 2014-09-23 Alexander Nazin, Stephane Girard

We introduce sparse random projection, an important dimension-reduction tool from machine learning, for the estimation of discrete-choice models with high-dimensional choice sets. Initially, high-dimensional data are compressed into a…

Machine Learning · Statistics 2016-04-21 Khai X. Chiong, Matthew Shum

This paper addresses the estimation of locally stationary long-range dependent processes, a methodology that allows the statistical analysis of time series data exhibiting both nonstationarity and strong dependency. A time-varying…

Statistics Theory · Mathematics 2010-11-12 Wilfredo Palma, Ricardo Olea

Large language model (LLM) training is often bottlenecked by memory constraints and stochastic gradient noise in extremely high-dimensional parameter spaces. Motivated by empirical evidence that many LLM gradient matrices are effectively…

Machine Learning · Computer Science 2026-03-24 Zehao Li, Tao Ren, Zishi Zhang, Xi Chen, Yijie Peng

For many data analysis tasks, we may have information only on the explanatory variables, and evaluating the response values is quite expensive. While it is impractical or too costly to obtain the responses of all units, a…

Computation · Statistics 2023-04-07 Wei Zheng, Ting Tian, Xueqin Wang

Algorithmic stability is a central concept in statistics and learning theory that measures how sensitive an algorithm's output is to small changes in the training data. Stability plays a crucial role in understanding generalization,…

Statistics Theory · Mathematics 2026-01-21 Abhinav Chakraborty, Yuetian Luo, Rina Foygel Barber

Distance queries are a basic tool in data analysis. They are used for detection and localization of change for the purpose of anomaly detection, monitoring, or planning. Distance queries are particularly useful when data sets such as…

Data Structures and Algorithms · Computer Science 2015-03-20 Edith Cohen

Subsampling is an efficient method to deal with massive data. In this paper, we investigate the optimal subsampling for linear quantile regression when the covariates are functions. The asymptotic distribution of the subsampling estimator…

Numerical Analysis · Mathematics 2022-05-06 Qian Yan, Hanyu Li, Chengmei Niu

Many applications using large datasets require efficient methods for minimizing a proximable convex function subject to satisfying a set of linear constraints within a specified tolerance. For this task, we present a proximal projection…

Optimization and Control · Mathematics 2024-12-10 Howard Heaton