Related papers: SURF: A Simple, Universal, Robust, Fast Distributi…

TURF: A Two-factor, Universal, Robust, Fast Distribution Learning Algorithm

Approximating distributions from their samples is a canonical statistical-learning problem. One of its most powerful and successful modalities approximates every distribution to an $\ell_1$ distance essentially at most a constant times…

Machine Learning · Statistics 2022-06-22 Yi Hao, Ayush Jain, Alon Orlitsky, Vaishakh Ravindrakumar

Sample-Optimal Density Estimation in Nearly-Linear Time

We design a new, fast algorithm for agnostically learning univariate probability distributions whose densities are well approximated by piecewise polynomial functions. Let $f$ be the density function of an arbitrary univariate distribution,…

Data Structures and Algorithms · Computer Science 2015-06-03 Jayadev Acharya, Ilias Diakonikolas, Jerry Li, Ludwig Schmidt

Efficient Density Estimation via Piecewise Polynomial Approximation

We give a highly efficient "semi-agnostic" algorithm for learning univariate probability distributions that are well approximated by piecewise polynomial density functions. Let $p$ be an arbitrary distribution over an interval $I$ which is…

Machine Learning · Computer Science 2013-05-15 Siu-On Chan, Ilias Diakonikolas, Rocco A. Servedio, Xiaorui Sun

Almost-Uniform Sampling of Points on High-Dimensional Algebraic Varieties

We consider the problem of uniform sampling of points on an algebraic variety. Specifically, we develop a randomized algorithm that, given a small set of multivariate polynomials over a sufficiently large finite field, produces a common…

Data Structures and Algorithms · Computer Science 2009-02-10 Mahdi Cheraghchi, Amin Shokrollahi

Polynomial Filtering for Fast Convergence in Distributed Consensus

In the past few years, the problem of distributed consensus has received a lot of attention, particularly in the framework of ad hoc sensor networks. Most methods proposed in the literature address the consensus averaging problem by…

Information Theory · Computer Science 2009-11-13 Effrosyni Kokiopoulou, Pascal Frossard

Stochastic Unrolled Federated Learning

Algorithm unrolling has emerged as a learning-based optimization paradigm that unfolds truncated iterative algorithms in trainable neural-network optimizers. We introduce Stochastic UnRolled Federated learning (SURF), a method that expands…

Machine Learning · Computer Science 2024-02-08 Samar Hadou, Navid NaderiAlizadeh, Alejandro Ribeiro

SURF: A Generalization Benchmark for GNNs Predicting Fluid Dynamics

Simulating fluid dynamics is crucial for the design and development process, ranging from simple valves to complex turbomachinery. Accurately solving the underlying physical equations is computationally expensive. Therefore, learning-based…

Machine Learning · Computer Science 2023-11-21 Stefan Künzli, Florian Grötschla, Joël Mathys, Roger Wattenhofer

Distributed Statistical Estimation and Rates of Convergence in Normal Approximation

This paper presents a class of new algorithms for distributed statistical estimation that exploit divide-and-conquer approach. We show that one of the key benefits of the divide-and-conquer strategy is robustness, an important…

Statistics Theory · Mathematics 2018-08-29 Stanislav Minsker, Nate Strawn

Iterative Chow Filtering for Learning with Distribution Shift

Recent work due to Goel et al. gave the first efficient algorithms for learning with distribution shift in the challenging PQ framework. In this setting, a learner receives labeled training examples, unlabeled test examples, and must make…

Data Structures and Algorithms · Computer Science 2026-05-19 Gautam Chandrasekaran, Georgios Gkrinias, Adam R. Klivans, Konstantinos Stavropoulos, Arsen Vasilyan

Proximal SCOPE for Distributed Sparse Learning: Better Data Partition Implies Faster Convergence Rate

Distributed sparse learning with a cluster of multiple machines has attracted much attention in machine learning, especially for large-scale applications with high-dimensional data. One popular way to implement sparse learning is to use…

Machine Learning · Statistics 2018-10-29 Shen-Yi Zhao, Gong-Duo Zhang, Ming-Wei Li, Wu-Jun Li

Quantization of Probability Distributions via Divide-and-Conquer: Convergence and Error Propagation under Distributional Arithmetic Operations

This article studies a general divide-and-conquer algorithm for approximating continuous one-dimensional probability distributions with finite mean. The article presents a numerical study that compares pre-existing approximation schemes…

Probability · Mathematics 2026-03-09 Bilgesu Arif Bilgin, Olof Hallqvist Elias, Michael Selby, Phillip Stanley-Marbell

On the Local Minima of the Empirical Risk

Population risk is always of primary interest in machine learning; however, learning algorithms only have access to the empirical risk. Even for applications with nonconvex nonsmooth losses (such as modern deep networks), the population…

Machine Learning · Computer Science 2018-10-19 Chi Jin, Lydia T. Liu, Rong Ge, Michael I. Jordan

Computation of Induced Orthogonal Polynomial Distributions

We provide a robust and general algorithm for computing distribution functions associated to induced orthogonal polynomial measures. We leverage several tools for orthogonal polynomials to provide a spectrally-accurate method for a broad…

Numerical Analysis · Mathematics 2017-04-28 Akil Narayan

On Local Distributed Sampling and Counting

In classic distributed graph problems, each instance on a graph specifies a space of feasible solutions (e.g. all proper ($\Delta+1$)-list-colorings of the graph), and the task of distributed algorithm is to construct a feasible solution…

Data Structures and Algorithms · Computer Science 2018-02-20 Weiming Feng, Yitong Yin

Partition-Merge: Distributed Inference and Modularity Optimization

This paper presents a novel meta algorithm, Partition-Merge (PM), which takes existing centralized algorithms for graph computation and makes them distributed and faster. In a nutshell, PM divides the graph into small subgraphs using our…

Data Structures and Algorithms · Computer Science 2013-09-25 Vincent Blondel, Kyomin Jung, Pushmeet Kohli, Devavrat Shah

Diffusion Posterior Sampling is Computationally Intractable

Diffusion models are a remarkably effective way of learning and sampling from a distribution $p(x)$. In posterior sampling, one is also given a measurement model $p(y \mid x)$ and a measurement $y$, and would like to sample from $p(x \mid…

Machine Learning · Computer Science 2025-11-11 Shivam Gupta, Ajil Jalal, Aditya Parulekar, Eric Price, Zhiyang Xun

Mixture Models, Robustness, and Sum of Squares Proofs

We use the Sum of Squares method to develop new efficient algorithms for learning well-separated mixtures of Gaussians and robust mean estimation, both in high dimensions, that substantially improve upon the statistical guarantees achieved…

Data Structures and Algorithms · Computer Science 2017-11-21 Samuel B. Hopkins, Jerry Li

The Power of Iterative Filtering for Supervised Learning with (Heavy) Contamination

Inspired by recent work on learning with distribution shift, we give a general outlier removal algorithm called iterative polynomial filtering and show a number of striking applications for supervised learning with contamination: (1) We…

Machine Learning · Computer Science 2026-01-13 Adam R. Klivans, Konstantinos Stavropoulos, Kevin Tian, Arsen Vasilyan

Fermion Sampling Made More Efficient

Fermion sampling is to generate probability distribution of a many-body Slater-determinant wavefunction, which is termed "determinantal point process" in statistical analysis. For its inherently-embedded Pauli exclusion principle, its…

Quantum Physics · Physics 2023-01-31 Haoran Sun, Jie Zou, Xiaopeng Li

Fourier Transform Approach to Machine Learning III: Fourier Classification

We propose a Fourier-based learning algorithm for highly nonlinear multiclass classification. The algorithm is based on a smoothing technique to calculate the probability distribution of all classes. To obtain the probability distribution,…

Machine Learning · Computer Science 2022-11-17 Soheil Mehrabkhani