Related papers: Sparse Adaptive Dirichlet-Multinomial-like Process…

200 papers

Dirichlet Process Mixtures (DPMs) are a popular class of statistical models to perform density estimation and clustering. However, when the data available have a distribution evolving over time, such models are inadequate. We introduce here…

Methodology · Statistics 2012-06-26 Francois Caron, Manuel Davy, Arnaud Doucet

We develop a sequential low-complexity inference procedure for Dirichlet process mixtures of Gaussians for online clustering and parameter estimation when the number of clusters is unknown a priori. We present an easily computable, closed…

Machine Learning · Statistics 2015-09-15 Theodoros Tsiligkaridis, Keith W. Forsythe

Inherent in virtually every iterative machine learning algorithm is the problem of hyper-parameter tuning, which includes three major design parameters: (a) the complexity of the model, e.g., the number of neurons in a neural network, (b)…

Machine Learning · Computer Science 2025-09-26 Christos Mavridis, John Baras

The Dirichlet process mixture (DPM) is a ubiquitous, flexible Bayesian nonparametric statistical model. However, full probabilistic inference in this model is analytically intractable, so that computationally intensive techniques such as…

Machine Learning · Statistics 2014-11-05 Yordan P. Raykov, Alexis Boukouvalas, Max A. Little

The Dirichlet distribution, also known as the multivariate beta, is the distribution most often used to analyse frequency or proportion data. Maximum likelihood is widely used to estimate the Dirichlet parameters. However, for small sample sizes, the…

Methodology · Statistics 2021-03-04 Vincenzo Gioia, Euloge Clovis Kenne Pagui

We consider the estimation of Dirichlet Process Mixture Models (DPMMs) in distributed environments, where data are distributed across multiple computing nodes. A key advantage of Bayesian nonparametric models such as DPMMs is that they…

Machine Learning · Statistics 2017-09-20 Ruohui Wang, Dahua Lin

Consider a Dirichlet process mixture model (DPM) with random precision parameter $\alpha$, inducing $K_n$ clusters over $n$ observations through its latent random partition. Our goal is to specify the prior distribution…

Methodology · Statistics 2025-06-03 Carlo Vicentini, Ian Hyla Jermyn

Big data is ubiquitous in practice, and it has also led to a heavy computational burden. To reduce the calculation cost and ensure the effectiveness of parameter estimators, an optimal subset sampling method is proposed to estimate the…

Methodology · Statistics 2023-11-16 Haohui Han, Liya Fu

Latent Dirichlet Allocation (LDA) is a topic model widely used in natural language processing and machine learning. Most approaches to training the model rely on iterative algorithms, which makes it difficult to run LDA on big corpora that…

Machine Learning · Statistics 2020-10-23 Alexander Terenin, Måns Magnusson, Leif Jonsson, David Draper

We study the problem of list-decodable sparse mean estimation. Specifically, for a parameter $\alpha \in (0, 1/2)$, we are given $m$ points in $\mathbb{R}^n$, $\lfloor \alpha m \rfloor$ of which are i.i.d. samples from a distribution $D$…

Data Structures and Algorithms · Computer Science 2024-07-08 Ilias Diakonikolas, Daniel M. Kane, Sushrut Karmalkar, Ankit Pensia, Thanasis Pittas

We consider the problem of estimating the number of distinct elements in a large data set (or, equivalently, the support size of the distribution induced by the data set) from a random sample of its elements. The problem occurs in many…

Machine Learning · Computer Science 2021-06-17 Talya Eden, Piotr Indyk, Shyam Narayanan, Ronitt Rubinfeld, Sandeep Silwal, Tal Wagner

Sparsity and low-rank models have been popular for reconstructing images and videos from limited or corrupted measurements. Dictionary or transform learning methods are useful in applications such as denoising, inpainting, and medical image…

Machine Learning · Statistics 2019-07-23 Brian E. Moore, Saiprasad Ravishankar, Raj Rao Nadakuditi, Jeffrey A. Fessler

Exchangeable random partition processes are the basis for Bayesian approaches to statistical inference in large alphabet settings. On the other hand, the notion of the pattern of a sequence provides an information-theoretic framework for…

Information Theory · Computer Science 2014-10-22 Narayana P. Santhanam, Anand D. Sarwate, Jae Oh Woo

For many data analysis tasks, we may only have information on the explanatory variable, and evaluating the response values is quite expensive. While it is impractical or too costly to obtain the responses of all units, a…

Computation · Statistics 2023-04-07 Wei Zheng, Ting Tian, Xueqin Wang

Online reinforcement learning and other adaptive sampling algorithms are increasingly used in digital intervention experiments to optimize treatment delivery for users over time. In this work, we focus on longitudinal user data collected by…

Machine Learning · Computer Science 2023-04-20 Kelly W. Zhang, Lucas Janson, Susan A. Murphy

Given a collection of categorical data, we want to find the parameters of a Dirichlet distribution that maximize the likelihood of that data. Newton's method is typically used for this purpose but current implementations require reading…

Machine Learning · Statistics 2023-05-30 Max Sklar

Large scale deep learning provides a tremendous opportunity to improve the quality of content recommendation systems by employing both wider and deeper models, but this comes at great infrastructural cost and carbon footprint in modern data…

Machine Learning · Computer Science 2020-10-22 Mao Ye, Dhruv Choudhary, Jiecao Yu, Ellie Wen, Zeliang Chen, Jiyan Yang, Jongsoo Park, Qiang Liu, Arun Kejariwal

In this paper we propose a computationally efficient algorithm for on-line variable selection in multivariate regression problems involving high dimensional data streams. The algorithm recursively extracts all the latent factors of a…

Machine Learning · Statistics 2009-02-10 Brian McWilliams, Giovanni Montana

A distributed adaptive algorithm is proposed to solve a node-specific parameter estimation problem where nodes are interested in estimating parameters of local interest, parameters of common interest to a subset of nodes and parameters of…

Computers and Society · Computer Science 2023-07-19 Jorge Plata-Chaves, Nikola Bogdanovic, Kostas Berberidis

The Pseudo-Marginal (PM) algorithm is a popular Markov chain Monte Carlo (MCMC) method used to sample from a target distribution when its density is inaccessible, but can be estimated with a non-negative unbiased estimator. Its performance…

Computation · Statistics 2025-09-30 Sarra Abaoubida, Mylène Bédard, Florian Maire