Related papers: A sequential algorithm for fast fitting of Dirichl…

200 papers

The Dirichlet Process (DP) mixture model has become a popular choice for model-based clustering, largely because it allows the number of clusters to be inferred. The sequential updating and greedy search (SUGS) algorithm (Wang and Dunson,…

Methodology · Statistics 2018-10-15 Oliver M. Crook, Laurent Gatto, Paul D. W. Kirk

In binary-transaction data-mining, traditional frequent itemset mining often produces results which are not straightforward to interpret. To overcome this problem, probability models are often used to produce more compact and conclusive…

Machine Learning · Computer Science 2012-09-27 Ruefei He, Jonathan Shapiro

Dirichlet process (DP) mixture models provide a flexible Bayesian framework for density estimation. Unfortunately, their flexibility comes at a cost: inference in DP mixture models is computationally expensive, even when conjugate…

Machine Learning · Computer Science 2009-07-13 Hal Daumé

Mixtures of linear mixed models (MLMMs) are useful for clustering grouped data and can be estimated by likelihood maximization through the EM algorithm. The conventional approach to determining a suitable number of components is to compare…

Applications · Statistics 2014-05-26 Siew Li Tan, David J. Nott

We present a Dirichlet process mixture model over discrete incomplete rankings and study two Gibbs sampling inference techniques for estimating posterior clusterings. The first approach uses a slice sampling subcomponent for estimating…

Machine Learning · Computer Science 2012-03-19 Marina Meila, Harr Chen

Clustering mixed data presents numerous challenges inherent to the very heterogeneous nature of the variables. A clustering algorithm should be able, despite this heterogeneity, to extract discriminant pieces of information from the…

Machine Learning · Computer Science 2022-05-10 Robin Fuchs, Denys Pommeret, Cinzia Viroli

The problem of relevant and diverse subset selection has a wide range of applications, including recommender systems and retrieval-augmented generation (RAG). For example, in recommender systems, one is interested in selecting relevant…

Machine Learning · Computer Science 2026-03-10 Vu Nguyen, Andrey Kan

Dirichlet Process Mixture Models (DPMMs) are widely used to address clustering problems. Their main advantage lies in their ability to automatically estimate the number of clusters during the inference process through the Bayesian…

Machine Learning · Statistics 2023-12-19 Reda Khoufache, Mustapha Lebbah, Hanene Azzag, Etienne Goffinet, Djamel Bouchaffra

Maximum weight matching is one of the most fundamental combinatorial optimization problems with a wide range of applications in data mining and bioinformatics. Developing distributed weighted matching algorithms is challenging due to the…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-06-06 Sepehr Assadi, MohammadHossein Bateni, Vahab Mirrokni

The Dirichlet process (DP) is a fundamental mathematical tool for Bayesian nonparametric modeling, and is widely used in tasks such as density estimation, natural language processing, and time series modeling. Although MCMC inference…

Machine Learning · Statistics 2013-04-09 Dan Lovell, Jonathan Malmaud, Ryan P. Adams, Vikash K. Mansinghka

Mixtures of multivariate normal inverse Gaussian (MNIG) distributions can be used to cluster data that exhibit features such as skewness and heavy tails. However, for cluster analysis, using a traditional finite mixture model framework,…

Methodology · Statistics 2020-05-13 Yuan Fang, Dimitris Karlis, Sanjeena Subedi

Reliable collision avoidance is one of the main requirements for autonomous driving. Hence, it is important to correctly estimate the states of an unknown number of static and dynamic objects in real-time. Here, data association is a major…

Computer Vision and Pattern Recognition · Computer Science 2019-03-11 Benjamin Naujoks, Patrick Burger, Hans-Joachim Wuensche

Scalable posterior approximation algorithms allow Bayesian nonparametric models such as the Dirichlet process mixture to scale up to larger datasets at a fraction of the cost. Recent algorithms, notably the stochastic variational inference performs local…

Machine Learning · Computer Science 2025-02-25 Kart-Leong Lim, Xudong Jiang

Variational Bayesian (VB) methods produce posterior inference in a time frame considerably smaller than traditional Markov Chain Monte Carlo approaches. Although the VB posterior is an approximation, it has been shown to produce good…

Computation · Statistics 2019-08-02 Nathaniel Tomasetti, Catherine S. Forbes, Anastasios Panagiotelis

We develop a sequential low-complexity inference procedure for Dirichlet process mixtures of Gaussians for online clustering and parameter estimation when the number of clusters is unknown a priori. We present an easily computable, closed…

Machine Learning · Statistics 2015-09-15 Theodoros Tsiligkaridis, Keith W. Forsythe

The Dirichlet process mixture (DPM) is a ubiquitous, flexible Bayesian nonparametric statistical model. However, full probabilistic inference in this model is analytically intractable, so that computationally intensive techniques such as…

Machine Learning · Statistics 2014-11-05 Yordan P. Raykov, Alexis Boukouvalas, Max A. Little

In the realm of unsupervised learning, Bayesian nonparametric mixture models, exemplified by the Dirichlet Process Mixture Model (DPMM), provide a principled approach for adapting the complexity of the model to the data. Such models are…

Machine Learning · Computer Science 2022-04-20 Or Dinari, Raz Zamir, John W. Fisher, Oren Freifeld

The goal of data clustering is to partition data points into groups to minimize a given objective function. While most existing clustering algorithms treat each data point as a vector, in many applications each datum is not a vector but a…

Machine Learning · Statistics 2017-03-16 Dinh Phung, Ba-Ngu Bo

This paper presents a novel algorithm, based upon the dependent Dirichlet process mixture model (DDPMM), for clustering batch-sequential data containing an unknown number of evolving clusters. The algorithm is derived via a low-variance…

Machine Learning · Computer Science 2013-11-04 Trevor Campbell, Miao Liu, Brian Kulis, Jonathan P. How, Lawrence Carin

Modern datasets span billions of samples, making training on all available data infeasible. Selecting a high quality subset helps in reducing training costs and enhancing model quality. Submodularity, a discrete analogue of convexity, is…

Machine Learning · Computer Science 2025-04-04 Maximilian Böther, Abraham Sebastian, Pranjal Awasthi, Ana Klimovic, Srikumar Ramalingam