Related papers: A score function for Bayesian cluster analysis

200 papers

Bayesian models offer great flexibility for clustering applications---Bayesian nonparametrics can be used for modeling infinite mixtures, and hierarchical Bayesian models can be utilized for sharing clusters across multiple data sets. For…

Machine Learning · Computer Science 2012-06-15 Brian Kulis, Michael I. Jordan

Identifying the number $K$ of clusters in a dataset is one of the most difficult problems in clustering analysis. A choice of $K$ that correctly characterizes the features of the data is essential for building meaningful clusters. In this…

Methodology · Statistics 2019-05-06 Adriano Zanin Zambom, Julian A. Collazos, Ronaldo Dias

Many clustering methods, including k-means, require the user to specify the number of clusters as an input parameter. A variety of methods have been devised to choose the number of clusters automatically, but they often rely on strong…

Methodology · Statistics 2017-02-10 Wei Fu, Patrick O. Perry

A comparison of three kinds of clustering, finding and calculating their cost and loss functions. The error rate of clustering methods, and how to calculate the error percentage, is always one of the important factors for evaluating the…

Machine Learning · Computer Science 2014-11-14 Kamran Kowsari

Classically, Bayesian clustering interprets each component of a mixture model as a cluster. The inferred clustering posterior is highly sensitive to any inaccuracies in the kernel within each component. As this kernel is made more flexible,…

Methodology · Statistics 2025-12-12 David Buch, Miheer Dewaskar, David B. Dunson

Estimating the number of clusters (K) is a critical and often difficult task in cluster analysis. Many methods have been proposed to estimate K, including some top performers that use a resampling approach. When performing cluster analysis in…

Methodology · Statistics 2019-09-05 Yujia Li, Xiangrui Zeng, Chien-Wei Lin, George Tseng

Cluster analysis requires many decisions: the clustering method and the implied reference model, the number of clusters and, often, several hyper-parameters and algorithms' tunings. In practice, one produces several partitions, and a final…

Machine Learning · Statistics 2023-08-14 Luca Coraggio, Pietro Coretto

Fast and high-quality document clustering is an important task in organizing information, search engine results obtained from user queries, enhancing web crawling and information retrieval. With the large amount of data available and with a…

Information Retrieval · Computer Science 2010-03-11 Alok Ranjan, Harish Verma, Eatesh Kandpal, Joydip Dhar

In cluster analysis interest lies in probabilistically capturing partitions of individuals, items or observations into groups, such that those belonging to the same group share similar attributes or relational profiles. Bayesian posterior…

Methodology · Statistics 2017-03-23 Riccardo Rastelli, Nial Friel

We show that model-based Bayesian clustering, the probabilistically most systematic approach to the partitioning of data, can be mapped into a statistical physics problem for a gas of particles, and as a result becomes amenable to a…

Disordered Systems and Neural Networks · Physics 2018-10-24 Alexander Mozeika, Anthony CC Coolen

Cluster analysis is a popular unsupervised learning tool used in many disciplines to identify heterogeneous sub-populations within a sample. However, validating cluster analysis results and determining the number of clusters in a data set…

Machine Learning · Statistics 2024-04-26 Ali Turfah, Xiaoquan Wen

Existing clustering algorithms such as K-means often need to preset parameters such as the number of categories K, and such parameters may lead to the failure to output objective and consistent clustering results. This paper introduces a…

Machine Learning · Computer Science 2022-09-15 Shaodong Deng, Long Sheng, Jiayi Nie, Fuyi Deng

Subspace clustering, the task of clustering high-dimensional data when the data points come from a union of subspaces, is one of the fundamental tasks in unsupervised machine learning. Most of the existing algorithms for this task require…

Machine Learning · Statistics 2020-10-28 Vishnu Menon, Gokularam M, Sheetal Kalyani

A clustering may be considered as fair on pre-specified sensitive attributes if the proportions of sensitive attribute groups in each cluster reflect that in the dataset. In this paper, we consider the task of fair clustering for scenarios…

Machine Learning · Computer Science 2020-01-27 Savitha Sam Abraham, Deepak P, Sowmya S Sundaram

The paper presents a novel approach for unsupervised techniques in the field of clustering. A new method is proposed to enhance existing literature models using the proper Bayesian bootstrap to improve results in terms of robustness and…

Machine Learning · Statistics 2024-09-16 Federico Maria Quetti, Silvia Figini, Elena Ballante

The input of most clustering algorithms is a symmetric matrix quantifying similarity within data pairs. Such a matrix is here turned into a quadratic set function measuring cluster score or similarity within data subsets larger than pairs.…

Discrete Mathematics · Computer Science 2015-09-30 Giovanni Rossi

Clustering is widely studied in statistics and machine learning, with applications in a variety of fields. As opposed to classical algorithms which return a single clustering solution, Bayesian nonparametric models provide a posterior over…

Methodology · Statistics 2019-02-11 Sara Wade, Zoubin Ghahramani

Clustering, a fundamental activity in unsupervised learning, is notoriously difficult when the feature space is high-dimensional. Fortunately, in many realistic scenarios, only a handful of features are relevant in distinguishing clusters.…

Machine Learning · Statistics 2020-10-23 Zhiyue Zhang, Kenneth Lange, Jason Xu

Survey data are often collected under multistage sampling designs where units are binned to clusters that are sampled in a first stage. The unit-indexed population variables of interest are typically dependent within cluster. We propose a…

Methodology · Statistics 2021-08-26 Luis G. Leon-Novelo, Terrance D. Savitsky

Variable clustering is important for explanatory analysis. However, only few dedicated methods for variable clustering with the Gaussian graphical model have been proposed. Even more severe, small insignificant partial correlations due to…

Applications · Statistics 2018-06-18 Daniel Andrade, Akiko Takeda, Kenji Fukumizu