Related papers: Clustering using Unsupervised Binary Trees: CUBT
We herein introduce a new method of interpretable clustering that uses unsupervised binary trees. It is a three-stage procedure, the first stage of which entails a series of recursive binary splits to reduce the heterogeneity of the data…
In this paper, we address an issue of finding explainable clusters of class-uniform data in labelled datasets. The issue falls into the domain of interpretable supervised clustering. Unlike traditional clustering, supervised clustering aims…
Constrained clustering is a semi-supervised task that employs a limited amount of labelled data, formulated as constraints, to incorporate domain-specific knowledge and to significantly improve clustering accuracy. Previous work has…
State-of-the-art clustering algorithms use heuristics to partition the feature space and provide little insight into the rationale for cluster membership, limiting their interpretability. In healthcare applications, the latter poses a…
Clustering serves as a vital tool for uncovering latent data structures, and achieving both high accuracy and interpretability is essential. To this end, existing methods typically construct binary decision trees by solving mixed-integer…
Semi-supervised clustering methods incorporate a limited amount of supervision into the clustering process. Typically, this supervision is provided by the user in the form of pairwise constraints. Existing methods use such constraints in…
Clustering is an unsupervised machine learning methodology where unlabeled elements/objects are grouped together aiming to the construction of well-established clusters that their elements are classified according to their similarity. The…
Clustering is a popular unsupervised learning tool often used to discover groups within a larger population such as customer segments, or patient subtypes. However, despite its use as a tool for subgroup discovery and description - few…
Clustering ensemble has emerged as an important research topic in the field of machine learning. Although numerous methods have been proposed to improve clustering quality, most existing approaches overlook the need for interpretability in…
Categorical sequence clustering plays a crucial role in various fields, but the lack of interpretability in cluster assignments poses significant challenges. Sequences inherently lack explicit features, and existing sequence clustering…
Multi-view clustering has become a significant area of research, with numerous methods proposed over the past decades to enhance clustering accuracy. However, in many real-world applications, it is crucial to demonstrate a clear…
Interpretable clustering algorithms aim to group similar data points while explaining the obtained groups to support knowledge discovery and pattern recognition tasks. While most approaches to interpretable clustering construct clusters…
Unsupervised clustering, also known as natural clustering, stands for the classification of data according to their similarities. Here we study this problem from the perspective of complex networks. Mapping the description of data…
We propose a model-based clustering algorithm for a general class of functional data for which the components could be curves or images. The random functional data realizations could be measured with error at discrete, and possibly random,…
The undirected technique for evaluating belief networks [Jensen, et.al., 1990, Lauritzen and Spiegelhalter, 1988] requires clustering the nodes in the network into a junction tree. In the traditional view, the junction tree is constructed…
Clustering is a well-known unsupervised machine learning approach capable of automatically grouping discrete sets of instances with similar characteristics. Constrained clustering is a semi-supervised extension to this process that can be…
Clustering is a fundamental learning task widely used as a first step in data analysis. For example, biologists use cluster assignments to analyze genome sequences, medical records, or images. Since downstream analysis is typically…
Cluster analysis methods seek to partition a data set into homogeneous subgroups. It is useful in a wide variety of applications, including document processing and modern genetics. Conventional clustering methods are unsupervised, meaning…
Random forests are a machine learning method used to automatically classify datasets and consist of a multitude of decision trees. While these random forests often have higher performance and generalize better than a single decision tree,…
Clustering algorithms are one of the main analytical methods to detect patterns in unlabeled data. Existing clustering methods typically treat samples in a dataset as points in a metric space and compute distances to group together similar…