Related papers: Clustering Without (Thinking About) Triangulation
A computational theory for clustering and a semi-supervised clustering algorithm are presented. Clustering is defined to be the obtainment of groupings of data such that each group contains no anomalies with respect to a chosen grouping…
Clustering is a fundamental problem in network analysis that finds closely connected groups of nodes and separates them from other nodes in the graph, while link prediction aims to predict whether two nodes in a network are likely to have a…
The determination of cluster centers generally depends on the scale that we use to analyze the data to be clustered. An inappropriate scale usually leads to unreasonable cluster centers and thus unreasonable results. In this study, we first…
Clustering is one of the fundamental tasks in data analytics and machine learning. In many situations, different clusterings of the same data set become relevant. For example, different algorithms for the same clustering task may return…
Clustering is the propensity of nodes that share a common neighbour to be connected. It is ubiquitous in many networks but poses many modelling challenges. Clustering typically manifests itself by a higher than expected frequency of…
We herein introduce a new method of interpretable clustering that uses unsupervised binary trees. It is a three-stage procedure, the first stage of which entails a series of recursive binary splits to reduce the heterogeneity of the data…
The paper deals with optimality issues in connection with updating beliefs in networks. We address two processes: triangulation and construction of junction trees. In the first part, we give a simple algorithm for constructing an optimal…
In an age of increasingly large data sets, investigators in many different disciplines have turned to clustering as a tool for data analysis and exploration. Existing clustering methods, however, typically depend on several nontrivial…
This paper describes a process for clustering concepts into chains from data presented randomly to an evaluating system. There are a number of rules or guidelines that help the system to determine more accurately what concepts belong to a…
Clustering is often used for discovering structure in data. Clustering systems differ in the objective function used to evaluate clustering quality and the control strategy used to search the space of clusterings. Ideally, the search…
Clustering is an unsupervised machine learning methodology in which unlabeled elements/objects are grouped together with the aim of constructing well-established clusters whose elements are classified according to their similarity. The…
We study the problem of explainability-first clustering where explainability becomes a first-class citizen for clustering. Previous clustering approaches use decision trees for explanation, but only after the clustering is completed. In…
We propose a model-based clustering algorithm for a general class of functional data for which the components could be curves or images. The random functional data realizations could be measured with error at discrete, and possibly random,…
In the data mining field many clustering methods have been proposed, yet standard versions do not take into account uncertain databases. This paper deals with a new approach to cluster uncertain data by using a hierarchical clustering…
Graph clustering groups entities -- the vertices of a graph -- based on their similarity, typically using a complex distance function over a large number of features. Successful integration of clustering approaches in automated…
Networks (or graphs) appear as dominant structures in diverse domains, including sociology, biology, neuroscience and computer science. In most of the aforementioned cases graphs are directed, in the sense that there is directionality on…
Fair clustering has gained increasing attention in recent years, especially in applications involving socially sensitive attributes. However, existing fair clustering methods often lack interpretability, limiting their applicability in…
The community structure of complex networks reveals both their organization and hidden relationships among their constituents. Most community detection methods currently available are not deterministic, and their results typically depend on…
Statistical significance of network clustering has been an unresolved problem since it was observed that community detection algorithms produce false positives even in random graphs. After a phase transition between undetectable and…