Related papers: Solving clustering as ill-posed problem: experimen…
Learning augmented is a machine learning concept built to improve the performance of a method or model, such as enhancing its ability to predict and generalize data or features, or testing the reliability of the method by introducing noise…
Principal Component Analysis (PCA) and K-means constitute fundamental techniques in multivariate analysis. Although they are frequently applied independently or sequentially to cluster observations, the relationship between them, especially…
We consider clustering in group decision making where the opinions are given by pairwise comparison matrices. In particular, the k-medoids model is suggested to classify the matrices since it has a linear programming problem formulation…
The k-means algorithm is a partitional clustering method. Over 60 years old, it has been successfully used for a variety of problems. The popularity of k-means is in large part a consequence of its simplicity and efficiency. In this paper…
Principal component analysis (PCA) is a widespread technique for data analysis that relies on the covariance-correlation matrix of the analyzed data. However, to work properly with high-dimensional data, PCA poses severe mathematical…
In this work, the possibility of clustering correlated random variables was examined, both because of their mutual similarity and because of their similarity to the principal components. The k-means algorithm and spectral algorithms were…
Clustering is an NP-hard problem. Because no efficient exact algorithm is known, heuristics are applied to cluster the data. Heuristics can be very resource-intensive if not applied properly. For substantially large data sets computational efficiencies…
We develop and analyze a method to reduce the size of a very large set of data points in a high-dimensional Euclidean space R^d to a small set of weighted points such that the result of a predetermined data analysis task on the reduced set…
In this paper, we study the application of sparse principal component analysis (PCA) to clustering and feature selection problems. Sparse PCA seeks sparse factors, or linear combinations of the data variables, explaining a maximum amount of…
Among many clustering algorithms, the K-means clustering algorithm is widely used because of its simplicity and fast convergence. However, this algorithm suffers from incomplete data, where some samples are missing some of their…
Fast and high-quality document clustering is an important task in organizing information, presenting search engine results obtained from user queries, and enhancing web crawling and information retrieval. With the large amount of data available and with a…
Clustering is a common unsupervised machine learning technique for grouping data points based on similar features. We focus here on unsupervised clustering for contaminated data, i.e., in the case where K-medians should be…
Recent work on deep clustering has found promising new methods for constrained clustering problems as well. Their typically pairwise constraints can often be used to guide the partitioning of the data. Many problems, however, feature…
The popular K-means clustering algorithm potentially suffers from a major weakness for further analysis or interpretation. Some clusters may have disproportionately more (or fewer) points from one of the subpopulations in terms of some…
Though mostly used as a clustering algorithm, k-means was originally designed as a quantization algorithm. Namely, it aims at providing a compression of a probability distribution with k points. Building upon [21, 33], we try to investigate…
The analysis of continuously larger datasets is a task of major importance in a wide variety of scientific fields. In this sense, cluster analysis algorithms are a key element of exploratory data analysis, due to their ease in the…
K-means plays a vital role in data mining and is the simplest and most widely used algorithm under the Euclidean Minimum Sum-of-Squares Clustering (MSSC) model. However, its performance drastically drops when applied to vast amounts of…
Clustering is an effective technique in data mining for generating groups of interest. Among the various clustering approaches, the families of k-means algorithms and min-cut algorithms have gained the most popularity due to their…
Clustering is the separation of data into groups of similar objects. Every group, called a cluster, consists of objects that are similar to one another and dissimilar to objects of other groups. In this paper, the K-Means algorithm is implemented…
K-Means clustering still plays an important role in many computer vision problems. While the conventional Lloyd method, which alternates between centroid update and cluster assignment, is primarily used in practice, it may converge to a…