Related papers: Clustering, Encoding and Diameter Computation Algo…
Clustering high-dimensional datasets is hard because interpoint distances become less informative in high-dimensional spaces. We present a clustering algorithm that performs nonlinear dimensionality reduction and clustering jointly. The…
The problem of dimension reduction is of increasing importance in modern data analysis. In this paper, we consider modeling the collection of points in a high dimensional space as a union of low dimensional subspaces. In particular we…
Cluster analysis faces two problems in high dimensions: first, the `curse of dimensionality' that can lead to overfitting and poor generalization performance; and second, the sheer time taken for conventional algorithms to process large…
With rapidly increasing data, clustering algorithms are important tools for data analytics in modern research. They have been successfully applied to a wide range of domains; for instance, bioinformatics, speech recognition, and financial…
In many real-world problems, we are dealing with collections of high-dimensional data, such as images, videos, text and web documents, DNA microarray data, and more. Often, high-dimensional data lie close to low-dimensional structures…
Clustering is a widely used technique in data mining applications for discovering patterns in underlying data. Most traditional clustering algorithms are limited to handling datasets that contain either numeric or categorical attributes.…
The high dimensionality of hyperspectral images often results in the degradation of clustering performance. Due to the powerful ability of deep feature extraction and non-linear feature representation, the clustering algorithm based on deep…
A hierarchical scheme for clustering data is presented which applies to spaces with a high number of dimension ($N_{_{D}}>3$). The data set is first reduced to a smaller set of partitions (multi-dimensional bins). Multiple clustering…
Clustering is a crucial tool for analyzing data in virtually every scientific and engineering discipline. There are more scalable solutions framed to enable time and space clustering for the future large-scale data analyses. As a result,…
We consider the problem of clustering a set of high-dimensional data points into sets of low-dimensional linear subspaces. The number of subspaces, their dimensions, and their orientations are unknown. We propose a simple and low-complexity…
In modern day industry, clustering algorithms are daily routines of algorithm engineers. Although clustering algorithms experienced rapid growth before 2010. Innovation related to the research topic has stagnated after deep learning became…
Clustering is one of the most common unsupervised learning tasks in machine learning and data mining. Clustering algorithms have been used in a plethora of applications across several scientific fields. However, there has been limited…
Clustering algorithms aim to organize data into groups or clusters based on the inherent patterns and similarities within the data. They play an important role in today's life, such as in marketing and e-commerce, healthcare, data…
Clustering large, mixed data is a central problem in data mining. Many approaches adopt the idea of k-means, and hence are sensitive to initialisation, detect only spherical clusters, and require a priori the unknown number of clusters. We…
In this paper we construct multidimensional codes with high dimension. The codes can correct high dimensional errors which have the form of either small clusters, or confined to an area with a small radius. We also consider small number of…
Several methods of triclustering of three dimensional data require the specification of the cluster size in each dimension. This introduces a certain degree of arbitrariness. To address this issue, we propose a new method, namely the…
Clustering is an important part of many modern data analysis pipelines, including network analysis and data retrieval. There are many different clustering algorithms developed by various communities, and it is often not clear which…
New algorithms are devised for finding the maxima of multidimensional point samples, one of the very first problems studied in computational geometry. The algorithms are very simple and easily coded and modified for practical needs. The…
Clustering is an unsupervised learning problem that aims to partition unlabelled data points into groups with similar features. Traditional clustering algorithms provide limited insight into the groups they find as their main focus is…
Subspace clustering refers to the problem of clustering high-dimensional data points into a union of low-dimensional linear subspaces, where the number of subspaces, their dimensions and orientations are all unknown. In this paper, we…