Related papers: Approximating Clustering for Memory Management and…
With rapidly increasing data, clustering algorithms are important tools for data analytics in modern research. They have been successfully applied to a wide range of domains; for instance, bioinformatics, speech recognition, and financial…
Clustering is one of the most common unsupervised learning tasks in machine learning and data mining. Clustering algorithms have been used in a plethora of applications across several scientific fields. However, there has been limited…
Kernel-based clustering algorithms have the ability to capture the non-linear structure in real world data. Among various kernel-based clustering algorithms, kernel k-means has gained popularity due to its simple iterative nature and ease…
Clustering large, mixed data is a central problem in data mining. Many approaches adopt the idea of k-means, and hence are sensitive to initialisation, detect only spherical clusters, and require a priori the unknown number of clusters. We…
Time-series clustering serves as a powerful data mining technique for time-series data in the absence of prior knowledge about clusters. A large amount of time-series data with large size has been acquired and used in various research…
Mining Time Series data has a tremendous growth of interest in today's world. To provide an indication various implementations are studied and summarized to identify the different problems in existing applications. Clustering time series is…
The data stream model has been defined for new classes of applications involving massive data being generated at a fast pace. Web click stream analysis and detection of network intrusions are two examples. Cluster analysis on data streams…
We present methods for k-means clustering on a stream with a focus on providing fast responses to clustering queries. Compared to the current state-of-the-art, our methods provide substantial improvement in the query time for cluster…
Clustering is a widely used technique in data mining applications for discovering patterns in underlying data. Most traditional clustering algorithms are limited to handling datasets that contain either numeric or categorical attributes.…
Clustering is a core task in machine learning with wide-ranging applications in data mining and pattern recognition. However, its unsupervised nature makes it inherently challenging. Many existing clustering algorithms suffer from critical…
Measuring similarity between two objects is the core operation in existing clustering algorithms in grouping similar objects into clusters. This paper introduces a new similarity measure called point-set kernel which computes the similarity…
Coresets are compact representations of data sets such that models trained on a coreset are provably competitive with models trained on the full data set. As such, they have been successfully used to scale up clustering models to massive…
As data sets continue to grow in size and complexity, effective and efficient techniques are needed to target important features in the variable space. Many of the variable selection techniques that are commonly used alongside clustering…
Clustering is a common technique for statistical data analysis, which is used in many fields, including machine learning, data mining, pattern recognition, image analysis and bioinformatics. Clustering is the process of grouping similar…
Cluster analysis plays an important role in decision making process for many knowledge-based systems. There exist a wide variety of different approaches for clustering applications including the heuristic techniques, probabilistic models,…
Clustering algorithms are one of the main analytical methods to detect patterns in unlabeled data. Existing clustering methods typically treat samples in a dataset as points in a metric space and compute distances to group together similar…
As the data size in Machine Learning fields grows exponentially, it is inevitable to accelerate the computation by utilizing the ever-growing large number of available cores provided by high-performance computing hardware. However, existing…
Clustering algorithms aim to organize data into groups or clusters based on the inherent patterns and similarities within the data. They play an important role in today's life, such as in marketing and e-commerce, healthcare, data…
We propose a simple and efficient clustering method for high-dimensional data with a large number of clusters. Our algorithm achieves high-performance by evaluating distances of datapoints with a subset of the cluster centres. Our…
One basic requirement of many studies is the necessity of classifying data. Clustering is a proposed method for summarizing networks. Clustering methods can be divided into two categories named model-based approaches and algorithmic…