Related papers: A Clustering-based Framework for Classifying Data …

Data Stream Clustering: A Review

Number of connected devices is steadily increasing and these devices continuously generate data streams. Real-time processing of data streams is arousing interest despite many challenges. Clustering is one of the most suitable methods for…

Machine Learning · Computer Science 2020-07-22 Alaettin Zubaroğlu, Volkan Atalay

Clustering Categorical Data Streams

The data stream model has been defined for new classes of applications involving massive data being generated at a fast pace. Web click stream analysis and detection of network intrusions are two examples. Cluster analysis on data streams…

Databases · Computer Science 2007-05-23 Zengyou He, Xiaofei Xu, Shengchun Deng, Joshua Zhexue Huang

Active Weighted Aging Ensemble for Drifted Data Stream Classification

One of the significant problems of streaming data classification is the occurrence of concept drift, consisting of the change of probabilistic characteristics of the classification task. This phenomenon destabilizes the performance of the…

Machine Learning · Computer Science 2021-12-21 Michał Woźniak, Paweł Zyblewski, Paweł Ksieniewicz

Stream Clustering using Probabilistic Data Structures

Most density based stream clustering algorithms separate the clustering process into an online and offline component. Exact summarized statistics are being employed for defining micro-clusters or grid cells during the online stage followed…

Databases · Computer Science 2016-12-09 Andrei Sorin Sabau

Data Stream Clustering: Challenges and Issues

Very large databases are required to store massive amounts of data that are continuously inserted and queried. Analyzing huge data sets and extracting valuable pattern in many applications are interesting for researchers. We can identify…

Databases · Computer Science 2010-06-29 Madjid Khalilian, Norwati Mustapha

Efficient Dynamic Clustering: Capturing Patterns from Historical Cluster Evolution

Clustering aims to group unlabeled objects based on similarity inherent among them into clusters. It is important for many tasks such as anomaly detection, database sharding, record linkage, and others. Some clustering methods are taken as…

Databases · Computer Science 2024-12-02 Binbin Gu, Saeed Kargar, Faisal Nawab

Particle Clustering Machine: A Dynamical System Based Approach

Identification of the clusters from an unlabeled data set is one of the most important problems in Unsupervised Machine Learning. The state of the art clustering algorithms are based on either the statistical properties or the geometric…

Machine Learning · Computer Science 2018-01-04 Sambarta Dasgupta, Keivan Ebrahimi, Umesh Vaidya

Streaming Inference for Infinite Non-Stationary Clustering

Learning from a continuous stream of non-stationary data in an unsupervised manner is arguably one of the most common and most challenging settings facing intelligent agents. Here, we attack learning under all three conditions…

Machine Learning · Computer Science 2023-05-23 Rylan Schaeffer, Gabrielle Kaili-May Liu, Yilun Du, Scott Linderman, Ila Rani Fiete

Overview of streaming-data algorithms

Due to recent advances in data collection techniques, massive amounts of data are being collected at an extremely fast pace. Also, these data are potentially unbounded. Boundless streams of data collected from sensors, equipments, and other…

Databases · Computer Science 2012-03-12 T Soni Madhulatha

Model-based clustering of multiple networks with a hierarchical algorithm

The paper tackles the problem of clustering multiple networks, directed or not, that do not share the same set of vertices, into groups of networks with similar topology. A statistical model-based approach based on a finite mixture of…

Statistics Theory · Mathematics 2023-11-07 Tabea Rebafka

Fast Clustering of Short Text Streams Using Efficient Cluster Indexing and Dynamic Similarity Thresholds

Short text stream clustering is an important but challenging task since massive amount of text is generated from different sources such as micro-blogging, question-answering, and social news aggregation websites. One of the major challenges…

Information Retrieval · Computer Science 2021-01-22 Md Rashadul Hasan Rakib, Muhammad Asaduzzaman

Sampling in Dirichlet Process Mixture Models for Clustering Streaming Data

Practical tools for clustering streaming data must be fast enough to handle the arrival rate of the observations. Typically, they also must adapt on the fly to possible lack of stationarity; i.e., the data statistics may be time-dependent…

Machine Learning · Computer Science 2022-03-01 Or Dinari, Oren Freifeld

Clustering Mixed Numeric and Categorical Data: A Cluster Ensemble Approach

Clustering is a widely used technique in data mining applications for discovering patterns in underlying data. Most traditional clustering algorithms are limited to handling datasets that contain either numeric or categorical attributes.…

Artificial Intelligence · Computer Science 2007-05-23 Zengyou He, Xiaofei Xu, Shengchun Deng

Online Clustering by Penalized Weighted GMM

With the dawn of the Big Data era, data sets are growing rapidly. Data is streaming from everywhere - from cameras, mobile phones, cars, and other electronic devices. Clustering streaming data is a very challenging problem. Unlike the…

Machine Learning · Computer Science 2019-02-08 Shlomo Bugdary, Shay Maymon

CycleCluster: Modernising Clustering Regularisation for Deep Semi-Supervised Classification

Given the potential difficulties in obtaining large quantities of labelled data, many works have explored the use of deep semi-supervised learning, which uses both labelled and unlabelled data to train a neural network architecture. The…

Machine Learning · Computer Science 2021-09-02 Philip Sellars, Angelica Aviles-Rivero, Carola Bibiane Schönlieb

A novel online multi-label classifier for high-speed streaming data applications

In this paper, a high-speed online neural network classifier based on extreme learning machines for multi-label classification is proposed. In multi-label classification, each of the input data sample belongs to one or more than one of the…

Machine Learning · Computer Science 2016-09-06 Rajasekar Venkatesan, Meng Joo Er, Mihika Dave, Mahardhika Pratama, Shiqian Wu

Combining self-labeling and demand based active learning for non-stationary data streams

Learning from non-stationary data streams is a research direction that gains increasing interest as more data in form of streams becomes available, for example from social media, smartphones, or industrial process monitoring. Most…

Machine Learning · Computer Science 2023-02-09 Valerie Vaquet, Fabian Hinder, Johannes Brinkrolf, Barbara Hammer

Correlation Clustering in Data Streams

Clustering is a fundamental tool for analyzing large data sets. A rich body of work has been devoted to designing data-stream algorithms for the relevant optimization problems such as $k$-center, $k$-median, and $k$-means. Such algorithms…

Data Structures and Algorithms · Computer Science 2018-12-06 Kook Jin Ahn, Graham Cormode, Sudipto Guha, Andrew McGregor, Anthony Wirth

A Novel Online Real-time Classifier for Multi-label Data Streams

In this paper, a novel extreme learning machine based online multi-label classifier for real-time data streams is proposed. Multi-label classification is one of the actively researched machine learning paradigm that has gained much…

Machine Learning · Computer Science 2016-09-16 Rajasekar Venkatesan, Meng Joo Er, Shiqian Wu, Mahardhika Pratama

Incremental Learning with Concept Drift Detection and Prototype-based Embeddings for Graph Stream Classification

Data stream mining aims at extracting meaningful knowledge from continually evolving data streams, addressing the challenges posed by nonstationary environments, particularly, concept drift which refers to a change in the underlying data…

Machine Learning · Computer Science 2025-01-03 Kleanthis Malialis, Jin Li, Christos G. Panayiotou, Marios M. Polycarpou