Related papers: Classification dynamique d'un flux documentaire : …

Document stream clustering: experimenting an incremental algorithm and AR-based tools for highlighting dynamic trends

We address here two major challenges presented by dynamic data mining: 1) the stability challenge: we have implemented a rigorous incremental density-based clustering algorithm, independent from any initial conditions and ordering of the…

Artificial Intelligence · Computer Science 2008-11-04 Alain Lelu, Martine Cadot, Pascal Cuxac

A Clustering-based Framework for Classifying Data Streams

The non-stationary nature of data streams strongly challenges traditional machine learning techniques. Although some solutions have been proposed to extend traditional machine learning techniques for handling data streams, these approaches…

Machine Learning · Computer Science 2021-06-23 Xuyang Yan, Abdollah Homaifar, Mrinmoy Sarkar, Abenezer Girma, Edward Tunstel

StruClus: Structural Clustering of Large-Scale Graph Databases

We present a structural clustering algorithm for large-scale datasets of small labeled graphs, utilizing a frequent subgraph sampling strategy. A set of representatives provides an intuitive description of each cluster, supports the…

Databases · Computer Science 2016-10-03 Till Schäfer, Petra Mutzel

Stream Clustering using Probabilistic Data Structures

Most density based stream clustering algorithms separate the clustering process into an online and offline component. Exact summarized statistics are being employed for defining micro-clusters or grid cells during the online stage followed…

Databases · Computer Science 2016-12-09 Andrei Sorin Sabau

Incremental Gaussian Mixture Clustering for Data Streams

The problem of analyzing data streams of very large volumes is important and is very desirable for many application domains. In this paper we present and demonstrate effective working of an algorithm to find clusters and anomalous data…

Machine Learning · Computer Science 2025-03-25 Aniket Bhanderi, Raj Bhatnagar

Improved Multi-objective Data Stream Clustering with Time and Memory Optimization

The analysis of data streams has received considerable attention over the past few decades due to sensors, social media, etc. It aims to recognize patterns in an unordered, infinite, and evolving stream of observations. Clustering this type…

Machine Learning · Computer Science 2022-01-14 Mohammed Oualid Attaoui, Hanene Azzag, Mustapha Lebbah, Nabil Keskes

Clustering Categorical Data Streams

The data stream model has been defined for new classes of applications involving massive data being generated at a fast pace. Web click stream analysis and detection of network intrusions are two examples. Cluster analysis on data streams…

Databases · Computer Science 2007-05-23 Zengyou He, Xiaofei Xu, Shengchun Deng, Joshua Zhexue Huang

Clustering Dynamic Web Usage Data

Most classification methods are based on the assumption that data conforms to a stationary distribution. The machine learning domain currently suffers from a lack of classification techniques that are able to detect the occurrence of a…

Machine Learning · Statistics 2012-01-05 Alzennyr Da Silva, Yves Lechevallier, Fabrice Rossi, Francisco De A. T. De Carvahlo

Efficient Large Scale Clustering based on Data Partitioning

Clustering techniques are very attractive for extracting and identifying patterns in datasets. However, their application to very large spatial datasets presents numerous challenges such as high-dimensionality data, heterogeneity, and high…

Databases · Computer Science 2018-02-27 Malika Bendechache, Nhien-An Le-Khac, M-Tahar Kechadi

A functional clustering algorithm for the analysis of dynamic network data

We formulate a novel technique for the detection of functional clusters in discrete event data. The advantage of this algorithm is that no prior knowledge of the number of functional groups is needed, as our procedure progressively combines…

Neurons and Cognition · Quantitative Biology 2015-05-13 S. Feldt, J. Waddell, V. L. Hetrick, J. D. Berke, M. Zochowski

An Analytical Approach to Document Clustering Based on Internal Criterion Function

Fast and high quality document clustering is an important task in organizing information, search engine results obtaining from user query, enhancing web crawling and information retrieval. With the large amount of data available and with a…

Information Retrieval · Computer Science 2010-03-11 Alok Ranjan, Harish Verma, Eatesh Kandpal, Joydip Dhar

Overview of streaming-data algorithms

Due to recent advances in data collection techniques, massive amounts of data are being collected at an extremely fast pace. Also, these data are potentially unbounded. Boundless streams of data collected from sensors, equipments, and other…

Databases · Computer Science 2012-03-12 T Soni Madhulatha

Correlation Clustering in Data Streams

Clustering is a fundamental tool for analyzing large data sets. A rich body of work has been devoted to designing data-stream algorithms for the relevant optimization problems such as $k$-center, $k$-median, and $k$-means. Such algorithms…

Data Structures and Algorithms · Computer Science 2018-12-06 Kook Jin Ahn, Graham Cormode, Sudipto Guha, Andrew McGregor, Anthony Wirth

Clustering evolving data using kernel-based methods

In this thesis, we propose several modelling strategies to tackle evolving data in different contexts. In the framework of static clustering, we start by introducing a soft kernel spectral clustering (SKSC) algorithm, which can better deal…

Social and Information Networks · Computer Science 2014-11-24 Rocco Langone

Ranking and benchmarking framework for sampling algorithms on synthetic data streams

In the fields of big data, AI, and streaming processing, we work with large amounts of data from multiple sources. Due to memory and network limitations, we process data streams on distributed systems to alleviate computational and network…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-06-18 József Dániel Gáspár, Martin Horváth, Győző Horváth, Zoltán Zvara

Clustering-based Partitioning for Large Web Graphs

Graph partitioning plays a vital role in distributed large-scale web graph analytics, such as pagerank and label propagation. The quality and scalability of partitioning strategy have a strong impact on such communication- and…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-01-04 Deyu Kong, Xike Xie, Zhuoxu Zhang

Data Stream Clustering: A Review

Number of connected devices is steadily increasing and these devices continuously generate data streams. Real-time processing of data streams is arousing interest despite many challenges. Clustering is one of the most suitable methods for…

Machine Learning · Computer Science 2020-07-22 Alaettin Zubaroğlu, Volkan Atalay

On Graph Stream Clustering with Side Information

Graph clustering becomes an important problem due to emerging applications involving the web, social networks and bio-informatics. Recently, many such applications generate data in the form of streams. Clustering massive, dynamic graph…

Databases · Computer Science 2013-01-30 Yuchen Zhao, Philip S. Yu

A sampling-based approach for efficient clustering in large datasets

We propose a simple and efficient clustering method for high-dimensional data with a large number of clusters. Our algorithm achieves high-performance by evaluating distances of datapoints with a subset of the cluster centres. Our…

Machine Learning · Computer Science 2022-03-30 Georgios Exarchakis, Omar Oubari, Gregor Lenz

Clustering by latent dimensions

This paper introduces a new clustering technique, called dimensional clustering, which clusters each data point by its latent pointwise dimension, which is a measure of the dimensionality of the data set local to that point.…

Machine Learning · Statistics 2018-05-29 Shohei Hidaka, Neeraj Kashyap