Related papers: Document stream clustering: experimenting an incre…
Very large databases are required to store massive amounts of data that are continuously inserted and queried. Analyzing huge data sets and extracting valuable pattern in many applications are interesting for researchers. We can identify…
Number of connected devices is steadily increasing and these devices continuously generate data streams. Real-time processing of data streams is arousing interest despite many challenges. Clustering is one of the most suitable methods for…
A text stream is an ordered sequence of text documents generated over time. A massive amount of such text data is generated by online social platforms every day. Designing an algorithm for such text streams to extract useful information is…
The non-stationary nature of data streams strongly challenges traditional machine learning techniques. Although some solutions have been proposed to extend traditional machine learning techniques for handling data streams, these approaches…
A growing number of applications that generate massive streams of data need intelligent data processing and online analysis. Real-time surveillance systems, telecommunication systems, sensor networks and other dynamic environments are such…
Data-stream clustering is an ever-expanding subdomain of knowledge extraction. Most of the past and present research effort aims at efficient scaling up for the huge data repositories. Our approach focuses on qualitative improvement, mainly…
Due to recent advances in data collection techniques, massive amounts of data are being collected at an extremely fast pace. Also, these data are potentially unbounded. Boundless streams of data collected from sensors, equipments, and other…
Clustering is a fundamental tool for analyzing large data sets. A rich body of work has been devoted to designing data-stream algorithms for the relevant optimization problems such as $k$-center, $k$-median, and $k$-means. Such algorithms…
The growth in Internet usage has contributed to a large volume of continuously available data, and has created the need for automatic and efficient organization of the data. In this context, text clustering techniques are significant…
Mining Time Series data has a tremendous growth of interest in today's world. To provide an indication various implementations are studied and summarized to identify the different problems in existing applications. Clustering time series is…
The data stream model has been defined for new classes of applications involving massive data being generated at a fast pace. Web click stream analysis and detection of network intrusions are two examples. Cluster analysis on data streams…
The problem of analyzing data streams of very large volumes is important and is very desirable for many application domains. In this paper we present and demonstrate effective working of an algorithm to find clusters and anomalous data…
This paper presents a model for a dynamical system where particles dominate edges in a complex network. The proposed dynamical system is then extended to an application on the problem of community detection and data clustering. In the case…
This paper presents an evolutionary algorithm for modeling the arrival dates of document streams, which is any time-stamped collection of documents, such as newscasts, e-mails, IRC conversations, scientific journals archives and weblog…
Fast and high quality document clustering is an important task in organizing information, search engine results obtaining from user query, enhancing web crawling and information retrieval. With the large amount of data available and with a…
Stream clustering is a fundamental problem in many streaming data analysis applications. Comparing to classical batch-mode clustering, there are two key challenges in stream clustering: (i) Given that input data are changing continuously,…
Clustering aims to group unlabeled objects based on similarity inherent among them into clusters. It is important for many tasks such as anomaly detection, database sharding, record linkage, and others. Some clustering methods are taken as…
Clustering is a widely used unsupervised learning method for finding structure in the data. However, the resulting clusters are typically presented without any guarantees on their robustness; slightly changing the used data sample or…
Data stream algorithms tackle operations on high-volume sequences of read-once data items. Data stream scenarios include inherently real-time systems like sensor networks and financial markets. They also arise in purely-computational…
Importance of document clustering is now widely acknowledged by researchers for better management, smart navigation, efficient filtering, and concise summarization of large collection of documents like World Wide Web (WWW). The next…