Related papers: Document clustering using graph based document rep…

Document Clustering based on Topic Maps

Importance of document clustering is now widely acknowledged by researchers for better management, smart navigation, efficient filtering, and concise summarization of large collection of documents like World Wide Web (WWW). The next…

Information Retrieval · Computer Science 2011-12-30 Muhammad Rafi , M. Shahid Shaikh , Amir Farooq

A comparison of two suffix tree-based document clustering algorithms

Document clustering as an unsupervised approach extensively used to navigate, filter, summarize and manage large collection of document repositories like the World Wide Web (WWW). Recently, focuses in this domain shifted from traditional…

Information Retrieval · Computer Science 2012-01-11 Muhammad Rafi , M. Maujood , M. M. Fazal , S. M. Ali

Attributed Graph Clustering in Collaborative Settings

Graph clustering is an unsupervised machine learning method that partitions the nodes in a graph into different groups. Despite achieving significant progress in exploiting both attributed and structured data information, graph clustering…

Machine Learning · Computer Science 2025-01-03 Rui Zhang , Xiaoyang Hou , Zhihua Tian , Yan he , Enchao Gong , Jian Liu , Qingbiao Wu , Kui Ren

StruClus: Structural Clustering of Large-Scale Graph Databases

We present a structural clustering algorithm for large-scale datasets of small labeled graphs, utilizing a frequent subgraph sampling strategy. A set of representatives provides an intuitive description of each cluster, supports the…

Databases · Computer Science 2016-10-03 Till Schäfer , Petra Mutzel

Document clustering with evolved multiword search queries

Text clustering holds significant value across various domains due to its ability to identify patterns and group related information. Current approaches which rely heavily on a computed similarity measure between documents are often limited…

Information Retrieval · Computer Science 2025-04-09 Laurence Hirsch , Robin Hirsch , Bayode Ogunleye

A Survey of Deep Graph Clustering: Taxonomy, Challenge, Application, and Open Resource

Graph clustering, which aims to divide nodes in the graph into several distinct clusters, is a fundamental yet challenging task. Benefiting from the powerful representation capability of deep learning, deep graph clustering methods have…

Machine Learning · Computer Science 2023-09-13 Yue Liu , Jun Xia , Sihang Zhou , Xihong Yang , Ke Liang , Chenchen Fan , Yan Zhuang , Stan Z. Li , Xinwang Liu , Kunlun He

Search Result Clustering in Collaborative Sound Collections

The large size of nowadays' online multimedia databases makes retrieving their content a difficult and time-consuming task. Users of online sound collections typically submit search queries that express a broad intent, often making the…

Information Retrieval · Computer Science 2020-06-16 Xavier Favory , Frederic Font , Xavier Serra

Clustering Document Parts: Detecting and Characterizing Influence Campaigns from Documents

We propose a novel clustering pipeline to detect and characterize influence campaigns from documents. This approach clusters parts of document, detects clusters that likely reflect an influence campaign, and then identifies documents linked…

Computation and Language · Computer Science 2024-04-30 Zhengxiang Wang , Owen Rambow

Hierarchical Clustering with Structural Constraints

Hierarchical clustering is a popular unsupervised data analysis method. For many real-world applications, we would like to exploit prior information about the data that imposes constraints on the clustering hierarchy, and is not captured by…

Data Structures and Algorithms · Computer Science 2018-07-17 Vaggos Chatziafratis , Rad Niazadeh , Moses Charikar

Seeking the Truth Beyond the Data. An Unsupervised Machine Learning Approach

Clustering is an unsupervised machine learning methodology where unlabeled elements/objects are grouped together aiming to the construction of well-established clusters that their elements are classified according to their similarity. The…

Machine Learning · Statistics 2023-10-20 Dimitrios Saligkaras , Vasileios E. Papageorgiou

A Clustering-based Framework for Classifying Data Streams

The non-stationary nature of data streams strongly challenges traditional machine learning techniques. Although some solutions have been proposed to extend traditional machine learning techniques for handling data streams, these approaches…

Machine Learning · Computer Science 2021-06-23 Xuyang Yan , Abdollah Homaifar , Mrinmoy Sarkar , Abenezer Girma , Edward Tunstel

A Survey on optimization approaches to text document clustering

Text Document Clustering is one of the fastest growing research areas because of availability of huge amount of information in an electronic form. There are several number of techniques launched for clustering documents in such a way that…

Information Retrieval · Computer Science 2014-01-13 R. Jensi , Dr. G. Wiselin Jiji

Semi-Supervised Constrained Clustering: An In-Depth Overview, Ranked Taxonomy and Future Research Directions

Clustering is a well-known unsupervised machine learning approach capable of automatically grouping discrete sets of instances with similar characteristics. Constrained clustering is a semi-supervised extension to this process that can be…

Machine Learning · Computer Science 2023-03-02 Germán González-Almagro , Daniel Peralta , Eli De Poorter , José-Ramón Cano , Salvador García

Towards Clustering-friendly Representations: Subspace Clustering via Graph Filtering

Finding a suitable data representation for a specific task has been shown to be crucial in many applications. The success of subspace clustering depends on the assumption that the data can be separated into different subspaces. However,…

Computer Vision and Pattern Recognition · Computer Science 2021-06-21 Zhengrui Ma , Zhao Kang , Guangchun Luo , Ling Tian

Semantic Document Clustering on Named Entity Features

Keyword-based information processing has limitations due to simple treatment of words. In this paper, we introduce named entities as objectives into document clustering, which are the key elements defining document semantics and in many…

Information Retrieval · Computer Science 2018-07-23 Tru H. Cao , Vuong M. Ngo , Dung T. Hong , Tho T. Quan

An Analytical Approach to Document Clustering Based on Internal Criterion Function

Fast and high quality document clustering is an important task in organizing information, search engine results obtaining from user query, enhancing web crawling and information retrieval. With the large amount of data available and with a…

Information Retrieval · Computer Science 2010-03-11 Alok Ranjan , Harish Verma , Eatesh Kandpal , Joydip Dhar

Clustering-aware Graph Construction: A Joint Learning Perspective

Graph-based clustering methods have demonstrated the effectiveness in various applications. Generally, existing graph-based clustering methods first construct a graph to represent the input data and then partition it to generate the…

Machine Learning · Computer Science 2019-12-17 Yuheng Jia , Hui Liu , Junhui Hou , Sam Kwong

Clustering Unstructured Data (Flat Files) - An Implementation in Text Mining Tool

With the advancement of technology and reduced storage costs, individuals and organizations are tending towards the usage of electronic media for storing textual information and documents. It is time consuming for readers to retrieve…

Information Retrieval · Computer Science 2010-07-27 Yasir Safeer , Atika Mustafa , Anis Noor Ali

MRGSEM-Sum: An Unsupervised Multi-document Summarization Framework based on Multi-Relational Graphs and Structural Entropy Minimization

The core challenge faced by multi-document summarization is the complexity of relationships among documents and the presence of information redundancy. Graph clustering is an effective paradigm for addressing this issue, as it models the…

Computation and Language · Computer Science 2025-08-01 Yongbing Zhang , Fang Nan , Shengxiang Gao , Yuxin Huang , Kaiwen Tan , Zhengtao Yu

Constrained Hierarchical Clustering via Graph Coarsening and Optimal Cuts

Motivated by extracting and summarizing relevant information in short sentence settings, such as satisfaction questionnaires, hotel reviews, and X/Twitter, we study the problem of clustering words in a hierarchical fashion. In particular,…

Machine Learning · Computer Science 2023-12-08 Eliabelle Mauduit , Andrea Simonetto