Related papers: Proposition-Level Clustering for Multi-Document Su…

Mining both Commonality and Specificity from Multiple Documents for Multi-Document Summarization

The multi-document summarization task requires the designed summarizer to generate a short text that covers the important information of original documents and satisfies content diversity. This paper proposes a multi-document summarization…

Computation and Language · Computer Science 2023-03-07 Bing Ma

Multi-document Summarization via Deep Learning Techniques: A Survey

Multi-document summarization (MDS) is an effective tool for information aggregation that generates an informative and concise summary from a cluster of topic-related documents. Our survey, the first of its kind, systematically overviews the…

Computation and Language · Computer Science 2021-12-10 Congbo Ma, Wei Emma Zhang, Mingyu Guo, Hu Wang, Quan Z. Sheng

CoRank: A clustering cum graph ranking approach for extractive summarization

Online information has increased tremendously in today's age of the Internet. As a result, the need has arisen to extract relevant content from the plethora of available information. Researchers are widely using automatic text summarization…

Social and Information Networks · Computer Science 2021-06-02 Mohd Khizir Siddiqui, Amreen Ahmad, Om Pal, Tanvir Ahmad

A Survey on optimization approaches to text document clustering

Text document clustering is one of the fastest-growing research areas because of the availability of a huge amount of information in electronic form. A number of techniques have been introduced for clustering documents in such a way that…

Information Retrieval · Computer Science 2014-01-13 R. Jensi, Dr. G. Wiselin Jiji

JADS: A Framework for Self-supervised Joint Aspect Discovery and Summarization

To generate summaries that include multiple aspects or topics for text documents, most approaches use clustering or topic modeling to group relevant sentences and then generate a summary for each group. These approaches struggle to optimize…

Artificial Intelligence · Computer Science 2024-05-30 Xiaobo Guo, Jay Desai, Srinivasan H. Sengamedu

Information Retrieval in long documents: Word clustering approach for improving Semantics

In this paper, we propose an alternative to deep neural networks for semantic information retrieval for the case of long documents. This new approach exploiting clustering techniques to take into account the meaning of words in Information…

Information Retrieval · Computer Science 2025-07-29 Paul Mbathe Mekontchou, Armel Fotsoh, Bernabe Batchakui, Eddy Ella

Extractive Multi-document Summarization Using Multilayer Networks

Huge volumes of textual information are produced every single day. In order to organize and understand such large datasets, summarization techniques have become popular in recent years. These techniques aim at finding relevant,…

Computation and Language · Computer Science 2018-03-26 Jorge V. Tohalino, Diego R. Amancio

SummPip: Unsupervised Multi-Document Summarization with Sentence Graph Compression

Obtaining training data for multi-document summarization (MDS) is time consuming and resource-intensive, so recent neural models can only be trained for limited domains. In this paper, we propose SummPip: an unsupervised method for…

Computation and Language · Computer Science 2020-07-21 Jinming Zhao, Ming Liu, Longxiang Gao, Yuan Jin, Lan Du, He Zhao, He Zhang, Gholamreza Haffari

Document Clustering using K-Means and K-Medoids

With the huge upsurge of information in day-to-day life, it has become difficult to assemble relevant information in the nick of time. But people are always short of time and need everything quickly. Hence clustering was introduced to…

Information Retrieval · Computer Science 2015-03-02 Rakesh Chandra Balabantaray, Chandrali Sarma, Monica Jha

Resampling methods for document clustering

We compare the performance of different clustering algorithms applied to the task of unsupervised text categorization. We consider agglomerative clustering algorithms, principal direction divisive partitioning and (for the first time)…

Disordered Systems and Neural Networks · Physics 2007-05-23 D. Volk, M. G. Stepanov

Document Clustering based on Topic Maps

The importance of document clustering is now widely acknowledged by researchers for better management, smart navigation, efficient filtering, and concise summarization of large collections of documents like the World Wide Web (WWW). The next…

Information Retrieval · Computer Science 2011-12-30 Muhammad Rafi, M. Shahid Shaikh, Amir Farooq

MRGSEM-Sum: An Unsupervised Multi-document Summarization Framework based on Multi-Relational Graphs and Structural Entropy Minimization

The core challenge faced by multi-document summarization is the complexity of relationships among documents and the presence of information redundancy. Graph clustering is an effective paradigm for addressing this issue, as it models the…

Computation and Language · Computer Science 2025-08-01 Yongbing Zhang, Fang Nan, Shengxiang Gao, Yuxin Huang, Kaiwen Tan, Zhengtao Yu

SgSum: Transforming Multi-document Summarization into Sub-graph Selection

Most existing extractive multi-document summarization (MDS) methods score each sentence individually and extract salient sentences one by one to compose a summary, which has two main drawbacks: (1) neglecting both the intra and…

Computation and Language · Computer Science 2021-10-26 Moye Chen, Wei Li, Jiachen Liu, Xinyan Xiao, Hua Wu, Haifeng Wang

Generating a Structured Summary of Numerous Academic Papers: Dataset and Method

Writing a survey paper on one research topic usually needs to cover the salient content from numerous related papers, which can be modeled as a multi-document summarization (MDS) task. Existing MDS datasets usually focus on producing the…

Computation and Language · Computer Science 2023-02-10 Shuaiqi Liu, Jiannong Cao, Ruosong Yang, Zhiyuan Wen

Neural Text Classification by Jointly Learning to Cluster and Align

Distributional text clustering delivers semantically informative representations and captures the relevance between each word and semantic clustering centroids. We extend the neural text clustering approach to text classification tasks by…

Computation and Language · Computer Science 2020-11-25 Yekun Chai, Haidong Zhang, Shuo Jin

Document clustering with evolved multiword search queries

Text clustering holds significant value across various domains due to its ability to identify patterns and group related information. Current approaches which rely heavily on a computed similarity measure between documents are often limited…

Information Retrieval · Computer Science 2025-04-09 Laurence Hirsch, Robin Hirsch, Bayode Ogunleye

Markov-Enhanced Clustering for Long Document Summarization: Tackling the 'Lost in the Middle' Challenge with Large Language Models

The rapid expansion of information from diverse sources has heightened the need for effective automatic text summarization, which condenses documents into shorter, coherent texts. Summarization methods generally fall into two categories:…

Computation and Language · Computer Science 2025-06-24 Aziz Amari, Mohamed Achref Ben Ammar

A comparison of two suffix tree-based document clustering algorithms

Document clustering is an unsupervised approach extensively used to navigate, filter, summarize and manage large collections of documents like the World Wide Web (WWW). Recently, the focus in this domain has shifted from traditional…

Information Retrieval · Computer Science 2012-01-11 Muhammad Rafi, M. Maujood, M. M. Fazal, S. M. Ali

Multi-document abstractive summarization using ILP based multi-sentence compression

Abstractive summarization is an ideal form of summarization since it can synthesize information from multiple documents to create concise informative summaries. In this work, we aim at developing an abstractive summarizer. First, our…

Computation and Language · Computer Science 2016-09-23 Siddhartha Banerjee, Prasenjit Mitra, Kazunari Sugiyama

Improving Multi-Document Summarization via Text Classification

Developed so far, multi-document summarization has reached its bottleneck due to the lack of sufficient training data and diverse categories of documents. Text classification just makes up for these deficiencies. In this paper, we propose a…

Computation and Language · Computer Science 2016-11-29 Ziqiang Cao, Wenjie Li, Sujian Li, Furu Wei