
Related papers: GraphTMT: Unsupervised Graph-based Topic Modeling …

200 papers

Extracting topics from text has become an essential task, especially with the rapid growth of unstructured textual data. Most existing works rely on highly computational methods to address this challenge. In this paper, we argue that…

Computation and Language · Computer Science 2025-11-07 Salma Mekaoui, Hiba Sofyan, Imane Amaaz, Imane Benchrif, Arsalane Zarghili, Ilham Chaker, Nikola S. Nikolov

Topic modeling is a powerful technique to discover hidden topics and patterns within a collection of documents without prior knowledge. Traditional topic modeling and clustering-based techniques encounter challenges in capturing contextual…

Computation and Language · Computer Science 2024-10-04 Melkamu Abay Mersha, Mesay Gemeda yigezu, Jugal Kalita

Production of news content is growing at an astonishing rate. To help manage and monitor the sheer amount of text, there is an increasing need to develop efficient methods that can provide insights into emerging content areas, and stratify…

Computation and Language · Computer Science 2020-10-29 M. Tarik Altuncu, Sophia N. Yaliraki, Mauricio Barahona

Topic models aim to reveal latent structures within a corpus of text, typically through the use of term-frequency statistics over bag-of-words representations from documents. In recent years, conceptual entities -- interpretable,…

Computation and Language · Computer Science 2024-08-27 Manuel V. Loureiro, Steven Derby, Tri Kurniawan Wijaya

It has been reported that clustering-based topic models, which cluster high-quality sentence embeddings with an appropriate word selection method, can generate better topics than generative probabilistic topic models. However, these…

Computation and Language · Computer Science 2023-06-07 Leihang Zhang, Jiapeng Liu, Qiang Yan

We propose an unsupervised keyphrase extraction model that encodes topical information within a multipartite graph structure. Our model represents keyphrase candidates and topics in a single graph and exploits their mutually reinforcing…

Information Retrieval · Computer Science 2018-04-17 Florian Boudin

Text summarization aims to compress a textual document to a short summary while keeping salient information. Extractive approaches are widely used in text summarization because of their fluency and efficiency. However, most of existing…

Computation and Language · Computer Science 2020-10-14 Peng Cui, Le Hu, Yuanchao Liu

Learning hidden topics from data streams has become absolutely necessary but posed challenging problems such as concept drift as well as short and noisy data. Using prior knowledge to enrich a topic model is one of potential solutions to…

Machine Learning · Computer Science 2021-12-28 Ngo Van Linh, Tran Xuan Bach, Khoat Than

Real time nature of social networks with bursty short messages and their respective large data scale spread among vast variety of topics are research interest of many researchers. These properties of social networks which are known as 5'Vs…

Computation and Language · Computer Science 2021-08-05 Meysam Asgari-Chenaghlu, Mohammad-Reza Feizi-Derakhshi, Leili farzinvash, Mohammad-Ali Balafar, Cina Motamed

Topic modeling is a key component in unsupervised learning, employed to identify topics within a corpus of textual data. The rapid growth of social media generates an ever-growing volume of textual data daily, making online topic modeling…

Machine Learning · Computer Science 2025-10-23 Federica Granese, Benjamin Navet, Serena Villata, Charles Bouveyron

Existing deep hierarchical topic models are able to extract semantically meaningful topics from a text corpus in an unsupervised manner and automatically organize them into a topic hierarchy. However, it is unclear how to incorporate prior…

Machine Learning · Computer Science 2021-10-28 Zhibin Duan, Yishi Xu, Bo Chen, Dongsheng Wang, Chaojie Wang, Mingyuan Zhou

Recent work incorporates pre-trained word embeddings such as BERT embeddings into Neural Topic Models (NTMs), generating highly coherent topics. However, with high-quality contextualized document representations, do we really need…

Computation and Language · Computer Science 2022-04-22 Zihan Zhang, Meng Fang, Ling Chen, Mohammad-Reza Namazi-Rad

Generating video descriptions in natural language (a.k.a. video captioning) is a more challenging task than image captioning as the videos are intrinsically more complicated than images in two aspects. First, videos cover a broader range of…

Computer Vision and Pattern Recognition · Computer Science 2017-09-05 Shizhe Chen, Jia Chen, Qin Jin

Topic models are frequently used in machine learning owing to their high interpretability and modular structure. However, extending a topic model to include a supervisory signal, to incorporate pre-trained word embedding vectors and to…

Machine Learning · Statistics 2019-09-17 Ryohei Hisano

Automatic Text Summarization strategies have been successfully employed to digest text collections and extract its essential content. Usually, summaries are generated using textual corpora that belongs to the same domain area where the…

Computation and Language · Computer Science 2018-07-03 Vinicius Woloszyn, Guilherme Medeiros Machado, Leandro Krug Wives, José Palazzo Moreira de Oliveira

Topic modelling is a pivotal unsupervised machine learning technique for extracting valuable insights from large document collections. Existing neural topic modelling methods often encode contextual information of documents, while ignoring…

Computation and Language · Computer Science 2025-02-07 Yanan Ma, Chenghao Xiao, Chenhan Yuan, Sabine N van der Veer, Lamiece Hassan, Chenghua Lin, Goran Nenadic

Due to the success of the pre-trained language model (PLM), existing PLM-based summarization models show their powerful generative capability. However, these models are trained on general-purpose summarization datasets, leading to generated…

Computation and Language · Computer Science 2023-02-28 Shi Zesheng, Zhou Yucheng

Extractive text summarization aims at extracting the most representative sentences from a given document as its summary. To extract a good summary from a long text document, sentence embedding plays an important role. Recent studies have…

Computation and Language · Computer Science 2021-09-10 Baoyu Jing, Zeyu You, Tao Yang, Wei Fan, Hanghang Tong

Topic modeling is an unsupervised method for revealing the hidden semantic structure of a corpus. It has been increasingly widely adopted as a tool in the social sciences, including political science, digital humanities and sociological…

Information Retrieval · Computer Science 2022-01-12 Zheng Fang, Yulan He, Rob Procter

Existing graph-based methods for extractive document summarization represent sentences of a corpus as the nodes of a graph or a hypergraph in which edges depict relationships of lexical similarity between sentences. Such approaches fail to…

Computation and Language · Computer Science 2019-06-25 Hadrien Van Lierde, Tommy W. S. Chow