Related papers: Ordering-sensitive and Semantic-aware Topic Modeli…

Multivariate Gaussian Topic Modelling: A novel approach to discover topics with greater semantic coherence

An important aspect of text mining involves information retrieval in form of discovery of semantic themes (topics) from documents using topic modelling. While generative topic models like Latent Dirichlet Allocation (LDA) or Latent Semantic…

Machine Learning · Computer Science 2025-11-04 Satyajeet Sahoo, Jhareswar Maiti

Gaussian Mixture Embeddings for Multiple Word Prototypes

Recently, word representation has been increasingly focused on for its excellent properties in representing the word semantics. Previous works mainly suffer from the problem of polysemy phenomenon. To address this problem, most of previous…

Computation and Language · Computer Science 2015-11-20 Xinchi Chen, Xipeng Qiu, Jingxiang Jiang, Xuanjing Huang

Neural Topic Modeling by Incorporating Document Relationship Graph

Graph Neural Networks (GNNs) that capture the relationships between graph nodes via message passing have been a hot research direction in the natural language processing community. In this paper, we propose Graph Topic Model (GTM), a GNN…

Computation and Language · Computer Science 2020-09-30 Deyu Zhou, Xuemeng Hu, Rui Wang

A Neural Generative Model for Joint Learning Topics and Topic-Specific Word Embeddings

We propose a novel generative model to explore both local and global context for joint learning topics and topic-specific word embeddings. In particular, we assume that global latent topics are shared across documents, a word is generated…

Computation and Language · Computer Science 2020-08-12 Lixing Zhu, Yulan He, Deyu Zhou

Variational Gaussian Topic Model with Invertible Neural Projections

Neural topic models have triggered a surge of interest in extracting topics from text automatically since they avoid the sophisticated derivations in conventional topic models. However, scarce neural topic models incorporate the word…

Artificial Intelligence · Computer Science 2021-05-24 Rui Wang, Deyu Zhou, Yuxuan Xiong, Haiping Huang

TAN-NTM: Topic Attention Networks for Neural Topic Modeling

Topic models have been widely used to learn text representations and gain insight into document corpora. To perform topic discovery, most existing neural models either take document bag-of-words (BoW) or sequence of tokens as input followed…

Computation and Language · Computer Science 2021-07-12 Madhur Panwar, Shashank Shailabh, Milan Aggarwal, Balaji Krishnamurthy

Gaussian Process Topic Models

We introduce Gaussian Process Topic Models (GPTMs), a new family of topic models which can leverage a kernel among documents while extracting correlated topics. GPTMs can be considered a systematic generalization of the Correlated Topic…

Machine Learning · Computer Science 2012-03-19 Amrudin Agovic, Arindam Banerjee

LTSG: Latent Topical Skip-Gram for Mutually Learning Topic Model and Vector Representations

Topic models have been widely used in discovering latent topics which are shared across documents in text mining. Vector representations, word embeddings and topic embeddings, map words and topics into a low-dimensional and dense real-value…

Computation and Language · Computer Science 2017-02-24 Jarvan Law, Hankz Hankui Zhuo, Junhua He, Erhu Rong

Context Reinforced Neural Topic Modeling over Short Texts

As one of the prevalent topic mining tools, neural topic modeling has attracted a lot of interests for the advantages of high efficiency in training and strong generalisation abilities. However, due to the lack of context in each short…

Information Retrieval · Computer Science 2020-08-12 Jiachun Feng, Zusheng Zhang, Cheng Ding, Yanghui Rao, Haoran Xie

Representing Mixtures of Word Embeddings with Mixtures of Topic Embeddings

A topic model is often formulated as a generative model that explains how each word of a document is generated given a set of topics and document-specific topic proportions. It is focused on capturing the word co-occurrences in a document…

Machine Learning · Computer Science 2022-03-16 Dongsheng Wang, Dandan Guo, He Zhao, Huangjie Zheng, Korawat Tanwisuth, Bo Chen, Mingyuan Zhou

Topic Compositional Neural Language Model

We propose a Topic Compositional Neural Language Model (TCNLM), a novel method designed to simultaneously capture both the global semantic meaning and the local word ordering structure in a document. The TCNLM learns the global semantic…

Machine Learning · Computer Science 2018-02-27 Wenlin Wang, Zhe Gan, Wenqi Wang, Dinghan Shen, Jiaji Huang, Wei Ping, Sanjeev Satheesh, Lawrence Carin

TopicNet: Semantic Graph-Guided Topic Discovery

Existing deep hierarchical topic models are able to extract semantically meaningful topics from a text corpus in an unsupervised manner and automatically organize them into a topic hierarchy. However, it is unclear how to incorporate prior…

Machine Learning · Computer Science 2021-10-28 Zhibin Duan, Yishi Xu, Bo Chen, Dongsheng Wang, Chaojie Wang, Mingyuan Zhou

Is Neural Topic Modelling Better than Clustering? An Empirical Study on Clustering with Contextual Embeddings for Topics

Recent work incorporates pre-trained word embeddings such as BERT embeddings into Neural Topic Models (NTMs), generating highly coherent topics. However, with high-quality contextualized document representations, do we really need…

Computation and Language · Computer Science 2022-04-22 Zihan Zhang, Meng Fang, Ling Chen, Mohammad-Reza Namazi-Rad

Modeling Concentrated Cross-Attention for Neural Machine Translation with Gaussian Mixture Model

Cross-attention is an important component of neural machine translation (NMT), which is always realized by dot-product attention in previous methods. However, dot-product attention only considers the pair-wise correlation between words,…

Computation and Language · Computer Science 2021-09-15 Shaolei Zhang, Yang Feng

Towards Generalising Neural Topical Representations

Topic models have evolved from conventional Bayesian probabilistic models to recent Neural Topic Models (NTMs). Although NTMs have shown promising performance when trained and tested on a specific corpus, their generalisation ability across…

Computation and Language · Computer Science 2024-06-14 Xiaohao Yang, He Zhao, Dinh Phung, Lan Du

Topic Modeling in Embedding Spaces

Topic modeling analyzes documents to learn meaningful patterns of words. However, existing topic models fail to learn interpretable topics when working with large and heavy-tailed vocabularies. To this end, we develop the Embedded Topic…

Information Retrieval · Computer Science 2019-07-12 Adji B. Dieng, Francisco J. R. Ruiz, David M. Blei

Probabilistic Topic Modelling with Transformer Representations

Topic modelling was mostly dominated by Bayesian graphical models during the last decade. With the rise of transformers in Natural Language Processing, however, several successful models that rely on straightforward clustering approaches in…

Machine Learning · Computer Science 2024-03-07 Arik Reuter, Anton Thielmann, Christoph Weisser, Benjamin Säfken, Thomas Kneib

Deriving Word Vectors from Contextualized Language Models using Topic-Aware Mention Selection

One of the long-standing challenges in lexical semantics consists in learning representations of words which reflect their semantic properties. The remarkable success of word embeddings for this purpose suggests that high-quality…

Computation and Language · Computer Science 2021-06-16 Yixiao Wang, Zied Bouraoui, Luis Espinosa Anke, Steven Schockaert

Neural Attention-Aware Hierarchical Topic Model

Neural topic models (NTMs) apply deep neural networks to topic modelling. Despite their success, NTMs generally ignore two important aspects: (1) only document-level word count information is utilized for the training, while more…

Computation and Language · Computer Science 2021-10-15 Yuan Jin, He Zhao, Ming Liu, Lan Du, Wray Buntine

Topics as Entity Clusters: Entity-based Topics from Large Language Models and Graph Neural Networks

Topic models aim to reveal latent structures within a corpus of text, typically through the use of term-frequency statistics over bag-of-words representations from documents. In recent years, conceptual entities -- interpretable,…

Computation and Language · Computer Science 2024-08-27 Manuel V. Loureiro, Steven Derby, Tri Kurniawan Wijaya