English
Related papers

Related papers: Topic Model Supervised by Understanding Map

200 papers

In traditional Distributional Semantic Models (DSMs) the multiple senses of a polysemous word are conflated into a single vector space representation. In this work, we propose a DSM that learns multiple distributional representations of a…

Computation and Language · Computer Science 2019-04-12 Eleftheria Briakou , Nikos Athanasiou , Alexandros Potamianos

Topic Modeling is a popular statistical tool commonly used on textual data to identify the hidden thematic structure in a document collection based on the distribution of words. Additionally, it can be used to cluster the documents, with…

Applications · Statistics 2025-01-24 Namitha V. Pais , Scott H. Holan , Paul A. Parker

Classic Topic Models are built under the Bag Of Words assumption, in which word position is ignored for simplicity. Besides, symmetric priors are typically used in most applications. In order to easily learn topics with different properties…

Computation and Language · Computer Science 2018-06-27 Simón Roca-Sotelo , Jerónimo Arenas-García

As we continue to collect and store textual data in a multitude of domains, we are regularly confronted with material whose largely unknown thematic structure we want to uncover. With unsupervised, exploratory analysis, no prior knowledge…

Information Retrieval · Computer Science 2015-07-20 Samuel Rönnqvist

An important aspect of text mining involves information retrieval in form of discovery of semantic themes (topics) from documents using topic modelling. While generative topic models like Latent Dirichlet Allocation (LDA) or Latent Semantic…

Machine Learning · Computer Science 2025-11-04 Satyajeet Sahoo , Jhareswar Maiti

To date, there have been massive Semi-Structured Documents (SSDs) during the evolution of the Internet. These SSDs contain both unstructured features (e.g., plain text) and metadata (e.g., tags). Most previous works focused on modeling the…

Computation and Language · Computer Science 2015-07-31 Shuangyin Li , Jiefei Li , Guan Huang , Ruiyang Tan , Rong Pan

The paper proposes a text-mining based analytical framework aiming at the cognitive organization of complex scientific discourses. The approach is based on models recently developed in science mapping, being a generalization of the…

Digital Libraries · Computer Science 2015-04-23 Sandor Soos

We propose a novel end-to-end document understanding model called SeRum (SElective Region Understanding Model) for extracting meaningful information from document images, including document analysis, retrieval, and office automation. Unlike…

Computer Vision and Pattern Recognition · Computer Science 2023-09-06 Haoyu Cao , Changcun Bao , Chaohu Liu , Huang Chen , Kun Yin , Hao Liu , Yinsong Liu , Deqiang Jiang , Xing Sun

One of the main computational and scientific challenges in the modern age is to extract useful information from unstructured texts. Topic models are one popular machine-learning approach which infers the latent topical structure of a…

Machine Learning · Statistics 2018-07-20 Martin Gerlach , Tiago P. Peixoto , Eduardo G. Altmann

We proposed a novel multilayer correlated topic model (MCTM) to analyze how the main ideas inherit and vary between a document and its different segments, which helps understand an article's structure. The variational…

Information Retrieval · Computer Science 2021-01-07 Ye Tian

Marrying topic models and language models exposes language understanding to a broader source of document-level context beyond sentences via topics. While introducing topical semantics in language models, existing approaches incorporate…

Computation and Language · Computer Science 2023-06-28 Yatin Chaudhary , Hinrich Schütze , Pankaj Gupta

Topic models have been widely used to learn text representations and gain insight into document corpora. To perform topic discovery, most existing neural models either take document bag-of-words (BoW) or sequence of tokens as input followed…

Computation and Language · Computer Science 2021-07-12 Madhur Panwar , Shashank Shailabh , Milan Aggarwal , Balaji Krishnamurthy

We introduce CEMTM, a context-enhanced multimodal topic model designed to infer coherent and interpretable topic structures from both short and long documents containing text and images. CEMTM builds on fine-tuned large vision language…

Computation and Language · Computer Science 2025-10-07 Amirhossein Abaskohi , Raymond Li , Chuyuan Li , Shafiq Joty , Giuseppe Carenini

We explore a matrix-space model, that is a natural extension to the vector space model for Information Retrieval. Each document can be represented by a matrix that is based on document extracts (e.g. sentences, paragraphs, sections). We…

Information Retrieval · Computer Science 2007-05-23 Ioannis Antonellis , Efstratios Gallopoulos

Importance of document clustering is now widely acknowledged by researchers for better management, smart navigation, efficient filtering, and concise summarization of large collection of documents like World Wide Web (WWW). The next…

Information Retrieval · Computer Science 2011-12-30 Muhammad Rafi , M. Shahid Shaikh , Amir Farooq

The syntactic topic model (STM) is a Bayesian nonparametric model of language that discovers latent distributions of words (topics) that are both semantically and syntactically coherent. The STM models dependency parsed corpora where…

Computation and Language · Computer Science 2010-03-04 Jordan Boyd-Graber , David M. Blei

As language models (LMs) deliver increasing performance on a range of NLP tasks, probing classifiers have become an indispensable technique in the effort to better understand their inner workings. A typical setup involves (1) defining an…

Computation and Language · Computer Science 2024-08-01 Charles Jin , Martin Rinard

We propose a Topic Compositional Neural Language Model (TCNLM), a novel method designed to simultaneously capture both the global semantic meaning and the local word ordering structure in a document. The TCNLM learns the global semantic…

Machine Learning · Computer Science 2018-02-27 Wenlin Wang , Zhe Gan , Wenqi Wang , Dinghan Shen , Jiaji Huang , Wei Ping , Sanjeev Satheesh , Lawrence Carin

Statistical topic models provide a general data-driven framework for automated discovery of high-level knowledge from large collections of text documents. While topic models can potentially discover a broad range of themes in a data set,…

Artificial Intelligence · Computer Science 2008-08-08 Chaitanya Chemudugunta , Padhraic Smyth , Mark Steyvers

Topic modeling analyzes documents to learn meaningful patterns of words. However, existing topic models fail to learn interpretable topics when working with large and heavy-tailed vocabularies. To this end, we develop the Embedded Topic…

Information Retrieval · Computer Science 2019-07-12 Adji B. Dieng , Francisco J. R. Ruiz , David M. Blei
‹ Prev 1 2 3 10 Next ›