Related papers: Topically Driven Neural Language Model

200 papers

Probabilistic topic models are generative models that describe the content of documents by discovering the latent topics underlying them. However, the structure of the textual input, and for instance the grouping of words in coherent text…

Computation and Language · Computer Science 2016-06-02 Georgios Balikas, Massih-Reza Amini, Marianne Clausel

Marrying topic models and language models exposes language understanding to a broader source of document-level context beyond sentences via topics. While introducing topical semantics in language models, existing approaches incorporate…

Computation and Language · Computer Science 2023-06-28 Yatin Chaudhary, Hinrich Schütze, Pankaj Gupta

Text documents are structured on multiple levels of detail: individual words are related by syntax, but larger units of text are related by discourse structure. Existing language models generally fail to account for discourse structure, but…

Computation and Language · Computer Science 2016-02-23 Yangfeng Ji, Trevor Cohn, Lingpeng Kong, Chris Dyer, Jacob Eisenstein

Document-level machine translation manages to outperform sentence-level models by a small margin, but has failed to be widely adopted. We argue that previous research did not make a clear use of the global context, and propose a new…

Computation and Language · Computer Science 2020-09-10 Zaixiang Zheng, Xiang Yue, Shujian Huang, Jiajun Chen, Alexandra Birch

In this paper, we present the Polylingual Labeled Topic Model, a model which combines the characteristics of the existing Polylingual Topic Model and Labeled LDA. The model accounts for multiple languages with separate topic distributions…

Computation and Language · Computer Science 2017-05-03 Lisa Posch, Arnim Bleier, Philipp Schaer, Markus Strohmaier

Topic modelling, as a well-established unsupervised technique, has found extensive use in automatically detecting significant topics within a corpus of documents. However, classic topic modelling approaches (e.g., LDA) have certain…

Computation and Language · Computer Science 2024-03-27 Yida Mu, Chun Dong, Kalina Bontcheva, Xingyi Song

Context information around words helps in determining their actual meaning, for example "networks" used in contexts of artificial neural networks or biological neuron networks. Generative topic models infer topic-word distributions, taking…

Information Retrieval · Computer Science 2018-08-14 Pankaj Gupta, Florian Buettner, Hinrich Schütze

Topic models such as LDA, DocNADE, iDocNADEe have been popular in document analysis. However, the traditional topic models have several limitations including: (1) Bag-of-words (BoW) assumption, where they ignore word ordering, (2) Data…

Information Retrieval · Computer Science 2019-10-01 Yatin Chaudhary, Pankaj Gupta, Thomas Runkler

Topic models extract groups of words from documents, whose interpretation as a topic hopefully allows for a better understanding of the data. However, the resulting word groups are often not coherent, making them harder to interpret.…

Computation and Language · Computer Science 2021-06-18 Federico Bianchi, Silvia Terragni, Dirk Hovy

A recent line of work in natural language processing has aimed to combine language models and topic models. These topic-guided language models augment neural language models with topic models, unsupervised learning methods that can discover…

Computation and Language · Computer Science 2023-12-06 Carolina Zheng, Keyon Vafa, David M. Blei

Topic modeling is a well-established technique for exploring text corpora. Conventional topic models (e.g., LDA) represent topics as bags of words that often require "reading the tea leaves" to interpret; additionally, they offer users…

Computation and Language · Computer Science 2024-04-03 Chau Minh Pham, Alexander Hoyle, Simeng Sun, Philip Resnik, Mohit Iyyer

Despite the known limitations, most machine translation systems today still operate on the sentence level. One reason for this is that most parallel training data is only sentence-level aligned, without document-level meta information…

Computation and Language · Computer Science 2023-10-20 Frithjof Petrick, Christian Herold, Pavel Petrushkov, Shahram Khadivi, Hermann Ney

Topic modeling is a widely used technique for revealing underlying thematic structures within textual data. However, existing models have certain limitations, particularly when dealing with short text datasets that lack co-occurring words.…

Artificial Intelligence · Computer Science 2023-12-18 Han Wang, Nirmalendu Prakash, Nguyen Khoi Hoang, Ming Shan Hee, Usman Naseem, Roy Ka-Wei Lee

Hierarchical neural architectures are often used to capture long-distance dependencies and have been applied to many document-level tasks such as summarization, document segmentation, and sentiment analysis. However, effective usage of such…

Computation and Language · Computer Science 2019-01-29 Ming-Wei Chang, Kristina Toutanova, Kenton Lee, Jacob Devlin

Language is typically modelled with discrete sequences. However, the most successful approaches to language modelling, namely neural networks, are continuous and smooth function approximators. In this work, we show that Transformer-based…

Computation and Language · Computer Science 2025-04-08 Samuele Marro, Davide Evangelista, X. Angelo Huang, Emanuele La Malfa, Michele Lombardi, Michael Wooldridge

The training of topic models for a multilingual environment is a challenging task, requiring the use of sophisticated algorithms, topic-aligned corpora, and manual evaluation. These difficulties are further exacerbated when the developer…

Computation and Language · Computer Science 2025-09-03 Felix Engl, Andreas Henrich

Language models are at the heart of numerous works, notably in the text mining and information retrieval communities. These statistical models aim at extracting word distributions, from simple unigram models to recurrent approaches with…

Computation and Language · Computer Science 2020-02-25 Edouard Delasalles, Sylvain Lamprier, Ludovic Denoyer

Topic models aim to reveal latent structures within a corpus of text, typically through the use of term-frequency statistics over bag-of-words representations from documents. In recent years, conceptual entities -- interpretable,…

Computation and Language · Computer Science 2024-08-27 Manuel V. Loureiro, Steven Derby, Tri Kurniawan Wijaya

Advances in topic modeling have yielded effective methods for characterizing the latent semantics of textual data. However, applying standard topic modeling approaches to sentence-level tasks introduces a number of challenges. In this…

Computation and Language · Computer Science 2016-07-21 Ruey-Cheng Chen, Reid Swanson, Andrew S. Gordon

By illuminating latent structures in a corpus of text, topic models are an essential tool for categorizing, summarizing, and exploring large collections of documents. Probabilistic topic models, such as latent Dirichlet allocation (LDA),…

Information Retrieval · Computer Science 2021-12-07 Bahareh Harandizadeh, J. Hunter Priniski, Fred Morstatter