Related papers: Syntactic Topic Models

Syntactic Substitutability as Unsupervised Dependency Syntax

Syntax is a latent hierarchical structure which underpins the robust and compositional nature of human language. In this work, we explore the hypothesis that syntactic dependencies can be represented in language model attention…

Computation and Language · Computer Science 2023-10-24 Jasper Jian, Siva Reddy

Sentence Level Recurrent Topic Model: Letting Topics Speak for Themselves

We propose Sentence Level Recurrent Topic Model (SLRTM), a new topic model that assumes the generation of each word within a sentence to depend on both the topic of the sentence and the whole history of its preceding words in the sentence.…

Machine Learning · Computer Science 2016-04-11 Fei Tian, Bin Gao, Di He, Tie-Yan Liu

Jointly Learning Word Embeddings and Latent Topics

Word embedding models such as Skip-gram learn a vector-space representation for each word, based on the local word collocation patterns that are observed in a text corpus. Latent topic models, on the other hand, take a more global view,…

Computation and Language · Computer Science 2017-06-23 Bei Shi, Wai Lam, Shoaib Jameel, Steven Schockaert, Kwun Ping Lai

Syntactic Recurrent Neural Network for Authorship Attribution

Writing style is a combination of consistent decisions at different levels of language production including lexical, syntactic, and structural associated to a specific author (or author groups). While lexical-based models have been widely…

Computation and Language · Computer Science 2019-02-28 Fereshteh Jafariakinabad, Sansiri Tarnpradab, Kien A. Hua

Latent Tree Models for Hierarchical Topic Detection

We present a novel method for hierarchical topic detection where topics are obtained by clustering documents in multiple ways. Specifically, we model document collections using a class of graphical models called hierarchical latent tree…

Computation and Language · Computer Science 2016-12-22 Peixian Chen, Nevin L. Zhang, Tengfei Liu, Leonard K. M. Poon, Zhourong Chen, Farhan Khawar

A network approach to topic models

One of the main computational and scientific challenges in the modern age is to extract useful information from unstructured texts. Topic models are one popular machine-learning approach which infers the latent topical structure of a…

Machine Learning · Statistics 2018-07-20 Martin Gerlach, Tiago P. Peixoto, Eduardo G. Altmann

Attribution Analysis of Grammatical Dependencies in LSTMs

LSTM language models have been shown to capture syntax-sensitive grammatical dependencies such as subject-verb agreement with a high degree of accuracy (Linzen et al., 2016, inter alia). However, questions remain regarding whether they do…

Computation and Language · Computer Science 2020-05-04 Yiding Hao

A Systematic Study of Compositional Syntactic Transformer Language Models

Syntactic language models (SLMs) enhance Transformers by incorporating syntactic biases through the modeling of linearized syntactic parse trees alongside surface sentences. This paper focuses on compositional SLMs that are based on…

Computation and Language · Computer Science 2025-07-01 Yida Zhao, Hao Xve, Xiang Hu, Kewei Tu

Model Directions, Not Words: Mechanistic Topic Models Using Sparse Autoencoders

Traditional topic models are effective at uncovering latent themes in large text collections. However, due to their reliance on bag-of-words representations, they struggle to capture semantically abstract features. While some neural…

Computation and Language · Computer Science 2025-08-01 Carolina Zheng, Nicolas Beltran-Velez, Sweta Karlekar, Claudia Shi, Achille Nazaret, Asif Mallik, Amir Feder, David M. Blei

Predicting the Semantic Textual Similarity with Siamese CNN and LSTM

Semantic Textual Similarity (STS) is the basis of many applications in Natural Language Processing (NLP). Our system combines convolution and recurrent neural networks to measure the semantic similarity of sentences. It uses a convolution…

Computation and Language · Computer Science 2018-10-26 Elvys Linhares Pontes, Stéphane Huet, Andréa Carneiro Linhares, Juan-Manuel Torres-Moreno

TAN-NTM: Topic Attention Networks for Neural Topic Modeling

Topic models have been widely used to learn text representations and gain insight into document corpora. To perform topic discovery, most existing neural models either take document bag-of-words (BoW) or sequence of tokens as input followed…

Computation and Language · Computer Science 2021-07-12 Madhur Panwar, Shashank Shailabh, Milan Aggarwal, Balaji Krishnamurthy

Neural Transition-based Syntactic Linearization

The task of linearization is to find a grammatical order given a set of words. Traditional models use statistical methods. Syntactic linearization systems, which generate a sentence along with its syntactic tree, have shown state-of-the-art…

Computation and Language · Computer Science 2018-10-24 Linfeng Song, Yue Zhang, Daniel Gildea

Multivariate Gaussian Topic Modelling: A novel approach to discover topics with greater semantic coherence

An important aspect of text mining involves information retrieval in form of discovery of semantic themes (topics) from documents using topic modelling. While generative topic models like Latent Dirichlet Allocation (LDA) or Latent Semantic…

Machine Learning · Computer Science 2025-11-04 Satyajeet Sahoo, Jhareswar Maiti

Targeted Syntactic Evaluation of Language Models

We present a dataset for evaluating the grammaticality of the predictions of a language model. We automatically construct a large number of minimally different pairs of English sentences, each consisting of a grammatical and an…

Computation and Language · Computer Science 2018-08-29 Rebecca Marvin, Tal Linzen

Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations

Topic models have been the prominent tools for automatic topic discovery from text corpora. Despite their effectiveness, topic models suffer from several limitations including the inability of modeling word ordering information in…

Computation and Language · Computer Science 2022-02-10 Yu Meng, Yunyi Zhang, Jiaxin Huang, Yu Zhang, Jiawei Han

Nonparametric Relational Topic Models through Dependent Gamma Processes

Traditional Relational Topic Models provide a way to discover the hidden topics from a document network. Many theoretical and practical tasks, such as dimensional reduction, document clustering, link prediction, benefit from this revealed…

Machine Learning · Statistics 2015-03-31 Junyu Xuan, Jie Lu, Guangquan Zhang, Richard Yi Da Xu, Xiangfeng Luo

Overestimation of Syntactic Representation in Neural Language Models

With the advent of powerful neural language models over the last few years, research attention has increasingly focused on what aspects of language they represent that make them so successful. Several testing methodologies have been…

Computation and Language · Computer Science 2023-05-26 Jordan Kodner, Nitish Gupta

A Novel Document Generation Process for Topic Detection based on Hierarchical Latent Tree Models

We propose a novel document generation process based on hierarchical latent tree models (HLTMs) learned from data. An HLTM has a layer of observed word variables at the bottom and multiple layers of latent variables on top. For each…

Computation and Language · Computer Science 2019-07-01 Peixian Chen, Zhourong Chen, Nevin L. Zhang

On a Topic Model for Sentences

Probabilistic topic models are generative models that describe the content of documents by discovering the latent topics underlying them. However, the structure of the textual input, and for instance the grouping of words in coherent text…

Computation and Language · Computer Science 2016-06-02 Georgios Balikas, Massih-Reza Amini, Marianne Clausel

Probabilistic Transformer: A Probabilistic Dependency Model for Contextual Word Representation

Syntactic structures used to play a vital role in natural language processing (NLP), but since the deep learning revolution, NLP has been gradually dominated by neural models that do not consider syntactic structures in their design. One…

Computation and Language · Computer Science 2023-11-28 Haoyi Wu, Kewei Tu