Related papers: DiffETM: Diffusion Process Enhanced Embedded Topic…
Topic modeling analyzes documents to learn meaningful patterns of words. However, existing topic models fail to learn interpretable topics when working with large and heavy-tailed vocabularies. To this end, we develop the Embedded Topic…
Topic modeling analyzes documents to learn meaningful patterns of words. For documents collected in sequence, dynamic topic models capture how these patterns vary over time. We develop the dynamic embedded topic model (D-ETM), a generative…
Topic modeling is a key component in unsupervised learning, employed to identify topics within a corpus of textual data. The rapid growth of social media generates an ever-growing volume of textual data daily, making online topic modeling…
By illuminating latent structures in a corpus of text, topic models are an essential tool for categorizing, summarizing, and exploring large collections of documents. Probabilistic topic models, such as latent Dirichlet allocation (LDA),…
Inferring topics from the overwhelming amount of short texts becomes a critical but challenging task for many content analysis tasks, such as content charactering, user interest profiling, and emerging topic detecting. Existing methods such…
Topic models are widely used to analyze document collections. While they are valuable for discovering latent topics in a corpus when analysts are unfamiliar with the corpus, analysts also commonly start with an understanding of the content…
Diffusion models have shown remarkable performance in generation problems over various domains including images, videos, text, and audio. A practical bottleneck of diffusion models is their sampling speed, due to the repeated evaluation of…
We propose a Bayesian generative model for incorporating prior domain knowledge into hierarchical topic modeling. Although embedded topic models (ETMs) and its variants have gained promising performance in text analysis, they mainly focus…
Diffusion models have gained attention in speech enhancement tasks, providing an alternative to conventional discriminative methods. However, research on target speech extraction under multi-speaker noisy conditions remains relatively…
We introduce Efficient Motion Diffusion Model (EMDM) for fast and high-quality human motion generation. Current state-of-the-art generative diffusion models have produced impressive results but struggle to achieve fast generation without…
Diffusion models have emerged as powerful generative priors for high-dimensional inverse problems, yet learning them when only corrupted or noisy observations are available remains challenging. In this work, we propose a new method for…
Online topic models are unsupervised algorithms to identify latent topics in data streams that continuously evolve over time. Although these methods naturally align with real-world scenarios, they have received considerably less attention…
Transformer-based Diffusion Probabilistic Models (DPMs) have shown more potential than CNN-based DPMs, yet their extensive computational requirements hinder widespread practical applications. To reduce the computation budget of…
The rare-event sampling problem has long been the central limiting factor in molecular dynamics (MD), especially in biomolecular simulation. Recently, diffusion models such as BioEmu have emerged as powerful equilibrium samplers that…
A topic model is often formulated as a generative model that explains how each word of a document is generated given a set of topics and document-specific topic proportions. It is focused on capturing the word co-occurrences in a document…
Diffusion models excel in solving imaging inverse problems due to their ability to model complex image priors. However, their reliance on large, clean datasets for training limits their practical use where clean data is scarce. In this…
Can continuous diffusion models bring the same performance breakthrough on natural language they did for image generation? To circumvent the discrete nature of text data, we can simply project tokens in a continuous space of embeddings, as…
Diffusion models recently proved to be remarkable priors for Bayesian inverse problems. However, training these models typically requires access to large amounts of clean data, which could prove difficult in some settings. In this work, we…
In this paper, we address the performance degradation of efficient diffusion models by introducing Multi-architecturE Multi-Expert diffusion models (MEME). We identify the need for tailored operations at different time-steps in diffusion…
Diffusion Models are probabilistic models that create realistic samples by simulating the diffusion process, gradually adding and removing noise from data. These models have gained popularity in domains such as image processing, speech…