Related papers: Random Function Priors for Correlation Modeling

On the Predictive Power of Representation Dispersion in Language Models

We show that a language model's ability to predict text is tightly linked to the breadth of its embedding space: models that spread their contextual representations more widely tend to achieve lower perplexity. Concretely, we find that…

Computation and Language · Computer Science · 2026-04-21 · Yanhong Li, Ming Li, Karen Livescu, Jiawei Zhou

Predictive Complexity Priors

Specifying a Bayesian prior is notoriously difficult for complex models such as neural networks. Reasoning about parameters is made challenging by the high-dimensionality and over-parameterization of the space. Priors that seem benign and…

Machine Learning · Statistics · 2020-10-22 · Eric Nalisnick, Jonathan Gordon, José Miguel Hernández-Lobato

Normalized Latent Measure Factor Models

We propose a methodology for modeling and comparing probability distributions within a Bayesian nonparametric framework. Building on dependent normalized random measures, we consider a prior distribution for a collection of discrete random…

Methodology · Statistics · 2022-06-01 · Mario Beraha, Jim E. Griffin

Bayesian Nonparametric Covariance Regression

Although there is a rich literature on methods for allowing the variance in a univariate regression model to vary with predictors, time and other factors, relatively little has been done in the multivariate case. Our focus is on developing…

Methodology · Statistics · 2015-03-17 · Emily Fox, David Dunson

Hierarchical Gaussian Process Priors for Bayesian Neural Network Weights

Probabilistic neural networks are typically modeled with independent weight priors, which do not capture weight correlations in the prior and do not provide a parsimonious interface to express properties in function space. A desirable class…

Machine Learning · Statistics · 2020-02-12 · Theofanis Karaletsos, Thang D. Bui

Probabilistic Meta-Representations Of Neural Networks

Existing Bayesian treatments of neural networks are typically characterized by weak prior and approximate posterior distributions according to which all the weights are drawn independently. Here, we consider a richer prior distribution in…

Machine Learning · Statistics · 2018-10-02 · Theofanis Karaletsos, Peter Dayan, Zoubin Ghahramani

Priors for symbolic regression

When choosing between competing symbolic models for a data set, a human will naturally prefer the "simpler" expression or the one which more closely resembles equations previously seen in a similar context. This suggests a non-uniform prior…

Machine Learning · Computer Science · 2023-06-05 · Deaglan J. Bartlett, Harry Desmond, Pedro G. Ferreira

Normalized random measures driven by increasing additive processes

This paper introduces and studies a new class of nonparametric prior distributions. Random probability distribution functions are constructed via normalization of random measures driven by increasing additive processes. In particular, we…

Statistics Theory · Mathematics · 2007-06-13 · Luis E. Nieto-Barajas, Igor Prunster, Stephen G. Walker

Estimation of embedding vectors in high dimensions

Embeddings are a basic initial feature extraction step in many machine learning models, particularly in natural language processing. An embedding attempts to map data tokens to a low-dimensional space where similar tokens are mapped to…

Machine Learning · Computer Science · 2025-04-10 · Golara Ahmadi Azar, Melika Emami, Alyson Fletcher, Sundeep Rangan

Random functions as data compressors for machine learning of molecular processes

Machine learning (ML) is rapidly transforming the way molecular dynamics simulations are performed and analyzed, from materials modeling to studies of protein folding and function. ML algorithms are often employed to learn low-dimensional…

Soft Condensed Matter · Physics · 2025-09-23 · Jayashrita Debnath, Gerhard Hummer

Prior shift estimation for positive unlabeled data through the lens of kernel embedding

We study estimation of a class prior for unlabeled target samples which possibly differs from that of source population. Moreover, it is assumed that the source data is partially observable: only samples from the positive class and from the…

Machine Learning · Statistics · 2026-05-22 · Jan Mielniczuk, Wojciech Rejchel, Paweł Teisseyre

Kernel Mean Embedding of Probability Measures and its Applications to Functional Data Analysis

This study intends to introduce kernel mean embedding of probability measures over infinite-dimensional separable Hilbert spaces induced by functional response statistical models. The embedded function represents the concentration of…

Statistics Theory · Mathematics · 2020-11-05 · Saeed Hayati, Kenji Fukumizu, Afshin Parvardeh

TzK Flow - Conditional Generative Model

We introduce TzK (pronounced "task"), a conditional probability flow-based model that exploits attributes (e.g., style, class membership, or other side information) in order to learn tight conditional prior around manifolds of the target…

Machine Learning · Computer Science · 2019-02-21 · Micha Livne, David J. Fleet

Probabilistic task modelling for meta-learning

We propose probabilistic task modelling -- a generative probabilistic model for collections of tasks used in meta-learning. The proposed model combines variational auto-encoding and latent Dirichlet allocation to model each task as a…

Machine Learning · Computer Science · 2022-03-21 · Cuong C. Nguyen, Thanh-Toan Do, Gustavo Carneiro

Non-Local Priors for High-Dimensional Estimation

Simultaneously achieving parsimony and good predictive power in high dimensions is a main challenge in statistics. Non-local priors (NLPs) possess appealing properties for high-dimensional model choice, but their use for estimation has not…

Statistics Theory · Mathematics · 2015-01-22 · David Rossell, Donatello Telesca

Word Embedding with Neural Probabilistic Prior

To improve word representation learning, we propose a probabilistic prior which can be seamlessly integrated with word embedding models. Different from previous methods, word embedding is taken as a probabilistic generative model, and it…

Computation and Language · Computer Science · 2023-09-22 · Shaogang Ren, Dingcheng Li, Ping Li

Mixture Density Network Estimation of Continuous Variable Maximum Likelihood Using Discrete Training Samples

Mixture Density Networks (MDNs) can be used to generate probability density functions of model parameters $\boldsymbol{\theta}$ given a set of observables $\mathbf{x}$. In some applications, training data are available only for discrete…

Data Analysis, Statistics and Probability · Physics · 2021-08-18 · Charles Burton, Spencer Stubbs, Peter Onyisi

PTE: Predictive Text Embedding through Large-scale Heterogeneous Text Networks

Unsupervised text embedding methods, such as Skip-gram and Paragraph Vector, have been attracting increasing attention due to their simplicity, scalability, and effectiveness. However, compared to sophisticated deep learning architectures…

Computation and Language · Computer Science · 2015-08-04 · Jian Tang, Meng Qu, Qiaozhu Mei

Morphological Priors for Probabilistic Neural Word Embeddings

Word embeddings allow natural language processing systems to share statistical information across related words. These embeddings are typically based on distributional statistics, making it difficult for them to generalize to rare or unseen…

Computation and Language · Computer Science · 2016-09-27 · Parminder Bhatia, Robert Guthrie, Jacob Eisenstein

On Learning Prediction-Focused Mixtures

Probabilistic models help us encode latent structures that both model the data and are ideally also useful for specific downstream tasks. Among these, mixture models and their time-series counterparts, hidden Markov models, identify…

Machine Learning · Computer Science · 2021-10-29 · Abhishek Sharma, Catherine Zeng, Sanjana Narayanan, Sonali Parbhoo, Finale Doshi-Velez