Related papers: Bethe Learning of Conditional Random Fields via MA…

Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions

Combining discrete probability distributions and combinatorial optimization problems with neural network components has numerous applications but poses several challenges. We propose Implicit Maximum Likelihood Estimation (I-MLE), a…

Machine Learning · Computer Science 2021-10-28 Mathias Niepert, Pasquale Minervini, Luca Franceschi

Bethe Projections for Non-Local Inference

Many inference problems in structured prediction are naturally solved by augmenting a tractable dependency structure with complex, non-local auxiliary objectives. This includes the mean field family of variational inference algorithms,…

Machine Learning · Statistics 2016-11-29 Luke Vilnis, David Belanger, Daniel Sheldon, Andrew McCallum

Constrained Approximate Maximum Entropy Learning of Markov Random Fields

Parameter estimation in Markov random fields (MRFs) is a difficult task, in which inference over the network is run in the inner loop of a gradient descent procedure. Replacing exact inference with approximate methods such as loopy belief…

Machine Learning · Computer Science 2012-06-18 Varun Ganapathi, David Vickrey, John Duchi, Daphne Koller

Exponential Family Estimation via Adversarial Dynamics Embedding

We present an efficient algorithm for maximum likelihood estimation (MLE) of exponential family models, with a general parametrization of the energy function that includes neural networks. We exploit the primal-dual view of the MLE with a…

Machine Learning · Computer Science 2020-04-01 Bo Dai, Zhen Liu, Hanjun Dai, Niao He, Arthur Gretton, Le Song, Dale Schuurmans

Distributed Estimation, Information Loss and Exponential Families

Distributed learning of probabilistic models from multiple data repositories with minimum communication is increasingly important. We study a simple communication-efficient learning framework that first calculates the local maximum…

Machine Learning · Statistics 2014-10-13 Qiang Liu, Alexander Ihler

$k$-MLE: A fast algorithm for learning statistical mixture models

We describe $k$-MLE, a fast and efficient local search algorithm for learning finite statistical mixtures of exponential families such as Gaussian mixture models. Mixture models are traditionally learned using the expectation-maximization…

Machine Learning · Computer Science 2016-11-15 Frank Nielsen

Probabilistic Structured Predictors

We consider MAP estimators for structured prediction with exponential family models. In particular, we concentrate on the case that efficient algorithms for uniform sampling from the output space exist. We show that under this assumption…

Machine Learning · Computer Science 2012-05-14 Shankar Vembu, Thomas Gartner, Mario Boley

Score Matched Neural Exponential Families for Likelihood-Free Inference

Bayesian Likelihood-Free Inference (LFI) approaches allow to obtain posterior distributions for stochastic models with intractable likelihood, by relying on model simulations. In Approximate Bayesian Computation (ABC), a popular LFI method,…

Methodology · Statistics 2022-02-08 Lorenzo Pacchiardi, Ritabrata Dutta

What Cannot be Learned with Bethe Approximations

We address the problem of learning the parameters in graphical models when inference is intractable. A common strategy in this case is to replace the partition function with its Bethe approximation. We show that there exists a regime of…

Machine Learning · Computer Science 2012-02-20 Uri Heinemann, Amir Globerson

Joint Learning of Energy-based Models and their Partition Function

Energy-based models (EBMs) offer a flexible framework for parameterizing probability distributions using neural networks. However, learning EBMs by exact maximum likelihood estimation (MLE) is generally intractable, due to the need to…

Machine Learning · Computer Science 2025-08-20 Michael E. Sander, Vincent Roulet, Tianlin Liu, Mathieu Blondel

Context-aware learning for generative models

This work studies the class of algorithms for learning with side-information that emerge by extending generative models with embedded context-related variables. Using finite mixture models (FMM) as the prototypical Bayesian network, we show…

Machine Learning · Statistics 2020-08-17 Serafeim Perdikis, Robert Leeb, Ricardo Chavarriaga, José del R. Millán

Provably Efficient Representation Learning with Tractable Planning in Low-Rank POMDP

In this paper, we study representation learning in partially observable Markov Decision Processes (POMDPs), where the agent learns a decoder function that maps a series of high-dimensional raw observations to a compact representation and…

Machine Learning · Computer Science 2023-06-22 Jiacheng Guo, Zihao Li, Huazheng Wang, Mengdi Wang, Zhuoran Yang, Xuezhou Zhang

Learning a Loopy Model For Semantic Segmentation Exactly

Learning structured models using maximum margin techniques has become an indispensable tool for computer vision researchers, as many computer vision applications can be cast naturally as an image labeling problem. Pixel-based or…

Machine Learning · Computer Science 2013-09-17 Andreas Christian Mueller, Sven Behnke

Learning Energy-Based Model with Variational Auto-Encoder as Amortized Sampler

Due to the intractable partition function, training energy-based models (EBMs) by maximum likelihood requires Markov chain Monte Carlo (MCMC) sampling to approximate the gradient of the Kullback-Leibler divergence between data and model…

Machine Learning · Statistics 2021-12-28 Jianwen Xie, Zilong Zheng, Ping Li

MAP inference via Block-Coordinate Frank-Wolfe Algorithm

We present a new proximal bundle method for Maximum-A-Posteriori (MAP) inference in structured energy minimization problems. The method optimizes a Lagrangean relaxation of the original energy minimization problem using a multi plane…

Machine Learning · Computer Science 2019-04-08 Paul Swoboda, Vladimir Kolmogorov

Marginally Parametrized Spatio-Temporal Models and Stepwise Maximum Likelihood Estimation

In order to learn the complex features of large spatio-temporal data, models with large parameter sets are often required. However, estimating a large number of parameters is often infeasible due to the computational and memory costs of…

Computation · Statistics 2018-07-02 Matthew Edwards, Stefano Castruccio, Dorit Hammerling

A Simple Algorithm for Scalable Monte Carlo Inference

The methods of statistical physics are widely used for modelling complex networks. Building on the recently proposed Equilibrium Expectation approach, we derive a simple and efficient algorithm for maximum likelihood estimation (MLE) of…

Computation · Statistics 2020-02-12 Alexander Borisenko, Maksym Byshkin, Alessandro Lomi

Meta-probabilistic Modeling

Probabilistic graphical models (PGMs) are widely used to discover latent structure in data, but their success hinges on selecting an appropriate model design. In practice, model specification is difficult and often requires iterative…

Machine Learning · Computer Science 2026-04-08 Kevin Zhang, Yixin Wang

Multi-Entity Dependence Learning with Rich Context via Conditional Variational Auto-encoder

Multi-Entity Dependence Learning (MEDL) explores conditional correlations among multiple entities. The availability of rich contextual information requires a nimble learning scheme that tightly integrates with deep neural networks and has…

Machine Learning · Computer Science 2017-09-19 Luming Tang, Yexiang Xue, Di Chen, Carla P. Gomes

A Joint MLE Approach to Large-Scale Structured Latent Attribute Analysis

Structured Latent Attribute Models (SLAMs) are a family of discrete latent variable models widely used in education, psychology, and epidemiology to model multivariate categorical data. A SLAM assumes that multiple discrete latent…

Methodology · Statistics 2021-07-12 Yuqi Gu, Gongjun Xu