Related papers: Parameter Learning for Log-supermodular Distributi…
We show that the log-likelihood of several probabilistic graphical models is Lipschitz continuous with respect to the lp-norm of the parameters. We discuss several implications of Lipschitz parametrization. We present an upper bound of the…
We consider the structured-output prediction problem through probabilistic approaches and generalize the "perturb-and-MAP" framework to more challenging weighted Hamming losses, which are crucial in applications. While in principle our…
Standard variational lower bounds used to train latent variable models produce biased estimates of most quantities of interest. We introduce an unbiased estimator of the log marginal likelihood and its gradients for latent variable models…
Energy-based models (EBMs) offer a flexible framework for parameterizing probability distributions using neural networks. However, learning EBMs by exact maximum likelihood estimation (MLE) is generally intractable, due to the need to…
Scaling model capacity has been vital in the success of deep learning. For a typical network, necessary compute resources and training time grow dramatically with model size. Conditional computation is a promising way to increase the number…
Likelihood based-learning of graphical models faces challenges of computational-complexity and robustness to model mis-specification. This paper studies methods that fit parameters directly to maximize a measure of the accuracy of predicted…
This paper addresses two central problems for probabilistic processing models: parameter estimation from incomplete data and efficient retrieval of most probable analyses. These questions have been answered satisfactorily only for…
Submodular functions describe a variety of discrete problems in machine learning, signal processing, and computer vision. However, minimizing submodular functions poses a number of algorithmic challenges. Recent work introduced an…
Many problems in machine learning are naturally expressed in the language of undirected graphical models. Here, we propose black-box learning and inference algorithms for undirected models that optimize a variational approximation to the…
Submodular functions are discrete functions that model laws of diminishing returns and enjoy numerous algorithmic applications. They have been used in many areas, including combinatorial optimization, machine learning, and economics. In…
We consider binary classification restricted to a class of continuous piecewise linear functions whose decision boundaries are (possibly nonconvex) starshaped polyhedral sets, supported on a fixed polyhedral simplicial fan. We investigate…
Bounds on the log partition function are important in a variety of contexts, including approximate inference, model fitting, decision theory, and large deviations analysis. We introduce a new class of upper bounds on the log partition…
This paper addresses the problem of sequential submodular maximization: selecting and ranking items in a sequence to optimize some composite submodular function. In contrast to most of the previous works, which assume access to the utility…
Increasing effort is put into the development of methods for learning mechanistic models from data. This task entails not only the accurate estimation of parameters but also a suitable model structure. Recent work on the discovery of…
Probabilistic modeling enables combining domain knowledge with learning from data, thereby supporting learning from fewer training instances than purely data-driven methods. However, learning probabilistic models is difficult and has not…
When classical particle filtering algorithms are used for maximum likelihood parameter estimation in nonlinear state-space models, a key challenge is that estimates of the likelihood function and its derivatives are inherently noisy. The…
Considering a probability distribution over parameters is known as an efficient strategy to learn a neural network with non-differentiable activation functions. We study the expectation of a probabilistic neural network as a predictor by…
A multiplicative stochastic process with the lower bound lognormally distributed is investigated. For the process, the model is constructed, and its distribution function (involving four parameters) and the related statistical properties…
Estimating the parameters of a probabilistic directed graphical model from incomplete data is a long-standing challenge. This is because, in the presence of latent variables, both the likelihood function and posterior distribution are…
Sudderth, Wainwright, and Willsky have conjectured that the Bethe approximation corresponding to any fixed point of the belief propagation algorithm over an attractive, pairwise binary graphical model provides a lower bound on the true…