Related papers: Parameter Expansion and Efficient Inference
Many applications require that we learn the parameters of a model from data. EM is a method used to learn the parameters of probabilistic models for which the data for some of the variables in the models is either missing or hidden. There…
The Expectation--Maximization (EM) algorithm is a simple meta-algorithm that has been used for many years as a methodology for statistical inference when there are missing measurements in the observed data or when the data is composed of…
We provide a general theory of the expectation-maximization (EM) algorithm for inferring high dimensional latent variable models. In particular, we make two contributions: (i) For parameter estimation, we propose a novel high dimensional EM…
Parameter estimation in logistic regression is a well-studied problem with the Newton-Raphson method being one of the most prominent optimization techniques used in practice. A number of monotone optimization methods including…
Parameter-shared pre-trained language models (PLMs) have emerged as a successful approach in resource-constrained environments, enabling substantial reductions in model storage and memory costs without significant performance compromise.…
Improved EM strategies, based on the idea of efficient data augmentation (Meng and van Dyk 1997, 1998), are presented for ML estimation of mixture proportions. The resulting algorithms inherit the simplicity, ease of implementation, and…
The rise of deep learning has marked significant progress in fields such as computer vision, natural language processing, and medical imaging, primarily through the adaptation of pre-trained models for specific tasks. Traditional…
The convergence of expectation-maximization (EM)-based algorithms typically requires continuity of the likelihood function with respect to all the unknown parameters (optimization variables). The requirement is not met when parameters…
We consider efficient estimation of flexible transformation models with interval-censored data. To reduce the dimension of semi-parametric models, the unknown monotone transformation function is approximated via monotone splines. A…
Abbreviation expansion is a strategy used to speed up communication by limiting the amount of typing and using a language model to suggest expansions. Here we look at personalizing a Large Language Model's (LLM) suggestions based on prior…
Although scaling up the number of trainable parameters in both pre-training and fine-tuning can effectively improve the performance of large language models, it also leads to increased computational overhead. When delving into the parameter…
Parameter-efficient fine-tuning (PEFT) techniques, such as adapter tuning, aim to fine-tune a pre-trained language model (PLM) using a minimal number of parameters for a specific task or profile. Although adapter tuning provides increased…
The Expectation Maximization (EM) algorithm is a key reference for inference in latent variable models; unfortunately, its computational cost is prohibitive in the large scale learning setting. In this paper, we propose an extension of the…
Expectation Maximization (EM) is among the most popular algorithms for estimating parameters of statistical models. However, EM, which is an iterative algorithm based on the maximum likelihood principle, is generally only guaranteed to find…
Statistical pragmatism embraces all efficient methods in statistical inference. Augmentation of the collected data is used herein to obtain representative population information from a large class of non-representative population's units.…
As fine-tuning becomes impractical at scale, probing is emerging as the preferred evaluation protocol. However, standard linear probing can understate the capability of models whose pre-training optimizes local representations rather than…
The aim of the paper is to examine the computational complexity and algorithmics of enumeration, the task to output all solutions of a given problem, from the point of view of parameterized complexity. First we define formally different…
The EM (Expectation-Maximization) algorithm is regarded as an MM (Majorization-Minimization) algorithm for maximum likelihood estimation of statistical models. Expanding this view, this paper demonstrates that by choosing an appropriate…
Bayesian variable selection methods are powerful techniques for fitting and inferring on sparse high-dimensional linear regression models. However, many are computationally intensive or require restrictive prior distributions on model…
Parameter-efficient tuning aims to mitigate the large memory requirements of adapting pretrained language models for downstream tasks. For example, one popular method, prefix-tuning, prepends trainable tokens to sequences while freezing the…