Related papers: Discrete-Continuous Mixtures in Probabilistic Prog…
We introduce probabilistic language tries (PLTs), a unified representation that makes explicit the prefix structure implicitly defined by any generative model over sequences. By assigning to each outgoing edge the conditional probability of…
Representing token embeddings as probability distributions over learned manifolds allows for more flexible contextual inference, reducing representational rigidity while enhancing semantic granularity. Comparative evaluations demonstrate…
Probabilistic programming languages and modeling toolkits are two modular ways to build and reuse stochastic models and inference procedures. Combining strengths of both, we express models and inference as generalized coroutines in the same…
The rapid scaling of large language models~(LLMs) has made inference efficiency a primary bottleneck in the practical deployment. To address this, semi-structured sparsity offers a promising solution by strategically retaining $N$ elements…
Understanding how the structure of language can be learned from sentences alone is a central question in both cognitive science and machine learning. Studies of the internal representations of Large Language Models (LLMs) support their…
This paper introduces distribution-based prediction, a novel approach to using Large Language Models (LLMs) as predictive tools by interpreting output token probabilities as distributions representing the models' learned representation of…
We introduce a class of neural networks derived from probabilistic models in the form of Bayesian belief networks. By imposing additional assumptions about the nature of the probabilistic models represented in the belief networks, we derive…
The EM-algorithm is a general procedure to get maximum likelihood estimates if part of the observations on the variables of a network are missing. In this paper a stochastic version of the algorithm is adapted to probabilistic neural…
Word embedding algorithms produce very reliable feature representations of words that are used by neural network models across a constantly growing multitude of NLP tasks. As such, it is imperative for NLP practitioners to understand how…
Constraint Programming (CP) and Machine Learning (ML) face challenges in text generation due to CP's struggle with implementing "meaning'' and ML's difficulty with structural constraints. This paper proposes a solution by combining both…
Autoregressive Large Language Models (LLMs) trained for next-word prediction have demonstrated remarkable proficiency at producing coherent text. But are they equally adept at forming coherent probability judgments? We use probabilistic…
The performance of finetuned large language models (LLMs) hinges critically on the composition of the training mixture. However, selecting an optimal blend of task datasets remains a largely manual, heuristic driven process, with…
Generative Pretrained Transformers (GPTs) are foundational Large Language Models (LLMs) for text generation. However, individual LLMs often produce inconsistent outputs and exhibit biases, limiting their representation of diverse language…
We consider deep multivariate models for heterogeneous collections of random variables. In the context of computer vision, such collections may e.g. consist of images, segmentations, image attributes, and latent variables. When developing…
As multiple crises threaten the sustainability of our societies and pose at risk the planetary boundaries, complex challenges require timely, updated, and usable information. Natural-language processing (NLP) tools enhance and expand data…
Arguing for the need to combine declarative and probabilistic programming, B\'ar\'any et al. (TODS 2017) recently introduced a probabilistic extension of Datalog as a "purely declarative probabilistic programming language." We revisit this…
Large language models are increasingly used to predict human preferences in both scientific and business endeavors, yet current approaches rely exclusively on analyzing model outputs without considering the underlying mechanisms. Using…
Probabilistic programming languages (PPLs) are a popular tool for high-level modelling across many fields. They provide a range of algorithms for probabilistic inference, which analyse models by learning their parameters from a dataset or…
Bayesian Generalized Nonlinear Models (BGNLM) offer a flexible nonlinear alternative to GLM while still providing better interpretability than machine learning techniques such as neural networks. In BGNLM, the methods of Bayesian Variable…
Recent advances in Large Language Models (LLMs) have intensified the debate surrounding the fundamental nature of their reasoning capabilities. While achieving high performance on benchmarks such as GPQA and MMLU, these models exhibit…