Related papers: DMAP: A Distribution Map for Text

Large Language Models: A Mathematical Formulation

Large language models (LLMs) process and predict sequences containing text to answer questions, and address tasks including document summarization, providing recommendations, writing software and solving quantitative problems. We provide a…

Numerical Analysis · Mathematics 2026-02-02 Ricardo Baptista, Andrew Stuart, Son Tran

Distributed LLMs and Multimodal Large Language Models: A Survey on Advances, Challenges, and Future Directions

Language models (LMs) are machine learning models designed to predict linguistic patterns by estimating the probability of word sequences based on large-scale datasets, such as text. LMs have a wide range of applications in natural language…

Computation and Language · Computer Science 2025-03-24 Hadi Amini, Md Jueal Mia, Yasaman Saadati, Ahmed Imteaj, Seyedsina Nabavirazavi, Urmish Thakker, Md Zarif Hossain, Awal Ahmed Fime, S. S. Iyengar

Large Language Models: An Applied Econometric Framework

Large language models (LLMs) enable researchers to analyze text at unprecedented scale and minimal cost. Researchers can now revisit old questions and tackle novel ones with rich data. We provide an econometric framework for realizing this…

Econometrics · Economics 2025-12-08 Jens Ludwig, Sendhil Mullainathan, Ashesh Rambachan

How to use LLMs for Text Analysis

This guide introduces Large Language Models (LLM) as a highly versatile text analysis method within the social sciences. As LLMs are easy-to-use, cheap, fast, and applicable on a broad range of text analysis tasks, ranging from text…

Computation and Language · Computer Science 2023-07-26 Petter Törnberg

LML-DAP: Language Model Learning a Dataset for Data-Augmented Prediction

Classification tasks are typically handled using Machine Learning (ML) models, which lack a balance between accuracy and interpretability. This paper introduces a new approach for classification tasks using Large Language Models (LLMs) in…

Computation and Language · Computer Science 2025-01-03 Praneeth Vadlapati

LLM Processes: Numerical Predictive Distributions Conditioned on Natural Language

Machine learning practitioners often face significant challenges in formally integrating their prior knowledge and beliefs into predictive models, limiting the potential for nuanced and context-aware analyses. Moreover, the expertise needed…

Machine Learning · Statistics 2024-12-23 James Requeima, John Bronskill, Dami Choi, Richard E. Turner, David Duvenaud

Exploration of Masked and Causal Language Modelling for Text Generation

Large Language Models (LLMs) have revolutionised the field of Natural Language Processing (NLP) and have achieved state-of-the-art performance in practically every task in this field. However, the prevalent approach used in text generation,…

Computation and Language · Computer Science 2024-08-12 Nicolo Micheletti, Samuel Belkadi, Lifeng Han, Goran Nenadic

Adaptable and Reliable Text Classification using Large Language Models

Text classification is fundamental in Natural Language Processing (NLP), and the advent of Large Language Models (LLMs) has revolutionized the field. This paper introduces an adaptable and reliable text classification paradigm, which…

Computation and Language · Computer Science 2024-12-10 Zhiqiang Wang, Yiran Pang, Yanbin Lin, Xingquan Zhu

Specializing Large Language Models to Simulate Survey Response Distributions for Global Populations

Large-scale surveys are essential tools for informing social science research and policy, but running surveys is costly and time-intensive. If we could accurately simulate group-level survey results, this would therefore be very valuable to…

Computation and Language · Computer Science 2025-02-20 Yong Cao, Haijiang Liu, Arnav Arora, Isabelle Augenstein, Paul Röttger, Daniel Hershcovich

LLM Generated Distribution-Based Prediction of US Electoral Results, Part I

This paper introduces distribution-based prediction, a novel approach to using Large Language Models (LLMs) as predictive tools by interpreting output token probabilities as distributions representing the models' learned representation of…

Artificial Intelligence · Computer Science 2024-11-07 Caleb Bradshaw, Caelen Miller, Sean Warnick

LaMPP: Language Models as Probabilistic Priors for Perception and Action

Language models trained on large text corpora encode rich distributional information about real-world environments and action sequences. This information plays a crucial role in current approaches to language processing tasks like question…

Machine Learning · Computer Science 2023-02-07 Belinda Z. Li, William Chen, Pratyusha Sharma, Jacob Andreas

Beyond Next-Token Prediction: A Performance Characterization of Diffusion versus Autoregressive Language Models

Large Language Models (LLMs) have achieved state-of-the-art performance on a broad range of Natural Language Processing (NLP) tasks, including document processing and code generation. Autoregressive Language Models (ARMs), which generate…

Machine Learning · Computer Science 2025-12-16 Minseo Kim, Coleman Hooper, Aditya Tomar, Chenfeng Xu, Mehrdad Farajtabar, Michael W. Mahoney, Kurt Keutzer, Amir Gholami

What Are the Odds? Language Models Are Capable of Probabilistic Reasoning

Language models (LM) are capable of remarkably complex linguistic tasks; however, numerical reasoning is an area in which they frequently struggle. An important but rarely evaluated form of reasoning is understanding probability…

Computation and Language · Computer Science 2024-10-01 Akshay Paruchuri, Jake Garrison, Shun Liao, John Hernandez, Jacob Sunshine, Tim Althoff, Xin Liu, Daniel McDuff

Tag Map: A Text-Based Map for Spatial Reasoning and Navigation with Large Language Models

Large Language Models (LLM) have emerged as a tool for robots to generate task plans using common sense reasoning. For the LLM to generate actionable plans, scene context must be provided, often through a map. Recent works have shifted from…

Robotics · Computer Science 2024-09-25 Mike Zhang, Kaixian Qu, Vaishakh Patil, Cesar Cadena, Marco Hutter

Every Step Counts: Decoding Trajectories as Authorship Fingerprints of dLLMs

Discrete Diffusion Large Language Models (dLLMs) have recently emerged as a competitive paradigm for non-autoregressive language modeling. Their distinctive decoding mechanism enables faster inference speed and strong performance in code…

Computation and Language · Computer Science 2025-10-08 Qi Li, Runpeng Yu, Haiquan Lu, Xinchao Wang

Large Language Models For Text Classification: Case Study And Comprehensive Review

Unlocking the potential of Large Language Models (LLMs) in data classification represents a promising frontier in natural language processing. In this work, we evaluate the performance of different LLMs in comparison with state-of-the-art…

Computation and Language · Computer Science 2025-01-16 Arina Kostina, Marios D. Dikaiakos, Dimosthenis Stefanidis, George Pallis

A Note on Statistically Accurate Tabular Data Generation Using Large Language Models

Large language models (LLMs) have shown promise in synthetic tabular data generation, yet existing methods struggle to preserve complex feature dependencies, particularly among categorical variables. This work introduces a…

Machine Learning · Computer Science 2025-05-07 Andrey Sidorenko

Beyond the Black Box: A Statistical Model for LLM Reasoning and Inference

This paper introduces a novel Bayesian learning model to explain the behavior of Large Language Models (LLMs), focusing on their core optimization metric of next token prediction. We develop a theoretical framework based on an ideal…

Machine Learning · Computer Science 2024-09-25 Siddhartha Dalal, Vishal Misra

Do LLMs Play Dice? Exploring Probability Distribution Sampling in Large Language Models for Behavioral Simulation

With the rapid advancement of large language models (LLMs) for handling complex language tasks, an increasing number of studies are employing LLMs as agents to emulate the sequential decision-making processes of humans often represented as…

Computation and Language · Computer Science 2024-12-19 Jia Gu, Liang Pang, Huawei Shen, Xueqi Cheng

Language Model Fine-Tuning on Scaled Survey Data for Predicting Distributions of Public Opinions

Large language models (LLMs) present novel opportunities in public opinion research by predicting survey responses in advance during the early stages of survey design. Prior methods steer LLMs via descriptions of subpopulations as LLMs'…

Computation and Language · Computer Science 2026-04-17 Joseph Suh, Erfan Jahanparast, Suhong Moon, Minwoo Kang, Serina Chang