Related papers: Statistical Performance Analysis of MDL Source Enu…

Minimum Encoding Approaches for Predictive Modeling

We analyze differences between two information-theoretically motivated approaches to statistical inference and model selection: the Minimum Description Length (MDL) principle, and the Minimum Message Length (MML) principle. Based on this…

Machine Learning · Computer Science 2013-02-01 Peter D Grunwald, Petri Kontkanen, Petri Myllymaki, Tomi Silander, Henry Tirri

Differential Description Length for Hyperparameter Selection in Machine Learning

This paper introduces a new method for model selection and more generally hyperparameter selection in machine learning. Minimum description length (MDL) is an established method for model selection, which is however not directly aimed at…

Machine Learning · Computer Science 2019-05-23 Mojtaba Abolfazli, Anders Host-Madsen, June Zhang

The Minimum Description Length Principle for Pattern Mining: A Survey

This is about the Minimum Description Length (MDL) principle applied to pattern mining. The length of this description is kept to the minimum. Mining patterns is a core task in data analysis and, beyond issues of efficient enumeration, the…

Databases · Computer Science 2022-07-29 Esther Galbrun

Extending the Use of MDL for High-Dimensional Problems: Variable Selection, Robust Fitting, and Additive Modeling

In the signal processing and statistics literature, the minimum description length (MDL) principle is a popular tool for choosing model complexity. Successful examples include signal denoising and variable selection in linear regression,…

Signal Processing · Electrical Eng. & Systems 2022-01-28 Zhenyu Wei, Raymond K. W. Wong, Thomas C. M. Lee

Minimum Description Length Revisited

This is an up-to-date introduction to and overview of the Minimum Description Length (MDL) Principle, a theory of inductive inference that can be applied to general problems in statistics, machine learning and pattern recognition. While MDL…

Methodology · Statistics 2019-12-19 Peter Grünwald, Teemu Roos

Low-rank data modeling via the Minimum Description Length principle

Robust low-rank matrix estimation is a topic of increasing interest, with promising applications in a variety of fields, from computer vision to data mining and recommender systems. Recent theoretical results establish the ability of such…

Information Theory · Computer Science 2011-09-29 Ignacio Ramírez, Guillermo Sapiro

An MDL framework for sparse coding and dictionary learning

The power of sparse signal modeling with learned over-complete dictionaries has been demonstrated in a variety of applications and fields, from signal processing to statistical inference and machine learning. However, the statistical…

Information Theory · Computer Science 2017-04-26 Ignacio Ramírez, Guillermo Sapiro

Sparse coding and dictionary learning based on the MDL principle

The power of sparse signal coding with learned dictionaries has been demonstrated in a variety of applications and fields, from signal processing to statistical inference and machine learning. However, the statistical properties of these…

Information Theory · Computer Science 2010-10-25 Ignacio Ramírez, Guillermo Sapiro

Minimum Description Length codes are critical

In the Minimum Description Length (MDL) principle, learning from the data is equivalent to an optimal coding problem. We show that the codes that achieve optimal compression in MDL are critical in a very precise sense. First, when they are…

Methodology · Statistics 2018-10-03 Ryan John Cubero, Matteo Marsili, Yasser Roudi

SiML: Sieved Maximum Likelihood for Array Signal Processing

Stochastic Maximum Likelihood (SML) is a popular direction of arrival (DOA) estimation technique in array signal processing. It is a parametric method that jointly estimates signal and instrument noise by maximum likelihood, achieving…

Applications · Statistics 2021-02-04 Matthieu Simeoni, Paul Hurley

An MDL-Based Classifier for Transactional Datasets with Application in Malware Detection

We design a classifier for transactional datasets with application in malware detection. We build the classifier based on the minimum description length (MDL) principle. This involves selecting a model that best compresses the training…

Machine Learning · Computer Science 2019-12-12 Behzad Asadi, Vijay Varadharajan

Efficient Encoding of Dynamical Systems through Local Approximations

An efficient representation of observed data has many benefits in various domains of engineering and science. Representing static data sets, such as images, is a living branch in machine learning and eases downstream tasks, such as…

Systems and Control · Computer Science 2018-09-28 Friedrich Solowjow, Arash Mehrjou, Bernhard Schölkopf, Sebastian Trimpe

Network Model Selection Using Task-Focused Minimum Description Length

Networks are fundamental models for data used in practically every application domain. In most instances, several implicit or explicit choices about the network definition impact the translation of underlying data to a network…

Artificial Intelligence · Computer Science 2018-01-12 Ivan Brugere, Tanya Y. Berger-Wolf

Exploring LLM Features in Predictive Process Monitoring for Small-Scale Event-Logs

Predictive Process Monitoring is a branch of process mining that aims to predict the outcome of an ongoing process. Recently, it leveraged machine-and-deep learning architectures. In this paper, we extend our prior LLM-based Predictive…

Artificial Intelligence · Computer Science 2026-01-19 Alessandro Padella, Massimiliano de Leoni, Marlon Dumas

A Minimum Description Length Approach to Regularization in Neural Networks

State-of-the-art neural networks can be trained to become remarkable solutions to many problems. But while these architectures can express symbolic, perfect solutions, trained models often arrive at approximations instead. We show that the…

Machine Learning · Computer Science 2025-09-09 Matan Abudy, Orr Well, Emmanuel Chemla, Roni Katzir, Nur Lan

Information-Theoretic Probing with Minimum Description Length

To measure how well pretrained representations encode some linguistic property, it is common to use accuracy of a probe, i.e. a classifier trained to predict the property from the representations. Despite widespread adoption of probes,…

Computation and Language · Computer Science 2020-03-30 Elena Voita, Ivan Titov

MDL-motivated compression of GLM ensembles increases interpretability and retains predictive power

Over the years, ensemble methods have become a staple of machine learning. Similarly, generalized linear models (GLMs) have become very popular for a wide variety of statistical inference tasks. The former have been shown to enhance out-…

Machine Learning · Statistics 2016-11-22 Boris Hayete, Matthew Valko, Alex Greenfield, Raymond Yan

Wideband Source Enumeration Using Sparse Array Periodogram Averaging in Low Snapshot Scenarios

This paper proposes a new sparse array source enumeration algorithm for underdetermined scenarios with more sources than sensors. The proposed algorithm decomposes the wideband signals into multiple uncorrelated frequency bands, computes…

Signal Processing · Electrical Eng. & Systems 2019-12-30 Yang Liu, John R. Buck

Evaluating LLM-Based Process Explanations under Progressive Behavioral-Input Reduction

Large Language Models (LLMs) are increasingly used to generate textual explanations of process models discovered from event logs. Producing explanations from large behavioral abstractions (e.g., directly-follows graphs or Petri nets) can be…

Machine Learning · Computer Science 2025-10-14 P. van Oerle, R. H. Bemthuis, F. A. Bukhsh

MDL Convergence Speed for Bernoulli Sequences

The Minimum Description Length principle for online sequence estimation/prediction in a proper learning setup is studied. If the underlying model class is discrete, then the total expected square loss is a particularly interesting…

Statistics Theory · Mathematics 2007-07-16 Jan Poland, Marcus Hutter