Related papers: Minimum Description Length Revisited
This is about the Minimum Description Length (MDL) principle applied to pattern mining. The length of this description is kept to the minimum. Mining patterns is a core task in data analysis and, beyond issues of efficient enumeration, the…
We analyze differences between two information-theoretically motivated approaches to statistical inference and model selection: the Minimum Description Length (MDL) principle, and the Minimum Message Length (MML) principle. Based on this…
The relationship between the Bayesian approach and the minimum description length approach is established. We sharpen and clarify the general modeling principles MDL and MML, abstracted as the ideal MDL principle and defined from Bayes's…
In the signal processing and statistics literature, the minimum description length (MDL) principle is a popular tool for choosing model complexity. Successful examples include signal denoising and variable selection in linear regression,…
We propose a novel framework for multitask reinforcement learning based on the minimum description length (MDL) principle. In this approach, which we term MDL-control (MDL-C), the agent learns the common structure among the tasks with which…
The Minimum Description Length (MDL) principle selects the model that has the shortest code for data plus model. We show that for a countable class of models, MDL predictions are close to the true distribution in a strong sense. The result…
This paper introduces a new method for model selection and more generally hyperparameter selection in machine learning. Minimum description length (MDL) is an established method for model selection, which is however not directly aimed at…
State-of-the-art neural networks can be trained to become remarkable solutions to many problems. But while these architectures can express symbolic, perfect solutions, trained models often arrive at approximations instead. We show that the…
Many regression problems involve not one but several response variables (y's). Often the responses are suspected to share a common underlying structure, in which case it may be advantageous to share information across them; this is known as…
We study the properties of the Minimum Description Length principle for sequence prediction, considering a two-part MDL estimator which is chosen from a countable class of models. This applies in particular to the important case of…
A major challenge in designing efficient statistical supervised learning algorithms is finding representations that perform well not only on available training samples but also on unseen data. While the study of representation learning has…
Minimum Description Length (MDL) is an important principle for induction and prediction, with strong relations to optimal Bayesian learning. This paper deals with learning non-i.i.d. processes by means of two-part MDL, where the underlying…
In previous work we developed a method of learning Bayesian Network models from raw data. This method relies on the well-known minimal description length (MDL) principle. The MDL principle is particularly well suited to this task as it…
The minimum description length (MDL) principle in supervised learning is studied. One of the most important theories for the MDL principle is Barron and Cover's theory (BC theory), which gives a mathematical justification of the MDL…
To measure how well pretrained representations encode some linguistic property, it is common to use the accuracy of a probe, i.e., a classifier trained to predict the property from the representations. Despite widespread adoption of probes,…
Deep neural networks trained through end-to-end learning have achieved remarkable success across various domains in the past decade. However, the end-to-end learning strategy, originally designed to minimize predictive loss in a black-box…
The Minimum Description Length principle for online sequence estimation/prediction in a proper learning setup is studied. If the underlying model class is discrete, then the total expected square loss is a particularly interesting…
Recent sequential pattern mining methods have used the minimum description length (MDL) principle to define an encoding scheme which describes an algorithm for mining the most compressing patterns in a database. We present a novel…
Over the years, ensemble methods have become a staple of machine learning. Similarly, generalized linear models (GLMs) have become very popular for a wide variety of statistical inference tasks. The former have been shown to enhance out-…
Complexity is a fundamental concept underlying statistical learning theory that aims to inform generalization performance. Parameter count, while successful in low-dimensional settings, is not well-justified for overparameterized settings…