Christopher Mattern
It has long been established that predictive models can be transformed into lossless compressors and vice versa. Incidentally, in recent years, the machine learning community has focused on training increasingly large and powerful…
Meta-learning has emerged as a powerful approach to train neural networks to learn new tasks quickly from limited data. Broad exposure to different tasks leads to versatile representations enabling general problem solving. But, what are the…
In this work we consider a new family of algorithms for sequential prediction, Hierarchical Partitioning Forecasters (HPFs). Our goal is to provide appealing theoretical - regret guarantees on a powerful model class - and practical -…
This paper presents a new family of backpropagation-free neural architectures, Gated Linear Networks (GLNs). What distinguishes GLNs from contemporary neural networks is the distributed and local nature of their credit assignment mechanism;…
In this work we consider a generalized version of Probability Smoothing, the core elementary model for sequential prediction in the state of the art PAQ family of data compression algorithms. Our main contribution is a code length analysis…
This paper describes a family of probabilistic architectures designed for online learning under the logarithmic loss. Rather than relying on non-linear transfer functions, our method gains representational power by the use of data…
Probability estimation is essential for every statistical data compression algorithm. In practice probability estimation should be adaptive, recent observations should receive a higher weight than older observations. We present a…
Probability estimation is an elementary building block of every statistical data compression algorithm. In practice probability estimation is often based on relative letter frequencies which get scaled down, when their sum is too large.…
In this paper an approach to modelling nonstationary binary sequences, i.e., predicting the probability of upcoming symbols, is presented. After studying the prediction model we evaluate its performance in two non-artificial test cases.…
We propose geometric weighting as a novel method to combine multiple models in data compression. Our results reveal the rationale behind PAQ-weighting and generalize it to a non-binary alphabet. Based on a similar technique we present a…
Linear and geometric mixtures are two methods to combine arbitrary models in data compression. Geometric mixtures generalize the empirically well-performing PAQ7 mixture. Both mixture schemes rely on weight vectors, which heavily determine…