Christopher Mattern

Language Modeling Is Compression

It has long been established that predictive models can be transformed into lossless compressors and vice versa. Incidentally, in recent years, the machine learning community has focused on training increasingly large and powerful…

Machine Learning · Computer Science 2024-03-20 Grégoire Delétang, Anian Ruoss, Paul-Ambroise Duquenne, Elliot Catt, Tim Genewein, Christopher Mattern, Jordi Grau-Moya, Li Kevin Wenliang, Matthew Aitchison, Laurent Orseau, Marcus Hutter, Joel Veness

Learning Universal Predictors

Meta-learning has emerged as a powerful approach to train neural networks to learn new tasks quickly from limited data. Broad exposure to different tasks leads to versatile representations enabling general problem solving. But, what are the…

Machine Learning · Computer Science 2024-01-29 Jordi Grau-Moya, Tim Genewein, Marcus Hutter, Laurent Orseau, Grégoire Delétang, Elliot Catt, Anian Ruoss, Li Kevin Wenliang, Christopher Mattern, Matthew Aitchison, Joel Veness

Hierarchical Partitioning Forecaster

In this work we consider a new family of algorithms for sequential prediction, Hierarchical Partitioning Forecasters (HPFs). Our goal is to provide appealing theoretical - regret guarantees on a powerful model class - and practical -…

Machine Learning · Computer Science 2023-05-23 Christopher Mattern

Gated Linear Networks

This paper presents a new family of backpropagation-free neural architectures, Gated Linear Networks (GLNs). What distinguishes GLNs from contemporary neural networks is the distributed and local nature of their credit assignment mechanism;…

Machine Learning · Computer Science 2020-06-12 Joel Veness, Tor Lattimore, David Budden, Avishkar Bhoopchand, Christopher Mattern, Agnieszka Grabska-Barwinska, Eren Sezener, Jianan Wang, Peter Toth, Simon Schmitt, Marcus Hutter

Generalized Probability Smoothing

In this work we consider a generalized version of Probability Smoothing, the core elementary model for sequential prediction in the state-of-the-art PAQ family of data compression algorithms. Our main contribution is a code length analysis…

Information Theory · Computer Science 2018-01-11 Christopher Mattern

Online Learning with Gated Linear Networks

This paper describes a family of probabilistic architectures designed for online learning under the logarithmic loss. Rather than relying on non-linear transfer functions, our method gains representational power by the use of data…

Machine Learning · Computer Science 2017-12-07 Joel Veness, Tor Lattimore, Avishkar Bhoopchand, Agnieszka Grabska-Barwinska, Christopher Mattern, Peter Toth

On Probability Estimation by Exponential Smoothing

Probability estimation is essential for every statistical data compression algorithm. In practice, probability estimation should be adaptive: recent observations should receive a higher weight than older observations. We present a…

Information Theory · Computer Science 2015-01-12 Christopher Mattern

On Probability Estimation via Relative Frequencies and Discount

Probability estimation is an elementary building block of every statistical data compression algorithm. In practice, probability estimation is often based on relative letter frequencies, which get scaled down when their sum is too large.…

Information Theory · Computer Science 2015-01-12 Christopher Mattern

Combining non-stationary prediction, optimization and mixing for data compression

In this paper an approach to modelling nonstationary binary sequences, i.e., predicting the probability of upcoming symbols, is presented. After studying the prediction model we evaluate its performance in two non-artificial test cases.…

Information Theory · Computer Science 2013-02-13 Christopher Mattern

Mixing Strategies in Data Compression

We propose geometric weighting as a novel method to combine multiple models in data compression. Our results reveal the rationale behind PAQ-weighting and generalize it to a non-binary alphabet. Based on a similar technique we present a…

Information Theory · Computer Science 2013-02-13 Christopher Mattern

Linear and Geometric Mixtures - Analysis

Linear and geometric mixtures are two methods to combine arbitrary models in data compression. Geometric mixtures generalize the empirically well-performing PAQ7 mixture. Both mixture schemes rely on weight vectors, which heavily determine…

Information Theory · Computer Science 2013-02-13 Christopher Mattern