Marcus Hutter
Frontier models can be prompted or conditioned to do many tasks, but finding good prompts is not always easy, nor is understanding some performant prompts. We view prompting as finding the best conditioning sequence on a near-optimal…
Spurious correlations are common in time-series analysis because simple, low-complexity patterns can produce high Pearson correlations even between unrelated series. We argue that Kolmogorov complexity, interpreted as resistance to…
We generalize the AIXI reinforcement learning agent to admit a wider class of utility functions. Assigning a utility to each possible interaction history forces us to confront the ambiguity that some hypotheses in the agent's belief…
The standard theory of model-free reinforcement learning assumes that the environment dynamics are stationary and that agents are decoupled from their environment, such that policies are treated as being separate from the world they…
Making accurate inferences about data is a key task in science and mathematics. Here we study the problem of \emph{retrodiction}, inferring past values of a series, in the context of chaotic dynamical systems. Specifically, we are…
Prompting is one of the main ways to adapt a pretrained model to target tasks. Besides manually constructing prompts, many prompt optimization methods have been proposed in the literature. Method development is mainly empirically driven,…
A Bayesian player acting in an infinite multi-player game learns to predict the other players' strategies if his prior assigns positive probability to their play (or contains a grain of truth). Kalai and Lehrer's classic grain of truth…
The domain-independent universal Normalized Information Distance based on Kolmogorov complexity has been (in approximate form) successfully applied to a variety of difficult clustering problems. In this paper we investigate theoretical…
We rigorously discuss the commonly asserted failures of the AIXI reinforcement learning agent as a model of embedded agency. We attempt to formalize these failure modes and prove that they occur within the framework of universal artificial…
Levin Tree Search (LTS) (Orseau et al., 2018) is a search algorithm for deterministic environments that uses a user-specified policy to guide the search. It comes with a formal guarantee on the number of search steps (node visits) for…
This paper considers a generalisation of universal source coding for interaction data, namely data streams that have actions interleaved with observations. Our goal will be to construct a coding distribution that is both universal…
G\'acs' coarse-grained algorithmic entropy leverages universal computation to quantify the information content of any given physical state. Unlike the Boltzmann and Gibbs-Shannon entropies, it requires no prior commitment to macrovariables…
Although much of the success of Deep Learning builds on learning good representations, a rigorous method to evaluate their quality is lacking. In this paper, we treat the evaluation of representations as a model selection problem and…
Levin Tree Search (LTS) is a search algorithm that makes use of a policy (a probability distribution over actions) and comes with a theoretical guarantee on the number of expansions before reaching a goal node, depending on the quality of…
We extend and combine several tools of the literature to design fast, adaptive, anytime and scale-free online learning algorithms. Scale-free regret bounds must scale linearly with the maximum loss, both toward large losses and toward very…
In reinforcement learning, if the agent's reward differs from the designers' true utility, even only rarely, the state distribution resulting from the agent's policy can be very bad, in theory and in practice. When RL policies would devolve…
Machine Learning (ML) and Algorithmic Information Theory (AIT) look at Complexity from different points of view. We explore the interface between AIT and Kernel Methods (that are prevalent in ML) by adopting an AIT perspective on the…
To understand the risks posed by a new AI system, we must understand what it can and cannot do. Building on prior work, we introduce a programme of new "dangerous capability" evaluations and pilot them on Gemini 1.0 models. Our evaluations…
It has long been established that predictive models can be transformed into lossless compressors and vice versa. Incidentally, in recent years, the machine learning community has focused on training increasingly large and powerful…
We consider the problem of online fine tuning the parameters of a language model at test time, also known as dynamic evaluation. While it is generally known that this approach improves the overall predictive performance, especially when…