Related papers: Learners' Languages

Backprop as Functor: A compositional perspective on supervised learning

A supervised learning algorithm searches over a set of functions $A \to B$ parametrised by a space $P$ to find the best approximation to some ideal function $f\colon A \to B$. It does this by taking examples $(a,f(a)) \in A\times B$, and…

Category Theory · Mathematics 2019-05-02 Brendan Fong, David I. Spivak, Rémy Tuyéras

Deep learning for pedestrians: backpropagation in CNNs

The goal of this document is to provide a pedagogical introduction to the main concepts underpinning the training of deep neural networks using gradient descent, a process known as backpropagation. Although we focus on a very influential…

Machine Learning · Computer Science 2018-11-30 Laurent Boué

Deep Learning with Parametric Lenses

We propose a categorical semantics for machine learning algorithms in terms of lenses, parametric maps, and reverse derivative categories. This foundation provides a powerful explanatory and unifying framework: it encompasses a variety of…

Machine Learning · Computer Science 2024-04-02 Geoffrey S. H. Cruttwell, Bruno Gavranovic, Neil Ghani, Paul Wilson, Fabio Zanasi

Lenses and Learners

Lenses are a well-established structure for modelling bidirectional transformations, such as the interactions between a database and a view of it. Lenses may be symmetric or asymmetric, and may be composed, forming the morphisms of a…

Machine Learning · Computer Science 2019-05-03 Brendan Fong, Michael Johnson

Fundamental Components of Deep Learning: A category-theoretic approach

Deep learning, despite its remarkable achievements, is still a young field. Like the early stages of many scientific disciplines, it is marked by the discovery of new phenomena, ad-hoc design decisions, and the lack of a uniform and…

Machine Learning · Computer Science 2024-03-21 Bruno Gavranović

Deep Learning: A Bayesian Perspective

Deep learning is a form of machine learning for nonlinear high dimensional pattern matching and prediction. By taking a Bayesian probabilistic perspective, we provide a number of insights into more efficient algorithms for optimisation and…

Machine Learning · Statistics 2018-01-23 Nicholas Polson, Vadim Sokolov

Semantics, Representations and Grammars for Deep Learning

Deep learning is currently the subject of intensive study. However, fundamental concepts such as representations are not formally defined -- researchers "know them when they see them" -- and there is no common language for describing and…

Machine Learning · Computer Science 2015-09-30 David Balduzzi

Pulling Back the Curtain on Deep Networks

In linear models, visualizing a weight vector naturally reveals the model's preferred input direction, but extending this intuition to deep networks via gradients or gradient ascent often yields brittle or adversarial-looking features. We…

Machine Learning · Computer Science 2026-05-08 Maciej Satkiewicz, Roberto Corizzo, Marcin Pietroń

Categorical Invariants of Learning Dynamics

Neural network training is typically viewed as gradient descent on a loss surface. We propose a fundamentally different perspective: learning is a structure-preserving transformation (a functor L) between the space of network parameters…

Machine Learning · Computer Science 2025-10-07 Abdulrahman Tamim

Polynomial Regression as a Task for Understanding In-context Learning Through Finetuning and Alignment

Simple function classes have emerged as toy problems to better understand in-context-learning in transformer-based architectures used for large language models. But previously proposed simple function classes like linear regression or…

Machine Learning · Computer Science 2024-07-30 Max Wilcoxson, Morten Svendgård, Ria Doshi, Dylan Davis, Reya Vir, Anant Sahai

General supervised learning as change propagation with delta lenses

Delta lenses are an established mathematical framework for modelling and designing bidirectional model transformations. Following the recent observations by Fong et al., the paper extends the delta lens framework with a new ingredient:…

Logic in Computer Science · Computer Science 2021-07-12 Zinovy Diskin

Backward Lens: Projecting Language Model Gradients into the Vocabulary Space

Understanding how Transformer-based Language Models (LMs) learn and recall information is a key goal of the deep learning community. Recent interpretability methods project weights and hidden states obtained from the forward pass to the…

Computation and Language · Computer Science 2024-02-21 Shahar Katz, Yonatan Belinkov, Mor Geva, Lior Wolf

Backward Feature Correction: How Deep Learning Performs Deep (Hierarchical) Learning

Deep learning is also known as hierarchical learning, where the learner _learns_ to represent a complicated target function by decomposing it into a sequence of simpler functions to reduce sample and time complexity. This paper formally…

Machine Learning · Computer Science 2023-07-10 Zeyuan Allen-Zhu, Yuanzhi Li

Backpropagation in the Simply Typed Lambda-calculus with Linear Negation

Backpropagation is a classic automatic differentiation algorithm computing the gradient of functions specified by a certain class of simple, first-order programs, called computational graphs. It is a fundamental tool in several fields, most…

Logic in Computer Science · Computer Science 2019-11-07 Alois Brunel, Damiano Mazza, Michele Pagani

An Algorithm for Training Polynomial Networks

We consider deep neural networks, in which the output of each node is a quadratic function of its inputs. Similar to other deep architectures, these networks can compactly represent any function on a finite training set. The main goal of…

Machine Learning · Computer Science 2014-02-21 Roi Livni, Shai Shalev-Shwartz, Ohad Shamir

Challenge of Spatial Cognition for Deep Learning

Given the success of the deep convolutional neural networks (DCNNs) in applications of visual recognition and classification, it would be tantalizing to test if DCNNs can also learn spatial concepts, such as straightness, convexity,…

Computer Vision and Pattern Recognition · Computer Science 2020-05-13 Xi Zhang, Xiaolin Wu, Jun Du

A Survey on State-of-the-art Deep Learning Applications and Challenges

Deep learning, a branch of artificial intelligence, is a data-driven method that uses multiple layers of interconnected units or neurons to learn intricate patterns and representations directly from raw input data. Empowered by this…

Machine Learning · Computer Science 2025-07-28 Mohd Halim Mohd Noor, Ayokunle Olalekan Ige

Deep Learning and Geometric Deep Learning: an introduction for mathematicians and physicists

In this expository paper we want to give a brief introduction, with a few key references for further reading, to the inner functioning of the new and successful algorithms of Deep Learning and Geometric Deep Learning with a focus on Graph…

Machine Learning · Computer Science 2023-05-10 R. Fioresi, F. Zanchetta

Deep Learning: An Introduction for Applied Mathematicians

Multilayered artificial neural networks are becoming a pervasive tool in a host of application fields. At the heart of this deep learning revolution are familiar concepts from applied and computational mathematics; notably, in calculus,…

History and Overview · Mathematics 2018-01-19 Catherine F. Higham, Desmond J. Higham

Categorical Foundations of Gradient-Based Learning

We propose a categorical semantics of gradient-based machine learning algorithms in terms of lenses, parametrised maps, and reverse derivative categories. This foundation provides a powerful explanatory and unifying framework: it…

Machine Learning · Computer Science 2021-07-14 G. S. H. Cruttwell, Bruno Gavranović, Neil Ghani, Paul Wilson, Fabio Zanasi