Related papers: Backprop as Functor: A compositional perspective o…
In "Backprop as functor", the authors show that the fundamental elements of deep learning -- gradient descent and backpropagation -- can be conceptualized as a strong monoidal functor Para(Euc)$\to$Learn from the category of parameterized…
Simple function classes have emerged as toy problems to better understand in-context-learning in transformer-based architectures used for large language models. But previously proposed simple function classes like linear regression or…
Function regression/approximation is a fundamental application of machine learning. Neural networks (NNs) can be easily trained for function regression using a sufficient number of neurons and epochs. The forward-forward learning algorithm…
Deep learning, despite its remarkable achievements, is still a young field. Like the early stages of many scientific disciplines, it is marked by the discovery of new phenomena, ad-hoc design decisions, and the lack of a uniform and…
Submodular functions are discrete functions that model laws of diminishing returns and enjoy numerous algorithmic applications. They have been used in many areas, including combinatorial optimization, machine learning, and economics. In…
Supervised learning is classically formulated as training a model to minimize a fixed loss function over a fixed distribution, or task. However, an emerging paradigm instead views model training as extracting enough information from data so…
The back-propagation algorithm is the cornerstone of deep learning. Despite its importance, few variations of the algorithm have been attempted. This work presents an approach to discover new variations of the back-propagation equation. We…
While supervised object detection methods achieve impressive accuracy, they generalize poorly to images whose appearance significantly differs from the data they have been trained on. To address this in scenarios where annotating data is…
We propose proximal backpropagation (ProxProp) as a novel algorithm that takes implicit instead of explicit gradient steps to update the network parameters during neural network training. Our algorithm is motivated by the step size…
This paper provides a unifying view of a wide range of problems of interest in machine learning by framing them as the minimization of functionals defined on the space of probability measures. In particular, we show that generative…
Classification is a core topic in functional data analysis. A large number of functional classifiers have been proposed in the literature, most of which are based on functional principal component analysis or functional regression. In…
Representations are fundamental to artificial intelligence. The performance of a learning system depends on the type of representation used for representing the data. Typically, these representations are hand-engineered using domain…
In this paper we propose a novel gradient algorithm to learn a policy from an expert's observed behavior assuming that the expert behaves optimally with respect to some unknown reward function of a Markovian Decision Problem. The…
Estimating hyperparameters has been a long-standing problem in machine learning. We consider the case where the task at hand is modeled as the solution to an optimization problem. Here the exact gradient with respect to the hyperparameters…
We adapt previous research on category theory and topological unsupervised learning to develop a functorial perspective on manifold learning, also known as nonlinear dimensionality reduction. We first characterize manifold learning…
Using backpropagation to compute gradients of objective functions for optimization has remained a mainstay of machine learning. Backpropagation, or reverse-mode differentiation, is a special case within the general family of automatic…
Conformal predictors are an important class of algorithms that allow predictions to be made with a user-defined confidence level. They are able to do this by outputting prediction sets, rather than simple point predictions. The conformal…
We develop a stochastic algorithm for independent component analysis that incorporates multi-trial supervision, which is available in many scientific contexts. The method blends a proximal gradient-type algorithm in the space of invertible…
Backpropagation is a classic automatic differentiation algorithm computing the gradient of functions specified by a certain class of simple, first-order programs, called computational graphs. It is a fundamental tool in several fields, most…
Gradient descent has been a central training principle for artificial neural networks from the early beginnings to today's deep learning networks. The most common implementation is the backpropagation algorithm for training feed-forward…