Related papers: Beyond Transformers for Function Learning

A Self-Supervised Framework for Function Learning and Extrapolation

Understanding how agents learn to generalize -- and, in particular, to extrapolate -- in high-dimensional, naturalistic environments remains a challenge for both machine learning and the study of biological agents. One approach to this has…

Machine Learning · Computer Science 2021-06-15 Simon N. Segert, Jonathan D. Cohen

Inductive Biases for Deep Learning of Higher-Level Cognition

A fascinating hypothesis is that human and animal intelligence could be explained by a few principles (rather than an encyclopedic list of heuristics). If that hypothesis was correct, we could more easily both understand our own…

Machine Learning · Computer Science 2022-08-02 Anirudh Goyal, Yoshua Bengio

Beyond Interpolation: Extrapolative Reasoning with Reinforcement Learning and Graph Neural Networks

Despite incredible progress, many neural architectures fail to properly generalize beyond their training distribution. As such, learning to reason in a correct and generalizable way is one of the current fundamental challenges in machine…

Machine Learning · Computer Science 2025-02-10 Niccolò Grillo, Andrea Toccaceli, Joël Mathys, Benjamin Estermann, Stefania Fresca, Roger Wattenhofer

A Reservoir Model of Explicit Human Intelligence

A fundamental feature of human intelligence is that we accumulate and transfer knowledge as a society and across generations. We describe here a network architecture for the human brain that may support this feature and suggest that two key…

Neurons and Cognition · Quantitative Biology 2022-07-19 Eric C. Wong

The neural and cognitive architecture for learning from a small sample

Artificial intelligence algorithms are capable of fantastic exploits, yet they are still grossly inefficient compared with the brain's ability to learn from few exemplars or solve problems that have not been explicitly defined. What is the…

Neurons and Cognition · Quantitative Biology 2018-10-08 Aurelio Cortese, Benedetto De Martino, Mitsuo Kawato

Using Natural Language and Program Abstractions to Instill Human Inductive Biases in Machines

Strong inductive biases give humans the ability to quickly learn to perform a variety of tasks. Although meta-learning is a method to endow neural networks with useful inductive biases, agents trained by meta-learning may sometimes acquire…

Artificial Intelligence · Computer Science 2023-02-07 Sreejan Kumar, Carlos G. Correa, Ishita Dasgupta, Raja Marjieh, Michael Y. Hu, Robert D. Hawkins, Nathaniel D. Daw, Jonathan D. Cohen, Karthik Narasimhan, Thomas L. Griffiths

Learning Representations that Support Extrapolation

Extrapolation -- the ability to make inferences that go beyond the scope of one's experiences -- is a hallmark of human intelligence. By contrast, the generalization exhibited by contemporary neural network algorithms is largely limited to…

Computer Vision and Pattern Recognition · Computer Science 2023-09-08 Taylor W. Webb, Zachary Dulberg, Steven M. Frankland, Alexander A. Petrov, Randall C. O'Reilly, Jonathan D. Cohen

On the Bias Against Inductive Biases

Borrowing from the transformer models that revolutionized the field of natural language processing, self-supervised feature learning for visual tasks has also seen state-of-the-art success using these extremely deep, isotropic networks.…

Computer Vision and Pattern Recognition · Computer Science 2021-06-01 George Cazenavette, Simon Lucey

Why are Sensitive Functions Hard for Transformers?

Empirical studies have identified a range of learnability biases and limitations of transformers, such as a persistent difficulty in learning to compute simple formal languages such as PARITY, and a bias towards low-degree functions.…

Machine Learning · Computer Science 2024-05-28 Michael Hahn, Mark Rofin

Applications of the Transformer Architecture in AI-Assisted English Reading Comprehension

This paper studies interpretable and fair artificial intelligence architectures for understanding English reading. Introduced transformer-based models, integrating advanced attention mechanisms and gradient-based feature attribution. The…

Computation and Language · Computer Science 2026-04-28 Ping Li

Explicit Inductive Bias for Transfer Learning with Convolutional Networks

In inductive transfer learning, fine-tuning pre-trained convolutional networks substantially outperforms training from scratch. When using fine-tuning, the underlying assumption is that the pre-trained model extracts generic features, which…

Machine Learning · Computer Science 2018-06-07 Xuhong Li, Yves Grandvalet, Franck Davoine

Strengthening Structural Inductive Biases by Pre-training to Perform Syntactic Transformations

Models need appropriate inductive biases to effectively learn from small amounts of data and generalize systematically outside of the training distribution. While Transformers are highly versatile and powerful, they can still benefit from…

Computation and Language · Computer Science 2024-07-08 Matthias Lindemann, Alexander Koller, Ivan Titov

Learning to Extrapolate: A Transductive Approach

Machine learning systems, especially with overparameterized deep neural networks, can generalize to novel test instances drawn from the same distribution as the training data. However, they fare poorly when evaluated on out-of-support test…

Machine Learning · Computer Science 2023-04-28 Aviv Netanyahu, Abhishek Gupta, Max Simchowitz, Kaiqing Zhang, Pulkit Agrawal

Learning Inductive Biases with Simple Neural Networks

People use rich prior knowledge about the world in order to efficiently learn new concepts. These priors - also known as "inductive biases" - pertain to the space of internal models considered by a learner, and they help the learner make…

Computation and Language · Computer Science 2018-06-20 Reuben Feinman, Brenden M. Lake

Convolutional Bipartite Attractor Networks

In human perception and cognition, a fundamental operation that brains perform is interpretation: constructing coherent neural states from noisy, incomplete, and intrinsically ambiguous evidence. The problem of interpretation is well…

Machine Learning · Computer Science 2019-09-30 Michael Iuzzolino, Yoram Singer, Michael C. Mozer

From Kepler to Newton: Inductive Biases Guide Learned World Models in Transformers

Can general-purpose AI architectures go beyond prediction to discover the physical laws governing the universe? True intelligence relies on "world models" -- causal abstractions that allow an agent to not only predict future states but…

Machine Learning · Computer Science 2026-02-09 Ziming Liu, Sophia Sanborn, Surya Ganguli, Andreas Tolias

Abstractors and relational cross-attention: An inductive bias for explicit relational reasoning in Transformers

An extension of Transformers is proposed that enables explicit relational reasoning through a novel module called the Abstractor. At the core of the Abstractor is a variant of attention called relational cross-attention. The approach is…

Machine Learning · Statistics 2024-04-16 Awni Altabaa, Taylor Webb, Jonathan Cohen, John Lafferty

Relational inductive bias for physical construction in humans and machines

While current deep learning systems excel at tasks such as object classification, language processing, and gameplay, few can construct or modify a complex system such as a tower of blocks. We hypothesize that what these systems lack is a…

Machine Learning · Computer Science 2018-06-05 Jessica B. Hamrick, Kelsey R. Allen, Victor Bapst, Tina Zhu, Kevin R. McKee, Joshua B. Tenenbaum, Peter W. Battaglia

Regulating the information in spikes: a useful bias

The bias/variance tradeoff is fundamental to learning: increasing a model's complexity can improve its fit on training data, but potentially worsens performance on future samples. Remarkably, however, the human brain effortlessly handles a…

Neurons and Cognition · Quantitative Biology 2012-10-18 David Balduzzi

An Introduction to Transformers

The transformer is a neural network component that can be used to learn useful representations of sequences or sets of data-points. The transformer has driven recent advances in natural language processing, computer vision, and…

Machine Learning · Computer Science 2026-01-21 Richard E. Turner