
Related papers: Beyond Transformers for Function Learning

200 papers

Understanding how agents learn to generalize -- and, in particular, to extrapolate -- in high-dimensional, naturalistic environments remains a challenge for both machine learning and the study of biological agents. One approach to this has…

Machine Learning · Computer Science 2021-06-15 Simon N. Segert, Jonathan D. Cohen

A fascinating hypothesis is that human and animal intelligence could be explained by a few principles (rather than an encyclopedic list of heuristics). If that hypothesis was correct, we could more easily both understand our own…

Machine Learning · Computer Science 2022-08-02 Anirudh Goyal, Yoshua Bengio

Despite incredible progress, many neural architectures fail to properly generalize beyond their training distribution. As such, learning to reason in a correct and generalizable way is one of the current fundamental challenges in machine…

Machine Learning · Computer Science 2025-02-10 Niccolò Grillo, Andrea Toccaceli, Joël Mathys, Benjamin Estermann, Stefania Fresca, Roger Wattenhofer

A fundamental feature of human intelligence is that we accumulate and transfer knowledge as a society and across generations. We describe here a network architecture for the human brain that may support this feature and suggest that two key…

Neurons and Cognition · Quantitative Biology 2022-07-19 Eric C. Wong

Artificial intelligence algorithms are capable of fantastic exploits, yet they are still grossly inefficient compared with the brain's ability to learn from few exemplars or solve problems that have not been explicitly defined. What is the…

Neurons and Cognition · Quantitative Biology 2018-10-08 Aurelio Cortese, Benedetto De Martino, Mitsuo Kawato

Strong inductive biases give humans the ability to quickly learn to perform a variety of tasks. Although meta-learning is a method to endow neural networks with useful inductive biases, agents trained by meta-learning may sometimes acquire…

Extrapolation -- the ability to make inferences that go beyond the scope of one's experiences -- is a hallmark of human intelligence. By contrast, the generalization exhibited by contemporary neural network algorithms is largely limited to…

Computer Vision and Pattern Recognition · Computer Science 2023-09-08 Taylor W. Webb, Zachary Dulberg, Steven M. Frankland, Alexander A. Petrov, Randall C. O'Reilly, Jonathan D. Cohen

Borrowing from the transformer models that revolutionized the field of natural language processing, self-supervised feature learning for visual tasks has also seen state-of-the-art success using these extremely deep, isotropic networks.…

Computer Vision and Pattern Recognition · Computer Science 2021-06-01 George Cazenavette, Simon Lucey

Empirical studies have identified a range of learnability biases and limitations of transformers, such as a persistent difficulty in learning to compute simple formal languages such as PARITY, and a bias towards low-degree functions.…

Machine Learning · Computer Science 2024-05-28 Michael Hahn, Mark Rofin

This paper studies interpretable and fair artificial intelligence architectures for understanding English reading. Introduced transformer-based models, integrating advanced attention mechanisms and gradient-based feature attribution. The…

Computation and Language · Computer Science 2026-04-28 Ping Li

In inductive transfer learning, fine-tuning pre-trained convolutional networks substantially outperforms training from scratch. When using fine-tuning, the underlying assumption is that the pre-trained model extracts generic features, which…

Machine Learning · Computer Science 2018-06-07 Xuhong Li, Yves Grandvalet, Franck Davoine

Models need appropriate inductive biases to effectively learn from small amounts of data and generalize systematically outside of the training distribution. While Transformers are highly versatile and powerful, they can still benefit from…

Computation and Language · Computer Science 2024-07-08 Matthias Lindemann, Alexander Koller, Ivan Titov

Machine learning systems, especially with overparameterized deep neural networks, can generalize to novel test instances drawn from the same distribution as the training data. However, they fare poorly when evaluated on out-of-support test…

Machine Learning · Computer Science 2023-04-28 Aviv Netanyahu, Abhishek Gupta, Max Simchowitz, Kaiqing Zhang, Pulkit Agrawal

People use rich prior knowledge about the world in order to efficiently learn new concepts. These priors - also known as "inductive biases" - pertain to the space of internal models considered by a learner, and they help the learner make…

Computation and Language · Computer Science 2018-06-20 Reuben Feinman, Brenden M. Lake

In human perception and cognition, a fundamental operation that brains perform is interpretation: constructing coherent neural states from noisy, incomplete, and intrinsically ambiguous evidence. The problem of interpretation is well…

Machine Learning · Computer Science 2019-09-30 Michael Iuzzolino, Yoram Singer, Michael C. Mozer

Can general-purpose AI architectures go beyond prediction to discover the physical laws governing the universe? True intelligence relies on "world models" -- causal abstractions that allow an agent to not only predict future states but…

Machine Learning · Computer Science 2026-02-09 Ziming Liu, Sophia Sanborn, Surya Ganguli, Andreas Tolias

An extension of Transformers is proposed that enables explicit relational reasoning through a novel module called the Abstractor. At the core of the Abstractor is a variant of attention called relational cross-attention. The approach is…

Machine Learning · Statistics 2024-04-16 Awni Altabaa, Taylor Webb, Jonathan Cohen, John Lafferty

While current deep learning systems excel at tasks such as object classification, language processing, and gameplay, few can construct or modify a complex system such as a tower of blocks. We hypothesize that what these systems lack is a…

The bias/variance tradeoff is fundamental to learning: increasing a model's complexity can improve its fit on training data, but potentially worsens performance on future samples. Remarkably, however, the human brain effortlessly handles a…

Neurons and Cognition · Quantitative Biology 2012-10-18 David Balduzzi

The transformer is a neural network component that can be used to learn useful representations of sequences or sets of data-points. The transformer has driven recent advances in natural language processing, computer vision, and…

Machine Learning · Computer Science 2026-01-21 Richard E. Turner