Related papers: Learning Randomized Algorithms with Transformers

Deep Randomized Neural Networks

Randomized Neural Networks explore the behavior of neural systems where the majority of connections are fixed, either in a stochastic or a deterministic fashion. Typical examples of such systems consist of multi-layered neural network…

Machine Learning · Computer Science 2021-02-03 Claudio Gallicchio, Simone Scardapane

Algorithmic Capabilities of Random Transformers

Trained transformer models have been found to implement interpretable procedures for tasks like arithmetic and associative recall, but little is understood about how the circuits that implement these procedures originate during training. To…

Machine Learning · Computer Science 2024-10-08 Ziqian Zhong, Jacob Andreas

Network Randomization: A Simple Technique for Generalization in Deep Reinforcement Learning

Deep reinforcement learning (RL) agents often fail to generalize to unseen environments (yet semantically similar to trained agents), particularly when they are trained on high-dimensional state spaces, such as images. In this paper, we…

Machine Learning · Computer Science 2020-02-18 Kimin Lee, Kibok Lee, Jinwoo Shin, Honglak Lee

Tasks, stability, architecture, and compute: Training more effective learned optimizers, and using them to train themselves

Much as replacing hand-designed features with learned functions has revolutionized how we solve perceptual tasks, we believe learned algorithms will transform how we train models. In this work we focus on general-purpose learned optimizers…

Machine Learning · Computer Science 2020-09-24 Luke Metz, Niru Maheswaranathan, C. Daniel Freeman, Ben Poole, Jascha Sohl-Dickstein

Learning Representations Robust to Group Shifts and Adversarial Examples

Despite the high performance achieved by deep neural networks on various tasks, extensive studies have demonstrated that small tweaks in the input could fail the model predictions. This issue of deep neural networks has led to a number of…

Machine Learning · Computer Science 2022-02-22 Ming-Chang Chiu, Xuezhe Ma

Learning Gradient Descent: Better Generalization and Longer Horizons

Training deep neural networks is a highly nontrivial task, involving carefully selecting appropriate training algorithms, scheduling step sizes and tuning other hyperparameters. Trying different combinations can be quite labor-intensive and…

Machine Learning · Computer Science 2017-06-13 Kaifeng Lv, Shunhua Jiang, Jian Li

Active Importance Sampling for Variational Objectives Dominated by Rare Events: Consequences for Optimization and Generalization

Deep neural networks, when optimized with sufficient data, provide accurate representations of high-dimensional functions; in contrast, function approximation techniques that have predominated in scientific computing do not scale well with…

Data Analysis, Statistics and Probability · Physics 2021-03-15 Grant M. Rotskoff, Andrew R. Mitchell, Eric Vanden-Eijnden

On the Role of Randomization in Adversarially Robust Classification

Deep neural networks are known to be vulnerable to small adversarial perturbations in test data. To defend against adversarial attacks, probabilistic classifiers have been proposed as an alternative to deterministic ones. However,…

Machine Learning · Computer Science 2023-11-29 Lucas Gnecco-Heredia, Yann Chevaleyre, Benjamin Negrevergne, Laurent Meunier, Muni Sreenivas Pydi

Neural Algorithmic Reasoning

Algorithms have been fundamental to recent global technological advances and, in particular, they have been the cornerstone of technical advances in one field rapidly being applied to another. We argue that algorithms possess fundamentally…

Machine Learning · Computer Science 2021-08-09 Petar Veličković, Charles Blundell

Improving Performance in Reinforcement Learning by Breaking Generalization in Neural Networks

Reinforcement learning systems require good representations to work well. For decades practical success in reinforcement learning was limited to small domains. Deep reinforcement learning systems, on the other hand, are scalable, not…

Machine Learning · Computer Science 2020-03-18 Sina Ghiassian, Banafsheh Rafiee, Yat Long Lo, Adam White

Optimization for Supervised Machine Learning: Randomized Algorithms for Data and Parameters

Many key problems in machine learning and data science are routinely modeled as optimization problems and solved via optimization algorithms. With the increase of the volume of data and the size and complexity of the statistical models used…

Optimization and Control · Mathematics 2020-08-28 Filip Hanzely

Mitigating Adversarial Effects Through Randomization

Convolutional neural networks have demonstrated high accuracy on various tasks in recent years. However, they are extremely vulnerable to adversarial examples. For example, imperceptible perturbations added to clean images can cause…

Computer Vision and Pattern Recognition · Computer Science 2018-03-02 Cihang Xie, Jianyu Wang, Zhishuai Zhang, Zhou Ren, Alan Yuille

Learning with Differentiable Algorithms

Classic algorithms and machine learning systems like neural networks are both abundant in everyday life. While classic computer science algorithms are suitable for precise execution of exactly defined tasks such as finding the shortest path…

Machine Learning · Computer Science 2022-09-02 Felix Petersen

Deep transformation models: Tackling complex regression problems with neural network based transformation models

We present a deep transformation model for probabilistic regression. Deep learning is known for outstandingly accurate predictions on complex data but in regression tasks, it is predominantly used to just predict a single number. This…

Machine Learning · Statistics 2020-04-02 Beate Sick, Torsten Hothorn, Oliver Dürr

Adversarial Robustness as a Prior for Learned Representations

An important goal in deep learning is to learn versatile, high-level feature representations of input data. However, standard networks' representations seem to possess shortcomings that, as we illustrate, prevent them from fully realizing…

Machine Learning · Statistics 2019-09-30 Logan Engstrom, Andrew Ilyas, Shibani Santurkar, Dimitris Tsipras, Brandon Tran, Aleksander Madry

Reverse engineering learned optimizers reveals known and novel mechanisms

Learned optimizers are algorithms that can themselves be trained to solve optimization problems. In contrast to baseline optimizers (such as momentum or Adam) that use simple update rules derived from theoretical principles, learned…

Machine Learning · Computer Science 2021-12-09 Niru Maheswaranathan, David Sussillo, Luke Metz, Ruoxi Sun, Jascha Sohl-Dickstein

Understanding and correcting pathologies in the training of learned optimizers

Deep learning has shown that learned functions can dramatically outperform hand-designed functions on perceptual tasks. Analogously, this suggests that learned optimizers may similarly outperform current hand-designed optimizers, especially…

Neural and Evolutionary Computing · Computer Science 2019-06-11 Luke Metz, Niru Maheswaranathan, Jeremy Nixon, C. Daniel Freeman, Jascha Sohl-Dickstein

Generalization in Transfer Learning

Agents trained with deep reinforcement learning algorithms are capable of performing highly complex tasks including locomotion in continuous environments. We investigate transferring the learning acquired in one task to a set of previously…

Machine Learning · Computer Science 2024-03-06 Suzan Ece Ada, Emre Ugur, H. Levent Akin

Narrowing the Focus: Learned Optimizers for Pretrained Models

In modern deep learning, the models are learned by applying gradient updates using an optimizer, which transforms the updates based on various statistics. Optimizers are often hand-designed and tuning their hyperparameters is a big part of…

Machine Learning · Computer Science 2024-10-08 Gus Kristiansen, Mark Sandler, Andrey Zhmoginov, Nolan Miller, Anirudh Goyal, Jihwan Lee, Max Vladymyrov

Improving Adversarial Robustness by Putting More Regularizations on Less Robust Samples

Adversarial training, which is to enhance robustness against adversarial attacks, has received much attention because it is easy to generate human-imperceptible perturbations of data to deceive a given deep neural network. In this paper, we…

Machine Learning · Statistics 2023-06-02 Dongyoon Yang, Insung Kong, Yongdai Kim