Related papers: Explaining Neural Matrix Factorization with Gradie…

Understanding Black-box Predictions via Influence Functions

How can we explain the predictions of a black-box model? In this paper, we use influence functions -- a classic technique from robust statistics -- to trace a model's prediction through the learning algorithm and back to its training data,…

Machine Learning · Statistics 2021-01-01 Pang Wei Koh, Percy Liang

Implicit Regularization in Deep Matrix Factorization

Efforts to understand the generalization mystery in deep learning have led to the belief that gradient-based optimization induces a form of implicit regularization, a bias towards models of low "complexity." We study the implicit…

Machine Learning · Computer Science 2019-10-29 Sanjeev Arora, Nadav Cohen, Wei Hu, Yuping Luo

Estimating Training Data Influence by Tracing Gradient Descent

We introduce a method called TracIn that computes the influence of a training example on a prediction made by the model. The idea is to trace how the loss on the test point changes during the training process whenever the training example…

Machine Learning · Computer Science 2020-11-17 Garima Pruthi, Frederick Liu, Mukund Sundararajan, Satyen Kale

Gradient Descent for Deep Matrix Factorization: Dynamics and Implicit Bias towards Low Rank

In deep learning, it is common to use more network parameters than training points. In such a scenario of over-parameterization, there are usually multiple networks that achieve zero training error so that the training algorithm induces an…

Machine Learning · Computer Science 2023-08-22 Hung-Hsu Chou, Carsten Gieshoff, Johannes Maly, Holger Rauhut

Backpropagation through the Void: Optimizing control variates for black-box gradient estimation

Gradient-based optimization is the foundation of deep learning and reinforcement learning. Even when the mechanism being optimized is unknown or not differentiable, optimization using high-variance or biased gradient estimates is still…

Machine Learning · Computer Science 2018-02-27 Will Grathwohl, Dami Choi, Yuhuai Wu, Geoffrey Roeder, David Duvenaud

The Mirrored Influence Hypothesis: Efficient Data Influence Estimation by Harnessing Forward Passes

Large-scale black-box models have become ubiquitous across numerous applications. Understanding the influence of individual training data sources on predictions made by these models is crucial for improving their trustworthiness. Current…

Machine Learning · Computer Science 2024-06-21 Myeongseob Ko, Feiyang Kang, Weiyan Shi, Ming Jin, Zhou Yu, Ruoxi Jia

Gradient Routing: Masking Gradients to Localize Computation in Neural Networks

Neural networks are trained primarily based on their inputs and outputs, without regard for their internal mechanisms. These neglected mechanisms determine properties that are critical for safety, like (i) transparency; (ii) the absence of…

Machine Learning · Computer Science 2024-12-02 Alex Cloud, Jacob Goldman-Wetzler, Evžen Wybitul, Joseph Miller, Alexander Matt Turner

Neural network gradient-based learning of black-box function interfaces

Deep neural networks work well at approximating complicated functions when provided with data and trained by gradient descent methods. At the same time, there is a vast amount of existing functions that programmatically solve different…

Machine Learning · Computer Science 2019-01-15 Alon Jacovi, Guy Hadash, Einat Kermany, Boaz Carmeli, Ofer Lavi, George Kour, Jonathan Berant

Toward Understanding the Disagreement Problem in Neural Network Feature Attribution

In recent years, neural networks have demonstrated their remarkable ability to discern intricate patterns and relationships from raw data. However, understanding the inner workings of these black box models remains challenging, yet crucial…

Machine Learning · Statistics 2024-04-18 Niklas Koenen, Marvin N. Wright

Estimating Implicit Regularization in Deep Learning

Deep learning systems are known to exhibit implicit regularization (alt. implicit bias), favoring simple solutions instead of merely minimizing the loss function. In some cases, we can analytically derive the implicit regularization --…

Machine Learning · Statistics 2026-05-08 Joseph H. Rudoler, Kevin Tan, Giles Hooker, Konrad P. Kording

Random Feedback Alignment Algorithms to train Neural Networks: Why do they Align?

Feedback alignment algorithms are an alternative to backpropagation to train neural networks, whereby some of the partial derivatives that are required to compute the gradient are replaced by random terms. This essentially transforms the…

Machine Learning · Computer Science 2023-06-06 Dominique Chu, Florian Bacho

Neural Variational Inference and Learning in Undirected Graphical Models

Many problems in machine learning are naturally expressed in the language of undirected graphical models. Here, we propose black-box learning and inference algorithms for undirected models that optimize a variational approximation to the…

Machine Learning · Computer Science 2017-11-20 Volodymyr Kuleshov, Stefano Ermon

Deep Grey-Box Modeling With Adaptive Data-Driven Models Toward Trustworthy Estimation of Theory-Driven Models

The combination of deep neural nets and theory-driven models, which we call deep grey-box modeling, can be inherently interpretable to some extent thanks to the theory backbone. Deep grey-box models are usually learned with a regularized…

Machine Learning · Computer Science 2022-10-25 Naoya Takeishi, Alexandros Kalousis

Efficient Estimation of Influence of a Training Instance

Understanding the influence of a training instance on a neural network model leads to improving interpretability. However, it is difficult and inefficient to evaluate the influence, which shows how a model's prediction would be changed if a…

Machine Learning · Computer Science 2021-11-22 Sosuke Kobayashi, Sho Yokoi, Jun Suzuki, Kentaro Inui

Toward Efficient Influence Function: Dropout as a Compression Tool

Assessing the impact of the training data on machine learning models is crucial for understanding the behavior of the model, enhancing the transparency, and selecting training data. Influence function provides a theoretical framework for…

Machine Learning · Computer Science 2026-04-21 Yuchen Zhang, Mohammad Mohammadi Amiri

Why gradient clipping accelerates training: A theoretical justification for adaptivity

We provide a theoretical explanation for the effectiveness of gradient clipping in training deep neural networks. The key ingredient is a new smoothness condition derived from practical neural network training examples. We observe that…

Optimization and Control · Mathematics 2020-02-12 Jingzhao Zhang, Tianxing He, Suvrit Sra, Ali Jadbabaie

Some Insights into the Geometry and Training of Neural Networks

Neural networks have been successfully used for classification tasks in a rapidly growing number of practical applications. Despite their popularity and widespread use, there are still many aspects of training and classification that are…

Machine Learning · Computer Science 2016-05-03 Ewout van den Berg

Matrix Factorization via Deep Learning

Matrix completion is one of the key problems in signal processing and machine learning. In recent years, deep-learning-based models have achieved state-of-the-art results in matrix completion. Nevertheless, they suffer from two drawbacks:…

Machine Learning · Computer Science 2018-12-05 Duc Minh Nguyen, Evaggelia Tsiligianni, Nikos Deligiannis

Gradient Estimation Using Stochastic Computation Graphs

In a variety of problems originating in supervised, unsupervised, and reinforcement learning, the loss function is defined by an expectation over a collection of random variables, which might be part of a probabilistic model or the external…

Machine Learning · Computer Science 2016-01-06 John Schulman, Nicolas Heess, Theophane Weber, Pieter Abbeel

A Random Matrix Theory Approach to Damping in Deep Learning

We conjecture that the inherent difference in generalisation between adaptive and non-adaptive gradient methods in deep learning stems from the increased estimation noise in the flattest directions of the true loss surface. We demonstrate…

Machine Learning · Statistics 2022-03-17 Diego Granziol, Nicholas Baskerville