Related papers: On implicit regularization: Morse functions and ap…
Efforts to understand the generalization mystery in deep learning have led to the belief that gradient-based optimization induces a form of implicit regularization, a bias towards models of low "complexity." We study the implicit…
Mathematically characterizing the implicit regularization induced by gradient-based optimization is a longstanding pursuit in the theory of deep learning. A widespread hope is that a characterization based on minimization of norms may…
Recent efforts to unravel the mystery of implicit regularization in deep learning have led to a theoretical focus on matrix factorization -- matrix completion via linear neural network. As a step further towards practical deep learning, we…
In the pursuit of explaining implicit regularization in deep learning, prominent focus was given to matrix and tensor factorizations, which correspond to simplified neural networks. It was shown that these models exhibit an implicit…
We study the implicit regularization effects of deep learning in tensor factorization. While implicit regularization in deep matrix and 'shallow' tensor factorization via linear and certain type of non-linear neural networks promotes…
In an attempt to better understand generalization in deep learning, we study several possible explanations. We show that implicit regularization induced by the optimization method is playing a key role in generalization and success of deep…
We study implicit regularization when optimizing an underdetermined quadratic objective over a matrix $X$ with gradient descent on a factorization of $X$. We conjecture and provide empirical and theoretical evidence that with small enough…
Implicit regularization is an important way to interpret neural networks. Recent theory starts to explain implicit regularization with the model of deep matrix factorization (DMF) and analyze the trajectory of discrete gradient dynamics in…
In this paper we address the rotation synchronization problem, where the objective is to recover absolute rotations starting from pairwise ones, where the unknowns and the measures are represented as nodes and edges of a graph,…
Gradient descent for matrix factorization exhibits an implicit bias toward approximately low-rank solutions. While existing theories often assume the boundedness of iterates, empirically the bias persists even with unbounded sequences. This…
We investigate the problem of factorizing a matrix into several sparse matrices and propose an algorithm for this under randomness and sparsity assumptions. This problem can be viewed as a simplification of the deep learning problem where…
We argue that the optimization plays a crucial role in generalization of deep learning models through implicit regularization. We do this by demonstrating that generalization ability is not controlled by network size but rather by some…
We introduce meta-factorization, a theory that describes matrix decompositions as solutions of linear matrix equations: the projector and the reconstruction equation. Meta-factorization reconstructs known factorizations, reveals their…
Matrix factorization is a well-studied task in machine learning for compactly representing large, noisy data. In our approach, instead of using the traditional concept of matrix rank, we define a new notion of link-rank based on a…
Matrix factorization models have been extensively studied as a valuable test-bed for understanding the implicit biases of overparameterized models. Although both low nuclear norm and low rank regularization have been studied for these…
There is growing body of learning problems for which it is natural to organize the parameters into matrix, so as to appropriately regularize the parameters under some matrix norm (in order to impose some more sophisticated prior knowledge).…
Matrix factorization is a widely used approach for top-N recommendation and collaborative filtering. When implemented on implicit feedback data (such as clicks), a common heuristic is to upweight the observed interactions. This strategy has…
Works on implicit regularization have studied gradient trajectories during the optimization process to explain why deep networks favor certain kinds of solutions over others. In deep linear networks, it has been shown that gradient descent…
Multiple rotation averaging plays a crucial role in computer vision and robotics domains. The conventional optimization-based methods optimize a nonlinear cost function based on certain noise assumptions, while most previous learning-based…
In recent years, understanding the implicit regularization of neural networks (NNs) has become a central task in deep learning theory. However, implicit regularization is itself not completely defined and well understood. In this work, we…