Related papers: When Representations Align: Universality in Repres…
It is widely believed that learning good representations is one of the main reasons for the success of deep neural networks. Although highly intuitive, there is a lack of theory and systematic approach quantitatively characterizing what…
While modern deep networks have demonstrated remarkable versatility, their training dynamics remain poorly understood--often driven more by empirical tweaks than architectural insight. This paper investigates how internal structural choices…
A key factor in the success of deep neural networks is the ability to scale models to improve performance by varying the architecture depth and width. This simple property of neural network design has resulted in highly effective…
Deep neural networks use multiple layers of functions to map an object represented by an input vector progressively to different representations, and with sufficient training, eventually to a single score for each class that is the output…
Representations of the world environment play a crucial role in artificial intelligence. It is often inefficient to conduct reasoning and inference directly in the space of raw sensory representations, such as pixel values of images.…
Biological and artificial neural networks develop internal representations that enable them to perform complex tasks. In artificial networks, the effectiveness of these models relies on their ability to build task specific representation, a…
Our theoretical understanding of deep learning has not kept pace with its empirical success. While network architecture is known to be critical, we do not yet understand its effect on learned representations and network behavior, or how…
The internal representations learned by deep networks are often sensitive to architecture-specific choices, raising questions about the stability, alignment, and transferability of learned structure across models. In this paper, we…
In comparison to classical shallow representation learning techniques, deep neural networks have achieved superior performance in nearly every application benchmark. But despite their clear empirical advantages, it is still not well…
In this paper, we elucidate how representations in deep neural networks (DNNs) evolve during training. Our focus is on overparameterized learning settings where the training continues much after the trained DNN starts to perfectly fit its…
Deep learning algorithms demonstrate a surprising ability to learn high-dimensional tasks from limited examples. This is commonly attributed to the depth of neural networks, enabling them to build a hierarchy of abstract, low-dimensional…
Recent success in training deep neural networks have prompted active investigation into the features learned on their intermediate layers. Such research is difficult because it requires making sense of non-linear computations performed by…
To preserve previously learned representations, continual learning systems must strike a balance between plasticity, the ability to acquire new knowledge, and stability. This stability-plasticity dilemma affects how representations can be…
The success of machine learning algorithms generally depends on data representation, and we hypothesize that this is because different representations can entangle and hide more or less the different explanatory factors of variation behind…
Despite the remarkable success of large large-scale neural networks, we still lack unified notation for thinking about and describing their representational spaces. We lack methods to reliably describe how their representations are…
Machine Learning algorithms have had a profound impact on the field of computer science over the past few decades. These algorithms performance is greatly influenced by the representations that are derived from the data in the learning…
Integrating task-relevant information into neural representations is a fundamental ability of both biological and artificial intelligence systems. Recent theories have categorized learning into two regimes: the rich regime, where neural…
Neural models learn representations of high-dimensional data on low-dimensional manifolds. Multiple factors, including stochasticities in the training process, model architectures, and additional inductive biases, may induce different…
Representation learning, and interpreting learned representations, are key areas of focus in machine learning and neuroscience. Both fields generally use representations as a means to understand or improve a system's computations. In this…
Neural networks leverage robust internal representations in order to generalise. Learning them is difficult, and often requires a large training set that covers the data distribution densely. We study a common setting where our task is not…