Related papers: Flag Varieties: A Geometric Framework for Deep Net…

Understanding symmetries in deep networks

Recent works have highlighted scale invariance or symmetry present in the weight space of a typical deep network and the adverse effect it has on the Euclidean gradient based stochastic gradient descent optimization. In this work, we show…

Machine Learning · Computer Science 2015-11-04 Vijay Badrinarayanan, Bamdev Mishra, Roberto Cipolla

Wide Neural Networks Trained with Weight Decay Provably Exhibit Neural Collapse

Deep neural networks (DNNs) at convergence consistently represent the training data in the last layer via a highly symmetric geometric structure referred to as neural collapse. This empirical evidence has spurred a line of theoretical…

Machine Learning · Computer Science 2024-10-08 Arthur Jacot, Peter Súkeník, Zihan Wang, Marco Mondelli

Equivariant Deep Weight Space Alignment

Permutation symmetries of deep networks make basic operations like model merging and similarity estimation challenging. In many cases, aligning the weights of the networks, i.e., finding optimal permutations between their weights, is…

Machine Learning · Computer Science 2024-11-12 Aviv Navon, Aviv Shamsian, Ethan Fetaya, Gal Chechik, Nadav Dym, Haggai Maron

The flag manifold as a tool for analyzing and comparing data sets

The shape and orientation of data clouds reflect variability in observations that can confound pattern recognition systems. Subspace methods, utilizing Grassmann manifolds, have been a great aid in dealing with such variability. However,…

Computer Vision and Pattern Recognition · Computer Science 2020-06-26 Xiaofeng Ma, Michael Kirby, Chris Peterson

Revealing the Structure of Deep Neural Networks via Convex Duality

We study regularized deep neural networks (DNNs) and introduce a convex analytic framework to characterize the structure of the hidden layers. We show that a set of optimal hidden layer weights for a norm regularized DNN training problem…

Machine Learning · Computer Science 2021-06-14 Tolga Ergen, Mert Pilanci

Geometric and Dynamic Scaling in Deep Transformers

Despite their empirical success, pushing Transformer architectures to extreme depth often leads to a paradoxical failure: representations become increasingly redundant, lose rank, and ultimately collapse. Existing explanations largely…

Machine Learning · Computer Science 2026-01-16 Haoran Su, Chenyu You

Deep Neural Regression Collapse

Neural Collapse is a phenomenon that helps identify sparse and low rank structures in deep classifiers. Recent work has extended the definition of neural collapse to regression problems, albeit only measuring the phenomenon at the last…

Machine Learning · Computer Science 2026-03-26 Akshay Rangamani, Altay Unal

Graph Alignment via Dual-Pass Spectral Encoding and Latent Space Communication

Graph alignment, the problem of identifying corresponding nodes across multiple graphs, is fundamental to numerous applications. Most existing unsupervised methods embed node features into latent representations to enable cross-graph…

Machine Learning · Computer Science 2025-09-30 Maysam Behmanesh, Erkan Turan, Maks Ovsjanikov

Unified Sparse-Matrix Representations for Diverse Neural Architectures

Deep neural networks employ specialized architectures for vision, sequential and language tasks, yet this proliferation obscures their underlying commonalities. We introduce a unified matrix-order framework that casts convolutional,…

Machine Learning · Computer Science 2025-07-24 Yuzhou Zhu

Why Geometric Continuity Emerges in Deep Neural Networks: Residual Connections and Rotational Symmetry Breaking

Weight matrices in deep networks exhibit geometric continuity -- principal singular vectors of adjacent layers point in similar directions. While this property has been widely observed, its origin remains unexplained. Through experiments on…

Machine Learning · Computer Science 2026-05-07 Kyungwon Jeong, Won-Gi Paeng, Honggyo Suh

Measuring the Representational Alignment of Neural Systems in Superposition

Comparing the internal representations of neural networks is a central goal in both neuroscience and machine learning. Standard alignment metrics operate on raw neural activations, implicitly assuming that similar representations produce…

Machine Learning · Computer Science 2026-04-02 Sunny Liu, Habon Issa, André Longon, Liv Gorton, Meenakshi Khosla, David Klindt

Geometry-induced Regularization in Deep ReLU Neural Networks

Neural networks with a large number of parameters often do not overfit, owing to implicit regularization that favors 'good' networks. Other related and puzzling phenomena include properties of flat minima, saddle-to-saddle dynamics,…

Artificial Intelligence · Computer Science 2026-01-06 Joachim Bona-Pellissier, François Malgouyres, François Bachoc

Imbalance Trouble: Revisiting Neural-Collapse Geometry

Neural Collapse refers to the remarkable structural properties characterizing the geometry of class embeddings and classifier weights, found by deep nets when trained beyond zero training error. However, this characterization only holds for…

Machine Learning · Computer Science 2022-08-12 Christos Thrampoulidis, Ganesh R. Kini, Vala Vakilian, Tina Behnia

Superposition in Graph Neural Networks

Interpreting graph neural networks (GNNs) is difficult because message passing mixes signals and internal channels rarely align with human concepts. We study superposition, the sharing of directions by multiple features, directly in the…

Machine Learning · Computer Science 2026-01-19 Lukas Pertl, Han Xuanyuan, Pietro Liò

Random Sparse Lifts: Construction, Analysis and Convergence of finite sparse networks

We present a framework to define a large class of neural networks for which, by construction, training by gradient flow provably reaches arbitrarily low loss when the number of parameters grows. Distinct from the fixed-space global…

Optimization and Control · Mathematics 2025-01-13 David A. R. Robin, Kevin Scaman, Marc Lelarge

Learning Topology-Driven Multi-Subspace Fusion for Grassmannian Deep Network

Grassmannian manifold offers a powerful carrier for geometric representation learning by modelling high-dimensional data as low-dimensional subspaces. However, existing approaches predominantly rely on static single-subspace…

Computer Vision and Pattern Recognition · Computer Science 2026-03-18 Xuan Yu, Tianyang Xu

Solving Inverse Problems with Deep Linear Neural Networks: Global Convergence Guarantees for Gradient Descent with Weight Decay

Machine learning methods are commonly used to solve inverse problems, wherein an unknown signal must be estimated from few indirect measurements generated via a known acquisition procedure. In particular, neural networks perform well…

Machine Learning · Computer Science 2025-12-05 Hannah Laus, Suzanna Parkinson, Vasileios Charisopoulos, Felix Krahmer, Rebecca Willett

Nested subspace learning with flags

Many machine learning methods look for low-dimensional representations of the data. The underlying subspace can be estimated by first choosing a dimension $q$ and then optimizing a certain objective function over the space of…

Machine Learning · Statistics 2025-12-19 Tom Szwagier, Xavier Pennec

Symmetry-invariant optimization in deep networks

Recent works have highlighted scale invariance or symmetry that is present in the weight space of a typical deep network and the adverse effect that it has on the Euclidean gradient based stochastic gradient descent optimization. In this…

Machine Learning · Computer Science 2015-11-10 Vijay Badrinarayanan, Bamdev Mishra, Roberto Cipolla

Unsupervised Scale-Invariant Multispectral Shape Matching

Alignment between non-rigid stretchable structures is one of the most challenging tasks in computer vision, as the invariant properties are hard to define, and there is no labeled data for real datasets. We present unsupervised neural…

Computer Vision and Pattern Recognition · Computer Science 2022-08-30 Idan Pazi, Dvir Ginzburg, Dan Raviv