Related papers: Memorization and Generalization in Neural Code Int…

A Closer Look at Memorization in Deep Networks

We examine the role of memorization in deep learning, drawing connections to capacity, generalization, and adversarial robustness. While deep networks are capable of memorizing noise data, our results suggest that they tend to prioritize…

Machine Learning · Statistics 2017-07-04 Devansh Arpit, Stanisław Jastrzębski, Nicolas Ballas, David Krueger, Emmanuel Bengio, Maxinder S. Kanwal, Tegan Maharaj, Asja Fischer, Aaron Courville, Yoshua Bengio, Simon Lacoste-Julien

Memorization in deep learning: A survey

Deep Learning (DL) powered by Deep Neural Networks (DNNs) has revolutionized various domains, yet understanding the intricacies of DNN decision-making and learning processes remains a significant challenge. Recent investigations have…

Machine Learning · Computer Science 2024-06-07 Jiaheng Wei, Yanjun Zhang, Leo Yu Zhang, Ming Ding, Chao Chen, Kok-Leong Ong, Jun Zhang, Yang Xiang

Generalizability of Memorization Neural Networks

The neural network memorization problem is to study the expressive power of neural networks to interpolate a finite dataset. Although memorization is widely believed to have a close relationship with the strong generalizability of deep…

Machine Learning · Computer Science 2024-11-04 Lijia Yu, Xiao-Shan Gao, Lijun Zhang, Yibo Miao

Why Deep Learning Generalizes

Very large deep learning models trained using gradient descent are remarkably resistant to memorization given their huge capacity, but are at the same time capable of fitting large datasets of pure noise. Here methods are introduced by…

Machine Learning · Computer Science 2022-12-22 Benjamin L. Badger

Unveiling Memorization in Code Models

The availability of large-scale datasets, advanced architectures, and powerful computational resources have led to effective code models that automate diverse software engineering activities. The datasets usually consist of billions of…

Software Engineering · Computer Science 2024-01-15 Zhou Yang, Zhipeng Zhao, Chenyu Wang, Jieke Shi, Dongsun Kim, DongGyun Han, David Lo

On the Over-Memorization During Natural, Robust and Catastrophic Overfitting

Overfitting negatively impacts the generalization ability of deep neural networks (DNNs) in both natural and adversarial training. Existing methods struggle to consistently address different types of overfitting, typically designing…

Machine Learning · Computer Science 2024-09-17 Runqi Lin, Chaojian Yu, Bo Han, Tongliang Liu

Decoding Generalization from Memorization in Deep Neural Networks

Overparameterized deep networks that generalize well have been key to the dramatic success of deep learning in recent years. The reasons for their remarkable ability to generalize are not well understood yet. When class labels in the…

Machine Learning · Computer Science 2026-02-03 Simran Ketha, Venkatakrishnan Ramaswamy

Rethink the Connections among Generalization, Memorization and the Spectral Bias of DNNs

Over-parameterized deep neural networks (DNNs) with sufficient capacity to memorize random noise can achieve excellent generalization performance, challenging the bias-variance trade-off in classical learning theory. Recent studies claimed…

Machine Learning · Computer Science 2022-11-15 Xiao Zhang, Haoyi Xiong, Dongrui Wu

Reason to Rote: Rethinking Memorization in Reasoning

Large language models readily memorize arbitrary training instances, such as label noise, yet they perform strikingly well on reasoning tasks. In this work, we investigate how language models memorize label noise, and why such memorization…

Computation and Language · Computer Science 2025-10-03 Yupei Du, Philipp Mondorf, Silvia Casola, Yuekun Yao, Robert Litschko, Barbara Plank

Can Neural Network Memorization Be Localized?

Recent efforts at explaining the interplay of memorization and generalization in deep overparametrized networks have posited that neural networks memorize "hard" examples in the final few layers of the model. Memorization refers…

Machine Learning · Computer Science 2023-07-20 Pratyush Maini, Michael C. Mozer, Hanie Sedghi, Zachary C. Lipton, J. Zico Kolter, Chiyuan Zhang

Memorize or Generalize? Evaluating LLM Code Generation with Code Rewriting

Large language models (LLMs) have recently demonstrated exceptional code generation capabilities. However, there is a growing debate whether LLMs are mostly doing memorization (i.e., replicating or reusing large parts of their training…

Artificial Intelligence · Computer Science 2025-10-01 Lizhe Zhang, Wentao Chen, Li Zhong, Letian Peng, Zilong Wang, Jingbo Shang

Unveiling Memorization-Generalization Coexistence: A Case Study on Arithmetic Tasks with Label Noise

Highly over-parameterized models can simultaneously memorize noisy labels and generalize well, yet how these behaviors coexist remains poorly understood. In this work, we investigate the underlying mechanisms of this coexistence using…

Machine Learning · Computer Science 2026-05-19 Linyu Liu, Pinyan Lu

Mitigating Memorization of Noisy Labels via Regularization between Representations

Designing robust loss functions is popular in learning with noisy labels while existing designs did not explicitly consider the overfitting property of deep neural networks (DNNs). As a result, applying these losses may still suffer from…

Machine Learning · Computer Science 2022-05-27 Hao Cheng, Zhaowei Zhu, Xing Sun, Yang Liu

A New Perspective for Understanding Generalization Gap of Deep Neural Networks Trained with Large Batch Sizes

Deep neural networks (DNNs) are typically optimized using various forms of mini-batch gradient descent algorithm. A major motivation for mini-batch gradient descent is that with a suitably chosen batch size, available computing resources…

Machine Learning · Computer Science 2022-10-25 Oyebade K. Oyedotun, Konstantinos Papadopoulos, Djamila Aouada

Traces of Memorisation in Large Language Models for Code

Large language models have gained significant popularity because of their ability to generate human-like text and potential applications in various fields, such as Software Engineering. Large language models for code are commonly trained on…

Cryptography and Security · Computer Science 2024-01-17 Ali Al-Kaswan, Maliheh Izadi, Arie van Deursen

CodNN -- Robust Neural Networks From Coded Classification

Deep Neural Networks (DNNs) are a revolutionary force in the ongoing information revolution, and yet their intrinsic properties remain a mystery. In particular, it is widely known that DNNs are highly sensitive to noise, whether adversarial…

Machine Learning · Computer Science 2020-05-01 Netanel Raviv, Siddharth Jain, Pulakesh Upadhyaya, Jehoshua Bruck, Anxiao Jiang

Noisy Concurrent Training for Efficient Learning under Label Noise

Deep neural networks (DNNs) fail to learn effectively under label noise and have been shown to memorize random labels which affect their generalization performance. We consider learning in isolation, using one-hot encoded labels as the sole…

Computer Vision and Pattern Recognition · Computer Science 2020-09-18 Fahad Sarfraz, Elahe Arani, Bahram Zonooz

The Curious Case of Benign Memorization

Despite the empirical advances of deep learning across a variety of learning tasks, our theoretical understanding of its success is still very restricted. One of the key challenges is the overparametrized nature of modern models, enabling…

Machine Learning · Computer Science 2023-02-24 Sotiris Anagnostidis, Gregor Bachmann, Lorenzo Noci, Thomas Hofmann

Improving Generalization by Controlling Label-Noise Information in Neural Network Weights

In the presence of noisy or incorrect labels, neural networks have the undesirable tendency to memorize information about the noise. Standard regularization techniques such as dropout, weight decay or data augmentation sometimes help, but…

Machine Learning · Computer Science 2020-11-23 Hrayr Harutyunyan, Kyle Reing, Greg Ver Steeg, Aram Galstyan

QGen: On the Ability to Generalize in Quantization Aware Training

Quantization lowers memory usage, computational requirements, and latency by utilizing fewer bits to represent model weights and activations. In this work, we investigate the generalization properties of quantized neural networks, a…

Machine Learning · Computer Science 2024-04-22 MohammadHossein AskariHemmat, Ahmadreza Jeddi, Reyhane Askari Hemmat, Ivan Lazarevich, Alexander Hoffman, Sudhakar Sah, Ehsan Saboori, Yvon Savaria, Jean-Pierre David