Related papers: Memorization and Generalization in Neural Code Int…

200 papers

We examine the role of memorization in deep learning, drawing connections to capacity, generalization, and adversarial robustness. While deep networks are capable of memorizing noise data, our results suggest that they tend to prioritize…

Deep Learning (DL) powered by Deep Neural Networks (DNNs) has revolutionized various domains, yet understanding the intricacies of DNN decision-making and learning processes remains a significant challenge. Recent investigations have…

Machine Learning · Computer Science 2024-06-07 Jiaheng Wei, Yanjun Zhang, Leo Yu Zhang, Ming Ding, Chao Chen, Kok-Leong Ong, Jun Zhang, Yang Xiang

The neural network memorization problem is to study the expressive power of neural networks to interpolate a finite dataset. Although memorization is widely believed to have a close relationship with the strong generalizability of deep…

Machine Learning · Computer Science 2024-11-04 Lijia Yu, Xiao-Shan Gao, Lijun Zhang, Yibo Miao

Very large deep learning models trained using gradient descent are remarkably resistant to memorization given their huge capacity, but are at the same time capable of fitting large datasets of pure noise. Here methods are introduced by…

Machine Learning · Computer Science 2022-12-22 Benjamin L. Badger

The availability of large-scale datasets, advanced architectures, and powerful computational resources has led to effective code models that automate diverse software engineering activities. The datasets usually consist of billions of…

Software Engineering · Computer Science 2024-01-15 Zhou Yang, Zhipeng Zhao, Chenyu Wang, Jieke Shi, Dongsun Kim, DongGyun Han, David Lo

Overfitting negatively impacts the generalization ability of deep neural networks (DNNs) in both natural and adversarial training. Existing methods struggle to consistently address different types of overfitting, typically designing…

Machine Learning · Computer Science 2024-09-17 Runqi Lin, Chaojian Yu, Bo Han, Tongliang Liu

Overparameterized deep networks that generalize well have been key to the dramatic success of deep learning in recent years. The reasons for their remarkable ability to generalize are not well understood yet. When class labels in the…

Machine Learning · Computer Science 2026-02-03 Simran Ketha, Venkatakrishnan Ramaswamy

Over-parameterized deep neural networks (DNNs) with sufficient capacity to memorize random noise can achieve excellent generalization performance, challenging the bias-variance trade-off in classical learning theory. Recent studies claimed…

Machine Learning · Computer Science 2022-11-15 Xiao Zhang, Haoyi Xiong, Dongrui Wu

Large language models readily memorize arbitrary training instances, such as label noise, yet they perform strikingly well on reasoning tasks. In this work, we investigate how language models memorize label noise, and why such memorization…

Computation and Language · Computer Science 2025-10-03 Yupei Du, Philipp Mondorf, Silvia Casola, Yuekun Yao, Robert Litschko, Barbara Plank

Recent efforts at explaining the interplay of memorization and generalization in deep overparametrized networks have posited that neural networks $\textit{memorize}$ "hard" examples in the final few layers of the model. Memorization refers…

Machine Learning · Computer Science 2023-07-20 Pratyush Maini, Michael C. Mozer, Hanie Sedghi, Zachary C. Lipton, J. Zico Kolter, Chiyuan Zhang

Large language models (LLMs) have recently demonstrated exceptional code generation capabilities. However, there is a growing debate whether LLMs are mostly doing memorization (i.e., replicating or reusing large parts of their training…

Artificial Intelligence · Computer Science 2025-10-01 Lizhe Zhang, Wentao Chen, Li Zhong, Letian Peng, Zilong Wang, Jingbo Shang

Highly over-parameterized models can simultaneously memorize noisy labels and generalize well, yet how these behaviors coexist remains poorly understood. In this work, we investigate the underlying mechanisms of this coexistence using…

Machine Learning · Computer Science 2026-05-19 Linyu Liu, Pinyan Lu

Designing robust loss functions is popular in learning with noisy labels while existing designs did not explicitly consider the overfitting property of deep neural networks (DNNs). As a result, applying these losses may still suffer from…

Machine Learning · Computer Science 2022-05-27 Hao Cheng, Zhaowei Zhu, Xing Sun, Yang Liu

Deep neural networks (DNNs) are typically optimized using various forms of mini-batch gradient descent algorithm. A major motivation for mini-batch gradient descent is that with a suitably chosen batch size, available computing resources…

Machine Learning · Computer Science 2022-10-25 Oyebade K. Oyedotun, Konstantinos Papadopoulos, Djamila Aouada

Large language models have gained significant popularity because of their ability to generate human-like text and potential applications in various fields, such as Software Engineering. Large language models for code are commonly trained on…

Cryptography and Security · Computer Science 2024-01-17 Ali Al-Kaswan, Maliheh Izadi, Arie van Deursen

Deep Neural Networks (DNNs) are a revolutionary force in the ongoing information revolution, and yet their intrinsic properties remain a mystery. In particular, it is widely known that DNNs are highly sensitive to noise, whether adversarial…

Machine Learning · Computer Science 2020-05-01 Netanel Raviv, Siddharth Jain, Pulakesh Upadhyaya, Jehoshua Bruck, Anxiao Jiang

Deep neural networks (DNNs) fail to learn effectively under label noise and have been shown to memorize random labels which affect their generalization performance. We consider learning in isolation, using one-hot encoded labels as the sole…

Computer Vision and Pattern Recognition · Computer Science 2020-09-18 Fahad Sarfraz, Elahe Arani, Bahram Zonooz

Despite the empirical advances of deep learning across a variety of learning tasks, our theoretical understanding of its success is still very restricted. One of the key challenges is the overparametrized nature of modern models, enabling…

Machine Learning · Computer Science 2023-02-24 Sotiris Anagnostidis, Gregor Bachmann, Lorenzo Noci, Thomas Hofmann

In the presence of noisy or incorrect labels, neural networks have the undesirable tendency to memorize information about the noise. Standard regularization techniques such as dropout, weight decay or data augmentation sometimes help, but…

Machine Learning · Computer Science 2020-11-23 Hrayr Harutyunyan, Kyle Reing, Greg Ver Steeg, Aram Galstyan

Quantization lowers memory usage, computational requirements, and latency by utilizing fewer bits to represent model weights and activations. In this work, we investigate the generalization properties of quantized neural networks, a…
