Related papers: Strong Model Collapse

200 papers

As AI model size grows, neural scaling laws have become a crucial tool to predict the improvements of large models when increasing capacity and the size of original (human or natural) training data. Yet, the widespread use of popular models…

Machine Learning · Computer Science 2024-06-03 Elvis Dohmatob, Yunzhen Feng, Pu Yang, Francois Charton, Julia Kempe

Synthetically-generated data plays an increasingly larger role in training large language models. However, while synthetic data has been found to be useful, studies have also shown that without proper curation it can cause LLM performance…

Machine Learning · Computer Science 2025-12-02 Kareem Amin, Sara Babakniya, Alex Bie, Weiwei Kong, Umar Syed, Sergei Vassilvitskii

Large language models with a huge number of parameters, when trained on near internet-sized number of tokens, have been empirically shown to obey neural scaling laws: specifically, their performance behaves predictably as a power law in…

Machine Learning · Computer Science 2022-11-01 Alexander Maloney, Daniel A. Roberts, James Sully

Recent work has identified simple empirical scaling laws for language models, linking compute budget, dataset size, model size, and autoregressive modeling loss. The validity of these simple power laws across orders of magnitude in model…

Machine Learning · Statistics 2021-09-27 Amélie Chatelain, Amine Djeghri, Daniel Hesslow, Julien Launay, Iacopo Poli

Large Language Models (LLMs) are increasingly trained on data generated by other LLMs, either because generated text and images become part of the pre-training corpus, or because synthesized data is used as a replacement for expensive…

Machine Learning · Computer Science 2024-10-28 Yunzhen Feng, Elvis Dohmatob, Pu Yang, Francois Charton, Julia Kempe

The phenomenon of model collapse, introduced in (Shumailov et al., 2023), refers to the deterioration in performance that occurs when new models are trained on synthetic data generated from previously trained models. This recursive training…

Machine Learning · Computer Science 2024-04-09 Mohamed El Amine Seddik, Suei-Wen Chen, Soufiane Hayou, Pierre Youssef, Merouane Debbah

In the era of proliferation of large language and image generation models, the phenomenon of "model collapse" refers to the situation whereby as a model is trained recursively on data generated from previous generations of itself over time,…

Machine Learning · Computer Science 2024-05-02 Elvis Dohmatob, Yunzhen Feng, Julia Kempe

Large Language Models (LLMs) that undergo recursive training on synthetically generated data are susceptible to model collapse, a phenomenon marked by the generation of meaningless output. Existing research has examined this issue from…

Computation and Language · Computer Science 2026-03-17 Konstantinos F. Xylogiannopoulos, Petros Xanthopoulos, Panagiotis Karampelas, Georgios A. Bakamitsos

High-quality data is essential for training large generative models, yet the vast reservoir of real data available online has become nearly depleted. Consequently, models increasingly generate their own data for further training, forming…

Machine Learning · Computer Science 2025-02-27 Shi Fu, Yingjie Wang, Yuzhu Chen, Xinmei Tian, Dacheng Tao

Empirically, large-scale deep learning models often satisfy a neural scaling law: the test error of the trained model improves polynomially as the model size and data size grow. However, conventional wisdom suggests the test error consists…

Machine Learning · Computer Science 2025-06-11 Licong Lin, Jingfeng Wu, Sham M. Kakade, Peter L. Bartlett, Jason D. Lee

Large language models (LLMs) achieve strong performance across diverse tasks, largely driven by high-quality web data used in pre-training. However, recent studies indicate this data source is rapidly depleting. Synthetic data emerges as a…

Traditional scaling laws in natural language processing suggest that increasing model size and training data enhances performance. However, recent studies reveal deviations, particularly in large language models, where performance…

Machine Learning · Computer Science 2025-07-16 Zhengyu Chen, Siqi Wang, Teng Xiao, Yudong Wang, Shiqi Chen, Xunliang Cai, Junxian He, Jingang Wang

We study empirical scaling laws for language model performance on the cross-entropy loss. The loss scales as a power-law with model size, dataset size, and the amount of compute used for training, with some trends spanning more than seven…

Scaling laws guide the development of large language models (LLMs) by offering estimates for the optimal balance of model size, tokens, and compute. More recently, loss-to-loss scaling laws that relate losses across pretraining datasets and…

Machine Learning · Computer Science 2026-05-21 Prasanna Mayilvahanan, Thaddäus Wiedemer, Sayak Mallick, Matthias Bethge, Wieland Brendel

Recent large language models have been trained on vast datasets, but also often on repeated data, either intentionally for the purpose of upweighting higher quality data, or unintentionally because data deduplication is not perfect and the…

Large language models (LLMs) have made remarkable advances in recent years, with scaling laws playing a critical role in this rapid progress. In this paper, we empirically investigate how a critical hyper-parameter, i.e., the global batch…

Computation and Language · Computer Science 2024-12-03 Xian Shuai, Yiding Wang, Yimeng Wu, Xin Jiang, Xiaozhe Ren

Model collapse, a phenomenon characterized by performance degradation due to iterative training on synthetic data, has been widely studied. However, its implications for bias amplification, the progressive intensification of pre-existing…

Artificial Intelligence · Computer Science 2025-05-23 Ze Wang, Zekun Wu, Jeremy Zhang, Xin Guan, Navya Jain, Skylar Lu, Saloni Gupta, Adriano Koshiyama

In recent years, model collapse has become a critical issue in language model training, making it essential to understand the underlying mechanisms driving this phenomenon. In this paper, we investigate recursive parametric model training…

Machine Learning · Statistics 2025-05-23 Shirong Xu, Hengzhi He, Guang Cheng

Scaling laws are useful guides for derisking expensive training runs, as they predict performance of large models using cheaper, small-scale experiments. However, there remain gaps between current scaling studies and how language models are…

As Large Language Models (LLMs) become increasingly prevalent, their generated outputs are proliferating across the web, risking a future where machine-generated content dilutes human-authored text. Since online data is the primary resource…

Computation and Language · Computer Science 2025-09-23 George Drayson, Emine Yilmaz, Vasileios Lampos