Related papers: Do Data-based Curricula Work?

200 papers

Inspired by human learning, researchers have proposed ordering examples during training based on their difficulty. Both curriculum learning, exposing a network to easier examples early in training, and anti-curriculum learning, showing the…

Machine Learning · Computer Science 2021-02-10 Xiaoxia Wu, Ethan Dyer, Behnam Neyshabur

We employ a characterization of linguistic complexity from psycholinguistic and language acquisition research to develop data-driven curricula to understand the underlying linguistic knowledge that models learn to address NLP tasks. The…

Computation and Language · Computer Science 2023-11-01 Mohamed Elgaar, Hadi Amiri

Curriculum learning (CL) posits that machine learning models -- similar to humans -- may learn more efficiently from data that match their current learning progress. However, CL methods are still poorly understood and, in particular for…

Machine Learning · Computer Science 2023-08-24 Lucas Weber, Jaap Jumelet, Paul Michel, Elia Bruni, Dieuwke Hupkes

In humans and animals, curriculum learning -- presenting data in a curated order -- is critical to rapid learning and effective pedagogy. Yet in machine learning, curricula are not widely used and empirically often yield only moderate…

Machine Learning · Computer Science 2022-12-07 Luca Saglietti, Stefano Sarao Mannelli, Andrew Saxe

Machine translation systems based on deep neural networks are expensive to train. Curriculum learning aims to address this issue by choosing the order in which samples are presented during training to help train better models faster. We…

Computation and Language · Computer Science 2018-11-05 Xuan Zhang, Gaurav Kumar, Huda Khayrallah, Kenton Murray, Jeremy Gwinnup, Marianna J Martindale, Paul McNamee, Kevin Duh, Marine Carpuat

Curriculum learning, a training technique where data is presented to the model in order of example difficulty (e.g., from simpler to more complex documents), has shown limited success for pre-training language models. In this work, we…

Computation and Language · Computer Science 2025-09-29 Loris Schoenegger, Lukas Thoma, Terra Blevins, Benjamin Roth

A curriculum is a planned sequence of learning materials and an effective one can make learning efficient and effective for both humans and machines. Recent studies developed effective data-driven curriculum learning approaches for training…

Machine Learning · Computer Science 2023-07-19 Nidhi Vakil, Hadi Amiri

Curriculum learning provides a systematic approach to training. It refines training progressively, tailors training to task requirements, and improves generalization through exposure to diverse examples. We present a curriculum learning…

Computation and Language · Computer Science 2023-11-23 Nidhi Vakil, Hadi Amiri

Curriculum learning (CL) aims to improve training by presenting data from "easy" to "hard", yet defining and measuring linguistic difficulty remains an open challenge. We investigate whether human-curated simple language can serve as an…

Computation and Language · Computer Science 2025-08-28 Vanessa Toborek, Sebastian Müller, Tim Selbach, Tamás Horváth, Christian Bauckhage

The rapid advancement of Large Language Models (LLMs) has improved text understanding and generation but poses challenges in computational resources. This study proposes a curriculum learning-inspired, data-centric training strategy that…

Computation and Language · Computer Science 2024-05-14 Jisu Kim, Juhwan Lee

Recent advancements in data-to-text generation largely take on the form of neural end-to-end systems. Efforts have been dedicated to improving text generation systems by changing the order of training samples in a process known as…

Computation and Language · Computer Science 2021-02-09 Ernie Chang, Hui-Syuan Yeh, Vera Demberg

Curriculum learning (CL) - ordering training data from easy to hard - has become a popular strategy for improving reasoning in large language models (LLMs). Yet prior work employs disparate difficulty metrics and training setups, leaving…

Machine Learning · Computer Science 2025-10-28 Yaning Jia, Chunhui Zhang, Xingjian Diao, Xiangchi Yuan, Zhongyu Ouyang, Chiyu Ma, Soroush Vosoughi

Training machine learning models in a meaningful order, from the easy samples to the hard ones, using curriculum learning can provide performance improvements over the standard training approach based on random data shuffling, without any…

Machine Learning · Computer Science 2022-04-12 Petru Soviany, Radu Tudor Ionescu, Paolo Rota, Nicu Sebe

Curriculum Learning emphasizes the order of training instances in a computational learning setup. The core hypothesis is that simpler instances should be learned early as building blocks to learn more complex ones. Despite its usefulness,…

Computation and Language · Computer Science 2016-11-21 Volkan Cirik, Eduard Hovy, Louis-Philippe Morency

Training neural networks is traditionally done by providing a sequence of random mini-batches sampled uniformly from the entire training data. In this work, we analyze the effect of curriculum learning, which involves the non-uniform…

Machine Learning · Computer Science 2020-12-03 Guy Hacohen, Daphna Weinshall

Recent advances in deep learning techniques have achieved remarkable performance in several computer vision problems. A notably intuitive technique called Curriculum Learning (CL) has been introduced recently for training deep learning…

Computer Vision and Pattern Recognition · Computer Science 2024-01-17 Muhammad Asif Khan, Hamid Menouar, Ridha Hamila

Deep reinforcement learning (RL) has shown great empirical successes, but suffers from brittleness and sample inefficiency. A potential remedy is to use a previously-trained policy as a source of supervision. In this work, we refer to these…

Machine Learning · Computer Science 2021-09-16 Daniel Seita, Abhinav Gopal, Zhao Mandi, John Canny

Neural Machine Translation (NMT) models are typically trained on heterogeneous data that are concatenated and randomly shuffled. However, not all of the training data are equally useful to the model. Curriculum training aims to present the…

Computation and Language · Computer Science 2022-03-29 Tasnim Mohiuddin, Philipp Koehn, Vishrav Chaudhary, James Cross, Shruti Bhosale, Shafiq Joty

Curriculum learning, organizing training data from easy to hard, has improved efficiency across machine learning domains, yet remains underexplored for language model pretraining. We present the first systematic investigation of curriculum…

Computation and Language · Computer Science 2026-01-29 Yang Zhang, Amr Mohamed, Hadi Abdine, Guokan Shang, Michalis Vazirgiannis

It is common knowledge that the quantity and quality of the training data play a significant role in the creation of a good machine learning model. In this paper, we take it one step further and demonstrate that the way the training…

Audio and Speech Processing · Electrical Eng. & Systems 2022-08-12 Georgios Karakasidis, Tamás Grósz, Mikko Kurimo