Related papers: Curriculum Learning for Small Code Language Models

200 papers

Learning-based techniques, especially advanced pre-trained models for code have demonstrated capabilities in code understanding and generation, solving diverse software engineering (SE) tasks. Despite the promising results, current training…

Software Engineering · Computer Science 2025-02-07 Kyi Shin Khant, Hong Yi Lin, Patanamon Thongtanunam

For specialized domains, there is often not a wealth of data with which to train large machine learning models. In such limited data / compute settings, various methods exist aiming to $\textit{do more with less}$, such as finetuning from a…

Machine Learning · Computer Science 2024-10-22 Rohan Saha, Abrar Fahim, Alona Fyshe, Alex Murphy

Curriculum learning (organizing training data from easy to hard) has improved efficiency across machine learning domains, yet remains underexplored for language model pretraining. We present the first systematic investigation of curriculum…

Computation and Language · Computer Science 2026-01-29 Yang Zhang, Amr Mohamed, Hadi Abdine, Guokan Shang, Michalis Vazirgiannis

Large language models (LLMs) are becoming increasingly better at a wide range of Natural Language Processing (NLP) tasks, such as text generation and understanding. Recently, these models have extended their capabilities to coding tasks,…

Machine Learning · Computer Science 2024-10-23 Nishat Raihan, Mohammed Latif Siddiq, Joanna C. S. Santos, Marcos Zampieri

Language Models like ELMo and BERT have provided robust representations of natural language, which serve as the language understanding component for a diverse range of downstream tasks. Curriculum learning is a method that employs a…

Computation and Language · Computer Science 2021-08-05 Daniel Campos

Curriculum learning, a training technique where data is presented to the model in order of example difficulty (e.g., from simpler to more complex documents), has shown limited success for pre-training language models. In this work, we…

Computation and Language · Computer Science 2025-09-29 Loris Schoenegger, Lukas Thoma, Terra Blevins, Benjamin Roth

Recently, code-oriented large language models (LLMs) have demonstrated strong capabilities in translating natural language into executable code. Text-to-SQL is a significant application of this ability, enabling non-technical users to…

Artificial Intelligence · Computer Science 2026-04-21 Salmane Chafik, Saad Ezzini, Ismail Berrada

Existing curriculum learning approaches to Neural Machine Translation (NMT) require sampling sufficient amounts of "easy" samples from training data at the early training stage. This is not always achievable for low-resource languages where…

Computation and Language · Computer Science 2021-03-23 Chen Liang, Haoming Jiang, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao, Tuo Zhao

Curriculum Learning emphasizes the order of training instances in a computational learning setup. The core hypothesis is that simpler instances should be learned early as building blocks to learn more complex ones. Despite its usefulness,…

Computation and Language · Computer Science 2016-11-21 Volkan Cirik, Eduard Hovy, Louis-Philippe Morency

A curriculum is a planned sequence of learning materials and an effective one can make learning efficient and effective for both humans and machines. Recent studies developed effective data-driven curriculum learning approaches for training…

Machine Learning · Computer Science 2023-07-19 Nidhi Vakil, Hadi Amiri

The advent of Large Language Models (LLMs) has significantly advanced the field of automated code generation. LLMs rely on large and diverse datasets to learn syntax, semantics, and usage patterns of programming languages. For low-resource…

Software Engineering · Computer Science 2025-02-03 Alessandro Giagnorio, Alberto Martin-Lopez, Gabriele Bavota

Curriculum learning provides a systematic approach to training. It refines training progressively, tailors training to task requirements, and improves generalization through exposure to diverse examples. We present a curriculum learning…

Computation and Language · Computer Science 2023-11-23 Nidhi Vakil, Hadi Amiri

Large language models are increasingly trained on corpora containing both natural language and non-linguistic data like source code. Aside from aiding programming-related tasks, anecdotal evidence suggests that including code in pretraining…

Computation and Language · Computer Science 2025-02-26 Jackson Petty, Sjoerd van Steenkiste, Tal Linzen

Large language models have become extremely popular recently due to their ability to achieve strong performance on a variety of tasks, such as text generation and rewriting, but their size and computation cost make them difficult to access,…

Computation and Language · Computer Science 2026-01-08 Anthony Lamelas

The increasing demand for programming language education and growing class sizes require immediate and personalized feedback. However, traditional code review methods have limitations in providing this level of feedback. As the capabilities…

Software Engineering · Computer Science 2025-06-23 Lee Dong-Kyu

Curriculum learning (CL) aims to increase the performance of a learner on a given task by applying a specialized learning strategy. This strategy focuses on either the dataset, the task, or the model. There is little to no work analysing…

Machine Learning · Computer Science 2023-11-08 Luca Scharr, Vanessa Toborek

Recent studies have shown that code language models at scale demonstrate significant performance gains on downstream tasks, i.e., code generation. However, most of the existing works on code representation learning train models at a hundred…

Computation and Language · Computer Science 2024-02-06 Dejiao Zhang, Wasi Ahmad, Ming Tan, Hantian Ding, Ramesh Nallapati, Dan Roth, Xiaofei Ma, Bing Xiang

Neural combinatorial optimization (NCO) aims at designing problem-independent and efficient neural network-based strategies for solving combinatorial problems. The field recently experienced growth by successfully adapting architectures…

Machine Learning · Computer Science 2020-11-13 Michal Lisicki, Arash Afkanpour, Graham W. Taylor

Training machine learning models in a meaningful order, from the easy samples to the hard ones, using curriculum learning can provide performance improvements over the standard training approach based on random data shuffling, without any…

Machine Learning · Computer Science 2022-04-12 Petru Soviany, Radu Tudor Ionescu, Paolo Rota, Nicu Sebe

We describe our team's contribution to the STRICT-SMALL track of the BabyLM Challenge. The challenge requires training a language model from scratch using only a relatively small training dataset of ten million words. We experiment with…

Computation and Language · Computer Science 2023-11-16 Richard Diehl Martinez, Zebulon Goriely, Hope McGovern, Christopher Davis, Andrew Caines, Paula Buttery, Lisa Beinborn