Related papers: Deep Distilling: automated code generation using e…

Distilling Interpretable Models into Human-Readable Code

The goal of model distillation is to faithfully transfer teacher model knowledge to a model which is faster, more generalizable, more interpretable, or possesses other desirable characteristics. Human-readability is an important and…

Machine Learning · Computer Science 2021-02-10 Walker Ravina, Ethan Sterling, Olexiy Oryeshko, Nathan Bell, Honglei Zhuang, Xuanhui Wang, Yonghui Wu, Alexander Grushetsky

Distilling Model Knowledge

Top-performing machine learning systems, such as deep neural networks, large ensembles and complex probabilistic graphical models, can be expensive to store, slow to evaluate and hard to integrate into larger systems. Ideally, we would like…

Machine Learning · Statistics 2015-10-09 George Papamakarios

Distilling Algorithmic Reasoning from LLMs via Explaining Solution Programs

Distilling explicit chain-of-thought reasoning paths has emerged as an effective method for improving the reasoning abilities of large language models (LLMs) across various tasks. However, when tackling complex tasks that pose significant…

Computation and Language · Computer Science 2024-04-15 Jierui Li, Raymond Mooney

Reasoning Distillation and Structural Alignment for Improved Code Generation

Effective code generation with language models hinges on two critical factors: accurately understanding the intent of the prompt and generating code that applies algorithmic reasoning to produce correct solutions capable of passing diverse…

Artificial Intelligence · Computer Science 2025-10-21 Amir Jalilifard, Anderson de Rezende Rocha, Marcos Medeiros Raimundo

Improving the Interpretability of Deep Neural Networks with Knowledge Distillation

Deep Neural Networks have achieved huge success at a wide spectrum of applications from language modeling, computer vision to speech recognition. However, nowadays, good performance alone is not sufficient to satisfy the needs of practical…

Machine Learning · Computer Science 2018-12-31 Xuan Liu, Xiaoguang Wang, Stan Matwin

What is Dataset Distillation Learning?

Dataset distillation has emerged as a strategy to overcome the hurdles associated with large datasets by learning a compact set of synthetic data that retains essential information from the original dataset. While distilled data can be used…

Machine Learning · Computer Science 2024-07-23 William Yang, Ye Zhu, Zhiwei Deng, Olga Russakovsky

Towards a theory of model distillation

Distillation is the task of replacing a complicated machine learning model with a simpler model that approximates the original [BCNM06,HVD15]. Despite many practical applications, basic questions about the extent to which models can be…

Machine Learning · Computer Science 2024-05-07 Enric Boix-Adsera

PaD: Program-aided Distillation Can Teach Small Models Reasoning Better than Chain-of-thought Fine-tuning

While large language models (LLMs) excel in various natural language processing tasks, their huge size and the inaccessibility of parameters present challenges for practical deployment. Previous studies try to distill task-specific ability…

Computation and Language · Computer Science 2024-03-21 Xuekai Zhu, Biqing Qi, Kaiyan Zhang, Xinwei Long, Zhouhan Lin, Bowen Zhou

Explainable RL Policies by Distilling to Locally-Specialized Linear Policies with Voronoi State Partitioning

Deep Reinforcement Learning is one of the state-of-the-art methods for producing near-optimal system controllers. However, deep RL algorithms train a deep neural network that lacks transparency, which poses challenges when the controller…

Machine Learning · Computer Science 2025-11-18 Senne Deproost, Dennis Steckelmacher, Ann Nowé

DeepDistill: Enhancing LLM Reasoning Capabilities via Large-Scale Difficulty-Graded Data Training

Although large language models (LLMs) have recently achieved remarkable performance on various complex reasoning benchmarks, the academic community still lacks an in-depth understanding of base model training processes and data quality. To…

Computation and Language · Computer Science 2025-05-14 Xiaoyu Tian, Sitong Zhao, Haotian Wang, Shuaiting Chen, Yiping Peng, Yunjie Ji, Han Zhao, Xiangang Li

Dataset Distillation Efficiently Encodes Low-Dimensional Representations from Gradient-Based Learning of Non-Linear Tasks

Dataset distillation, a training-aware data compression technique, has recently attracted increasing attention as an effective tool for mitigating costs of optimization and data storage. However, progress remains largely empirical.…

Machine Learning · Computer Science 2026-03-31 Yuri Kinoshita, Naoki Nishikawa, Taro Toyoizumi

Searching for internal symbols underlying deep learning

Deep learning (DL) enables deep neural networks (DNNs) to automatically learn complex tasks or rules from given examples without instructions or guiding principles. As we do not engineer DNNs' functions, it is extremely difficult to…

Machine Learning · Computer Science 2024-11-19 Jung H. Lee, Sujith Vijayan

Data Distillation for Text Classification

Deep learning techniques have achieved great success in many fields, while at the same time deep learning models are getting more complex and expensive to compute. It severely hinders the wide applications of these models. In order to…

Computation and Language · Computer Science 2021-04-20 Yongqi Li, Wenjie Li

Harnessing Deep Neural Networks with Logic Rules

Combining deep neural networks with structured logic rules is desirable to harness flexibility and reduce uninterpretability of the neural models. We propose a general framework capable of enhancing various types of neural networks (e.g.,…

Machine Learning · Computer Science 2020-08-11 Zhiting Hu, Xuezhe Ma, Zhengzhong Liu, Eduard Hovy, Eric Xing

The Valley of Code Reasoning: Scaling Knowledge Distillation of Large Language Models

Distilling the thinking traces of a Large Language Model (LLM) with reasoning capabilities into a smaller model has been proven effective. Yet, there is a scarcity of work done on how model performances scale with the quantity of…

Computation and Language · Computer Science 2025-10-08 Muyu He, Muhammad Ali Shafique, Anand Kumar, Tsach Mackey, Nazneen Rajani

Deep Explainable Learning with Graph Based Data Assessing and Rule Reasoning

Learning an explainable classifier often results in a low-accuracy model or ends up with a huge rule set, while a deep model is usually more capable of handling noisy data at scale, but at the cost of making the result hard to explain and…

Artificial Intelligence · Computer Science 2022-11-11 Yuanlong Li, Gaopan Huang, Min Zhou, Chuan Fu, Honglin Qiao, Yan He

Learning to Generate Synthetic Training Data using Gradient Matching and Implicit Differentiation

Using huge training datasets can be costly and inconvenient. This article explores various data distillation techniques that can reduce the amount of data required to successfully train deep networks. Inspired by recent ideas, we suggest…

Machine Learning · Computer Science 2022-03-17 Dmitry Medvedev, Alexander D'yakonov

A Probabilistic Theory of Deep Learning

A grand challenge in machine learning is the development of computational algorithms that match or outperform humans in perceptual inference tasks that are complicated by nuisance variation. For instance, visual object recognition involves…

Machine Learning · Statistics 2015-04-03 Ankit B. Patel, Tan Nguyen, Richard G. Baraniuk

Knowledge Distillation of Convolutional Neural Networks through Feature Map Transformation using Decision Trees

The interpretation of reasoning by Deep Neural Networks (DNN) is still challenging due to their perceived black-box nature. Therefore, deploying DNNs in several real-world tasks is restricted by the lack of transparency of these models. We…

Computer Vision and Pattern Recognition · Computer Science 2024-03-12 Maddimsetti Srinivas, Debdoot Sheet

Learning with Differentiable Algorithms

Classic algorithms and machine learning systems like neural networks are both abundant in everyday life. While classic computer science algorithms are suitable for precise execution of exactly defined tasks such as finding the shortest path…

Machine Learning · Computer Science 2022-09-02 Felix Petersen