Related papers: Ludwig: a type-based declarative deep learning too…

Incubating Text Classifiers Following User Instruction with Nothing but LLM

In this paper, we aim to generate text classification data given arbitrary class definitions (i.e., user instruction), so one can train a small text classifier without any human annotation or raw corpus. Compared with pioneer attempts, our…

Computation and Language · Computer Science 2024-05-21 Letian Peng, Jingbo Shang

Test-Time Learning with an Evolving Library

We introduce EvoLib, a test-time learning framework that enables large language models to accumulate, reuse, and evolve knowledge across problem instances without parameter updates or external supervision. Instead of adapting model…

Machine Learning · Computer Science 2026-05-15 Weijia Xu, Alessandro Sordoni, Chandan Singh, Zelalem Gero, Michel Galley, Xingdi Yuan, Jianfeng Gao

ELUDE: Generating interpretable explanations via a decomposition into labelled and unlabelled features

Deep learning models have achieved remarkable success in different areas of machine learning over the past decade; however, the size and complexity of these models make them difficult to understand. In an effort to make them more…

Computer Vision and Pattern Recognition · Computer Science 2022-06-20 Vikram V. Ramaswamy, Sunnie S. Y. Kim, Nicole Meister, Ruth Fong, Olga Russakovsky

LEAD: Iterative Data Selection for Efficient LLM Instruction Tuning

Instruction tuning has emerged as a critical paradigm for improving the capabilities and alignment of large language models (LLMs). However, existing iterative model-aware data selection methods incur significant computational overhead, as…

Machine Learning · Computer Science 2025-05-13 Xiaotian Lin, Yanlin Qi, Yizhang Zhu, Themis Palpanas, Chengliang Chai, Nan Tang, Yuyu Luo

Randomized LU decomposition: An Algorithm for Dictionaries Construction

In recent years, distinctive-dictionary construction has gained importance due to its usefulness in data processing. Usually, one or more dictionaries are constructed from training data and are then used to classify signals that did…

Computer Vision and Pattern Recognition · Computer Science 2018-01-30 Aviv Rotbart, Gil Shabat, Yaniv Shmueli, Amir Averbuch

LAUD: Integrating Large Language Models with Active Learning for Unlabeled Data

Large language models (LLMs) have shown a remarkable ability to generalize beyond their pre-training data, and fine-tuning LLMs can elevate performance to human-level and beyond. However, in real-world scenarios, lacking labeled data often…

Machine Learning · Computer Science 2025-11-19 Tzu-Hsuan Chou, Chun-Nan Chou

LADDER: Self-Improving LLMs Through Recursive Problem Decomposition

We introduce LADDER (Learning through Autonomous Difficulty-Driven Example Recursion), a framework which enables Large Language Models to autonomously improve their problem-solving capabilities through self-guided learning by recursively…

Machine Learning · Computer Science 2025-03-06 Toby Simonds, Akira Yoshiyama

A flexible, extensible software framework for model compression based on the LC algorithm

We propose a software framework based on the ideas of the Learning-Compression (LC) algorithm, that allows a user to compress a neural network or other machine learning model using different compression schemes with minimal effort.…

Machine Learning · Computer Science 2020-05-19 Yerlan Idelbayev, Miguel Á. Carreira-Perpiñán

ShapeLib: Designing a library of programmatic 3D shape abstractions with Large Language Models

We present ShapeLib, the first method that leverages the priors of LLMs to design libraries of programmatic 3D shape abstractions. Our system accepts two forms of design intent: text descriptions of functions to include in the library and a…

Computer Vision and Pattern Recognition · Computer Science 2025-06-23 R. Kenny Jones, Paul Guerrero, Niloy J. Mitra, Daniel Ritchie

Coded Deep Learning: Framework and Algorithm

The success of deep learning (DL) is often achieved with large models and high complexity during both training and post-training inferences, hindering training in resource-limited settings. To alleviate these issues, this paper introduces a…

Machine Learning · Computer Science 2025-01-20 En-hui Yang, Shayan Mohajer Hamidi

KDFlow: A User-Friendly and Efficient Knowledge Distillation Framework for Large Language Models

Knowledge distillation (KD) is an essential technique to compress large language models (LLMs) into smaller ones. However, despite the distinct roles of the student model and the teacher model in KD, most existing frameworks still use a…

Computation and Language · Computer Science 2026-03-25 Songming Zhang, Xue Zhang, Tong Zhang, Bojie Hu, Yufeng Chen, Jinan Xu

On Deep Unsupervised Active Learning

Unsupervised active learning has attracted increasing attention in recent years; its goal is to select representative samples in an unsupervised setting for human annotation. Most existing works are based on shallow linear models by…

Machine Learning · Computer Science 2020-07-29 Changsheng Li, Handong Ma, Zhao Kang, Ye Yuan, Xiao-Yu Zhang, Guoren Wang

Learning with Less: Knowledge Distillation from Large Language Models via Unlabeled Data

In real-world NLP applications, Large Language Models (LLMs) offer promising solutions due to their extensive training on vast datasets. However, the large size and high computation demands of LLMs limit their practicality in many…

Artificial Intelligence · Computer Science 2025-04-01 Juanhui Li, Sreyashi Nag, Hui Liu, Xianfeng Tang, Sheikh Sarwar, Limeng Cui, Hansu Gu, Suhang Wang, Qi He, Jiliang Tang

Enabling Deep Learning on Edge Devices through Filter Pruning and Knowledge Transfer

Deep learning models have introduced various intelligent applications to edge devices, such as image classification, speech recognition, and augmented reality. There is an increasing need to train such models on the devices in order to…

Machine Learning · Computer Science 2022-01-27 Kaiqi Zhao, Yitao Chen, Ming Zhao

DeepCloud. The Application of a Data-driven, Generative Model in Design

Generative systems have a significant potential to synthesize innovative design alternatives. Still, most of the common systems that have been adopted in design require the designer to explicitly define the specifications of the procedures…

Machine Learning · Computer Science 2019-04-03 Ardavan Bidgoli, Pedro Veloso

ToolACE-DEV: Self-Improving Tool Learning via Decomposition and EVolution

The tool-using capability of large language models (LLMs) enables them to access up-to-date external information and handle complex tasks. Current approaches to enhancing this capability primarily rely on distilling advanced models by data…

Computation and Language · Computer Science 2025-05-13 Xu Huang, Weiwen Liu, Xingshan Zeng, Yuefeng Huang, Xinlong Hao, Yuxian Wang, Yirong Zeng, Chuhan Wu, Yasheng Wang, Ruiming Tang, Defu Lian

CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models

Large Language Models (LLMs) have made significant progress in utilizing tools, but their ability is limited by API availability and the instability of implicit reasoning, particularly when both planning and execution are involved. To…

Computation and Language · Computer Science 2024-06-24 Cheng Qian, Chi Han, Yi R. Fung, Yujia Qin, Zhiyuan Liu, Heng Ji

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

Lingvo is a TensorFlow framework offering a complete solution for collaborative deep learning research, with a particular focus on sequence-to-sequence models. Lingvo models are composed of modular building blocks that are flexible and…

Machine Learning · Computer Science 2019-02-25 Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia X. Chen, Ye Jia, Anjuli Kannan, Tara Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, Bowen Liang, HyoukJoong Lee, Ciprian Chelba, Sébastien Jean, Bo Li, Melvin Johnson, Rohan Anil, Rajat Tibrewal, Xiaobing Liu, Akiko Eriguchi, Navdeep Jaitly, Naveen Ari, Colin Cherry, Parisa Haghani, Otavio Good, Youlong Cheng, Raziel Alvarez, Isaac Caswell, Wei-Ning Hsu, Zongheng Yang, Kuan-Chieh Wang, Ekaterina Gonina, Katrin Tomanek, Ben Vanik, Zelin Wu, Llion Jones, Mike Schuster, Yanping Huang, Dehao Chen, Kazuki Irie, George Foster, John Richardson, Klaus Macherey, Antoine Bruguier, Heiga Zen, Colin Raffel, Shankar Kumar, Kanishka Rao, David Rybach, Matthew Murray, Vijayaditya Peddinti, Maxim Krikun, Michiel A. U. Bacchiani, Thomas B. Jablin, Rob Suderman, Ian Williams, Benjamin Lee, Deepti Bhatia, Justin Carlson, Semih Yavuz, Yu Zhang, Ian McGraw, Max Galkin, Qi Ge, Golan Pundak, Chad Whipkey, Todd Wang, Uri Alon, Dmitry Lepikhin, Ye Tian, Sara Sabour, William Chan, Shubham Toshniwal, Baohua Liao, Michael Nirschl, Pat Rondon

Design Automation for Fast, Lightweight, and Effective Deep Learning Models: A Survey

Deep learning technologies have demonstrated remarkable effectiveness in a wide range of tasks, and deep learning holds the potential to advance a multitude of applications, including in edge computing, where deep models are deployed on…

Machine Learning · Computer Science 2022-08-24 Dalin Zhang, Kaixuan Chen, Yan Zhao, Bin Yang, Lina Yao, Christian S. Jensen

DeepInnovator: Triggering the Innovative Capabilities of LLMs

The application of Large Language Models (LLMs) in accelerating scientific discovery has garnered increasing attention, with a key focus on constructing research agents endowed with innovative capability, i.e., the ability to autonomously…

Computation and Language · Computer Science 2026-02-24 Tianyu Fan, Fengji Zhang, Yuxiang Zheng, Bei Chen, Xinyao Niu, Chengen Huang, Junyang Lin, Chao Huang