Related papers: Ludwig: a type-based declarative deep learning too…

200 papers

In this paper, we aim to generate text classification data given arbitrary class definitions (i.e., user instruction), so one can train a small text classifier without any human annotation or raw corpus. Compared with pioneer attempts, our…

Computation and Language · Computer Science 2024-05-21 Letian Peng, Jingbo Shang

We introduce EvoLib, a test-time learning framework that enables large language models to accumulate, reuse, and evolve knowledge across problem instances without parameter updates or external supervision. Instead of adapting model…

Machine Learning · Computer Science 2026-05-15 Weijia Xu, Alessandro Sordoni, Chandan Singh, Zelalem Gero, Michel Galley, Xingdi Yuan, Jianfeng Gao

Deep learning models have achieved remarkable success in different areas of machine learning over the past decade; however, the size and complexity of these models make them difficult to understand. In an effort to make them more…

Computer Vision and Pattern Recognition · Computer Science 2022-06-20 Vikram V. Ramaswamy, Sunnie S. Y. Kim, Nicole Meister, Ruth Fong, Olga Russakovsky

Instruction tuning has emerged as a critical paradigm for improving the capabilities and alignment of large language models (LLMs). However, existing iterative model-aware data selection methods incur significant computational overhead, as…

Machine Learning · Computer Science 2025-05-13 Xiaotian Lin, Yanlin Qi, Yizhang Zhu, Themis Palpanas, Chengliang Chai, Nan Tang, Yuyu Luo

In recent years, distinctive-dictionary construction has gained importance due to its usefulness in data processing. Usually, one or more dictionaries are constructed from training data and then used to classify signals that did…

Computer Vision and Pattern Recognition · Computer Science 2018-01-30 Aviv Rotbart, Gil Shabat, Yaniv Shmueli, Amir Averbuch

Large language models (LLMs) have shown a remarkable ability to generalize beyond their pre-training data, and fine-tuning LLMs can elevate performance to human-level and beyond. However, in real-world scenarios, lacking labeled data often…

Machine Learning · Computer Science 2025-11-19 Tzu-Hsuan Chou, Chun-Nan Chou

We introduce LADDER (Learning through Autonomous Difficulty-Driven Example Recursion), a framework which enables Large Language Models to autonomously improve their problem-solving capabilities through self-guided learning by recursively…

Machine Learning · Computer Science 2025-03-06 Toby Simonds, Akira Yoshiyama

We propose a software framework based on the ideas of the Learning-Compression (LC) algorithm, that allows a user to compress a neural network or other machine learning model using different compression schemes with minimal effort.…

Machine Learning · Computer Science 2020-05-19 Yerlan Idelbayev, Miguel Á. Carreira-Perpiñán

We present ShapeLib, the first method that leverages the priors of LLMs to design libraries of programmatic 3D shape abstractions. Our system accepts two forms of design intent: text descriptions of functions to include in the library and a…

Computer Vision and Pattern Recognition · Computer Science 2025-06-23 R. Kenny Jones, Paul Guerrero, Niloy J. Mitra, Daniel Ritchie

The success of deep learning (DL) is often achieved with large models and high complexity during both training and post-training inferences, hindering training in resource-limited settings. To alleviate these issues, this paper introduces a…

Machine Learning · Computer Science 2025-01-20 En-hui Yang, Shayan Mohajer Hamidi

Knowledge distillation (KD) is an essential technique to compress large language models (LLMs) into smaller ones. However, despite the distinct roles of the student model and the teacher model in KD, most existing frameworks still use a…

Computation and Language · Computer Science 2026-03-25 Songming Zhang, Xue Zhang, Tong Zhang, Bojie Hu, Yufeng Chen, Jinan Xu

Unsupervised active learning has attracted increasing attention in recent years; its goal is to select representative samples in an unsupervised setting for human annotation. Most existing works are based on shallow linear models by…

Machine Learning · Computer Science 2020-07-29 Changsheng Li, Handong Ma, Zhao Kang, Ye Yuan, Xiao-Yu Zhang, Guoren Wang

In real-world NLP applications, Large Language Models (LLMs) offer promising solutions due to their extensive training on vast datasets. However, the large size and high computation demands of LLMs limit their practicality in many…

Artificial Intelligence · Computer Science 2025-04-01 Juanhui Li, Sreyashi Nag, Hui Liu, Xianfeng Tang, Sheikh Sarwar, Limeng Cui, Hansu Gu, Suhang Wang, Qi He, Jiliang Tang

Deep learning models have introduced various intelligent applications to edge devices, such as image classification, speech recognition, and augmented reality. There is an increasing need to train such models on the devices in order to…

Machine Learning · Computer Science 2022-01-27 Kaiqi Zhao, Yitao Chen, Ming Zhao

Generative systems have a significant potential to synthesize innovative design alternatives. Still, most of the common systems that have been adopted in design require the designer to explicitly define the specifications of the procedures…

Machine Learning · Computer Science 2019-04-03 Ardavan Bidgoli, Pedro Veloso

The tool-using capability of large language models (LLMs) enables them to access up-to-date external information and handle complex tasks. Current approaches to enhancing this capability primarily rely on distilling advanced models by data…

Computation and Language · Computer Science 2025-05-13 Xu Huang, Weiwen Liu, Xingshan Zeng, Yuefeng Huang, Xinlong Hao, Yuxian Wang, Yirong Zeng, Chuhan Wu, Yasheng Wang, Ruiming Tang, Defu Lian

Large Language Models (LLMs) have made significant progress in utilizing tools, but their ability is limited by API availability and the instability of implicit reasoning, particularly when both planning and execution are involved. To…

Computation and Language · Computer Science 2024-06-24 Cheng Qian, Chi Han, Yi R. Fung, Yujia Qin, Zhiyuan Liu, Heng Ji

Lingvo is a TensorFlow framework offering a complete solution for collaborative deep learning research, with a particular focus on sequence-to-sequence models. Lingvo models are composed of modular building blocks that are flexible and…

Deep learning technologies have demonstrated remarkable effectiveness in a wide range of tasks, and deep learning holds the potential to advance a multitude of applications, including in edge computing, where deep models are deployed on…

Machine Learning · Computer Science 2022-08-24 Dalin Zhang, Kaixuan Chen, Yan Zhao, Bin Yang, Lina Yao, Christian S. Jensen

The application of Large Language Models (LLMs) in accelerating scientific discovery has garnered increasing attention, with a key focus on constructing research agents endowed with innovative capability, i.e., the ability to autonomously…

Computation and Language · Computer Science 2026-02-24 Tianyu Fan, Fengji Zhang, Yuxiang Zheng, Bei Chen, Xinyao Niu, Chengen Huang, Junyang Lin, Chao Huang