Related papers: SMART: Self-supervised Multi-task pretrAining with…

SiT: Self-supervised vIsion Transformer

Self-supervised learning methods are gaining increasing traction in computer vision due to their recent success in reducing the gap with supervised learning. In natural language processing (NLP) self-supervised learning and transformers are…

Computer Vision and Pattern Recognition · Computer Science 2022-12-29 Sara Atito, Muhammad Awais, Josef Kittler

Self-supervised Pretraining for Decision Foundation Model: Formulation, Pipeline and Challenges

Decision-making is a dynamic process requiring perception, memory, and reasoning to make choices and find optimal policies. Traditional approaches to decision-making suffer from sample efficiency and generalization, while large-scale…

Machine Learning · Computer Science 2024-01-08 Xiaoqian Liu, Jianbin Jiao, Junge Zhang

SMART: A Spectral Transfer Approach to Multi-Task Learning

Multi-task learning is effective for related applications, but its performance can deteriorate when the target sample size is small. Transfer learning can borrow strength from related studies; yet, many existing methods rely on restrictive…

Machine Learning · Computer Science 2026-04-23 Boxin Zhao, Mladen Kolar, Jinchi Lv

LLM-Inspired Pretrain-Then-Finetune for Small-Data, Large-Scale Optimization

We consider small-data, large-scale decision problems in which a firm must make many operational decisions simultaneously (e.g., across a large product portfolio) while observing only a few, potentially noisy, data points per instance.…

Machine Learning · Computer Science 2026-02-04 Zishi Zhang, Jinhui Han, Ming Hu, Yijie Peng

On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning

Intelligent agents should have the ability to leverage knowledge from previously learned tasks in order to learn new ones quickly and efficiently. Meta-learning approaches have emerged as a popular solution to achieve this. However,…

Machine Learning · Computer Science 2023-02-17 Zhao Mandi, Pieter Abbeel, Stephen James

SLCA++: Unleash the Power of Sequential Fine-tuning for Continual Learning with Pre-training

In recent years, continual learning with pre-training (CLPT) has received widespread interest, instead of its traditional focus of training from scratch. The use of strong pre-trained models (PTMs) can greatly facilitate knowledge transfer…

Computer Vision and Pattern Recognition · Computer Science 2024-08-16 Gengwei Zhang, Liyuan Wang, Guoliang Kang, Ling Chen, Yunchao Wei

A Meta-Reinforcement Learning Approach to Process Control

Meta-learning is a branch of machine learning which aims to quickly adapt models, such as neural networks, to perform new tasks by learning an underlying structure across related tasks. In essence, models are being trained to learn new…

Machine Learning · Computer Science 2021-11-16 Daniel G. McClement, Nathan P. Lawrence, Philip D. Loewen, Michael G. Forbes, Johan U. Backström, R. Bhushan Gopaluni

SMART: Submodular Data Mixture Strategy for Instruction Tuning

Instruction Tuning involves finetuning a language model on a collection of instruction-formatted datasets in order to enhance the generalizability of the model to unseen tasks. Studies have shown the importance of balancing different task…

Computation and Language · Computer Science 2024-07-16 H S V N S Kowndinya Renduchintala, Sumit Bhatia, Ganesh Ramakrishnan

SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization

Transfer learning has fundamentally changed the landscape of natural language processing (NLP) research. Many existing state-of-the-art models are first pre-trained on a large text corpus and then fine-tuned on downstream tasks. However,…

Computation and Language · Computer Science 2021-09-10 Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, Tuo Zhao

SMART: Self-learning Meta-strategy Agent for Reasoning Tasks

Tasks requiring deductive reasoning, especially those involving multiple steps, often demand adaptive strategies such as intermediate generation of rationales or programs, as no single approach is universally optimal. While Language Models…

Artificial Intelligence · Computer Science 2024-10-22 Rongxing Liu, Kumar Shridhar, Manish Prajapat, Patrick Xia, Mrinmaya Sachan

Effective Adaptation in Multi-Task Co-Training for Unified Autonomous Driving

Aiming towards a holistic understanding of multiple downstream tasks simultaneously, there is a need for extracting features with better transferability. Though many latest self-supervised pre-training methods have achieved impressive…

Computer Vision and Pattern Recognition · Computer Science 2022-09-20 Xiwen Liang, Yangxin Wu, Jianhua Han, Hang Xu, Chunjing Xu, Xiaodan Liang

Self-Distillation for Further Pre-training of Transformers

Pre-training a large transformer model on a massive amount of unlabeled data and fine-tuning it on labeled datasets for diverse downstream tasks has proven to be a successful strategy, for a variety of vision and natural language processing…

Computer Vision and Pattern Recognition · Computer Science 2023-06-12 Seanie Lee, Minki Kang, Juho Lee, Sung Ju Hwang, Kenji Kawaguchi

Speech Representation Learning Through Self-supervised Pretraining And Multi-task Finetuning

Speech representation learning plays a vital role in speech processing. Among them, self-supervised learning (SSL) has become an important research direction. It has been shown that an SSL pretraining model can achieve excellent performance…

Audio and Speech Processing · Electrical Eng. & Systems 2021-10-20 Yi-Chen Chen, Shu-wen Yang, Cheng-Kuang Lee, Simon See, Hung-yi Lee

Meta-learning for downstream aware and agnostic pretraining

Neural network pretraining is gaining attention due to its outstanding performance in natural language processing applications. However, pretraining usually leverages predefined task sequences to learn general linguistic clues. The lack of…

Computation and Language · Computer Science 2021-06-08 Hongyin Luo, Shuyan Dong, Yung-Sung Chuang, Shang-Wen Li

Self-Supervised Contrastive Pre-Training for Multivariate Point Processes

Self-supervision is one of the hallmarks of representation learning in the increasingly popular suite of foundation models including large language models such as BERT and GPT-3, but it has not been pursued in the context of multivariate…

Machine Learning · Computer Science 2024-02-05 Xiao Shou, Dharmashankar Subramanian, Debarun Bhattacharjya, Tian Gao, Kristin P. Bennett

Hierarchically Self-Supervised Transformer for Human Skeleton Representation Learning

Despite the success of fully-supervised human skeleton sequence modeling, utilizing self-supervised pre-training for skeleton sequence representation learning has been an active field because acquiring task-specific skeleton annotations at…

Computer Vision and Pattern Recognition · Computer Science 2023-03-28 Yuxiao Chen, Long Zhao, Jianbo Yuan, Yu Tian, Zhaoyang Xia, Shijie Geng, Ligong Han, Dimitris N. Metaxas

Consecutive Pretraining: A Knowledge Transfer Learning Strategy with Relevant Unlabeled Data for Remote Sensing Domain

Currently, under supervised learning, a model pretrained by a large-scale nature scene dataset and then fine-tuned on a few specific task labeling data is the paradigm that has dominated the knowledge transfer learning. It has reached the…

Computer Vision and Pattern Recognition · Computer Science 2022-09-15 Tong Zhang, Peng Gao, Hao Dong, Yin Zhuang, Guanqun Wang, Wei Zhang, He Chen

ConBaT: Control Barrier Transformer for Safe Policy Learning

Large-scale self-supervised models have recently revolutionized our ability to perform a variety of tasks within the vision and language domains. However, using such models for autonomous systems is challenging because of safety…

Robotics · Computer Science 2023-03-09 Yue Meng, Sai Vemprala, Rogerio Bonatti, Chuchu Fan, Ashish Kapoor

SEPT: Towards Scalable and Efficient Visual Pre-Training

Recently, the self-supervised pre-training paradigm has shown great potential in leveraging large-scale unlabeled data to improve downstream task performance. However, increasing the scale of unlabeled pre-training data in real-world…

Computer Vision and Pattern Recognition · Computer Science 2022-12-13 Yiqi Lin, Huabin Zheng, Huaping Zhong, Jinjing Zhu, Weijia Li, Conghui He, Lin Wang

CTA: Cross-Task Alignment for Better Test Time Training

Deep learning models have demonstrated exceptional performance across a wide range of computer vision tasks. However, their performance often degrades significantly when faced with distribution shifts, such as domain or dataset changes.…

Computer Vision and Pattern Recognition · Computer Science 2025-07-09 Samuel Barbeau, Pedram Fekri, David Osowiechi, Ali Bahri, Moslem Yazdanpanah, Masih Aminbeidokhti, Christian Desrosiers