Related papers: Data-efficient Performance Modeling via Pre-traini…

A Deep Learning Based Cost Model for Automatic Code Optimization

Enabling compilers to automatically optimize code has been a longstanding goal for the compiler community. Efficiently solving this problem requires using precise cost models. These models predict whether applying a sequence of code…

Programming Languages · Computer Science 2021-04-13 Riyadh Baghdadi, Massinissa Merouani, Mohamed-Hicham Leghettas, Kamel Abdous, Taha Arbaoui, Karima Benatchba, Saman Amarasinghe

A Learned Performance Model for Tensor Processing Units

Accurate hardware performance models are critical to efficient code generation. They can be used by compilers to make heuristic decisions, by superoptimizers as a minimization objective, or by autotuners to find an optimal configuration for…

Performance · Computer Science 2021-03-19 Samuel J. Kaufman, Phitchaya Mangpo Phothilimthana, Yanqi Zhou, Charith Mendis, Sudip Roy, Amit Sabne, Mike Burrows

Meta-learning autoencoders for few-shot prediction

Compared to humans, machine learning models generally require significantly more training examples and fail to extrapolate from experience to solve previously unseen challenges. To help close this performance gap, we augment single-task…

Machine Learning · Computer Science 2018-07-27 Tailin Wu, John Peurifoy, Isaac L. Chuang, Max Tegmark

Predictions For Pre-training Language Models

Language model pre-training has proven to be useful in many language understanding tasks. In this paper, we investigate whether it is still helpful to add the self-training method in the pre-training step and the fine-tuning step. Towards…

Computation and Language · Computer Science 2023-02-17 Tong Guo

Comprehensive and Efficient Data Labeling via Adaptive Model Scheduling

Labeling data (e.g., labeling the people, objects, actions and scene in images) comprehensively and efficiently is a widely needed but challenging task. Numerous models were proposed to label various data and many approaches were designed…

Machine Learning · Computer Science 2020-02-14 Mu Yuan, Lan Zhang, Xiang-Yang Li, Hui Xiong

MIREncoder: Multi-modal IR-based Pretrained Embeddings for Performance Optimizations

One of the primary areas of interest in High Performance Computing is the improvement of performance of parallel workloads. Nowadays, compilable source code-based optimization tasks that employ deep learning often exploit LLVM Intermediate…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-07-03 Akash Dutta, Ali Jannesari

MedMAE: A Self-Supervised Backbone for Medical Imaging Tasks

Medical imaging tasks are very challenging due to the lack of publicly available labeled datasets. Hence, it is difficult to achieve high performance with existing deep-learning models as they require a massive labeled dataset to be trained…

Image and Video Processing · Electrical Eng. & Systems 2024-07-23 Anubhav Gupta, Islam Osman, Mohamed S. Shehata, John W. Braun

MTSMAE: Masked Autoencoders for Multivariate Time-Series Forecasting

Large-scale self-supervised pre-training Transformer architecture have significantly boosted the performance for various tasks in natural language processing (NLP) and computer vision (CV). However, there is a lack of researches on…

Machine Learning · Computer Science 2022-10-06 Peiwang Tang, Xianchao Zhang

Making the most of small Software Engineering datasets with modern machine learning

This paper provides a starting point for Software Engineering (SE) researchers and practitioners faced with the problem of training machine learning models on small datasets. Due to the high costs associated with labeling data, in Software…

Software Engineering · Computer Science 2021-06-30 Julian Aron Prenner, Romain Robbes

On the use of Deep Autoencoders for Efficient Embedded Reinforcement Learning

In autonomous embedded systems, it is often vital to reduce the amount of actions taken in the real world and energy required to learn a policy. Training reinforcement learning agents from high dimensional image representations can be very…

Machine Learning · Computer Science 2019-03-26 Bharat Prakash, Mark Horton, Nicholas R. Waytowich, William David Hairston, Tim Oates, Tinoosh Mohsenin

Data-Efficient Operator Learning via Unsupervised Pretraining and In-Context Learning

Recent years have witnessed the promise of coupling machine learning methods and physical domain-specific insights for solving scientific problems based on partial differential equations (PDEs). However, being data-intensive, these methods…

Machine Learning · Computer Science 2025-06-03 Wuyang Chen, Jialin Song, Pu Ren, Shashank Subramanian, Dmitriy Morozov, Michael W. Mahoney

Tempo: Accelerating Transformer-Based Model Training through Memory Footprint Reduction

Training deep learning models can be computationally expensive. Prior works have shown that increasing the batch size can potentially lead to better overall throughput. However, the batch size is frequently limited by the accelerator memory…

Machine Learning · Computer Science 2023-01-25 Muralidhar Andoorveedu, Zhanda Zhu, Bojian Zheng, Gennady Pekhimenko

Improving Code Autocompletion with Transfer Learning

Software language models have achieved promising results predicting code completion usages, and several industry studies have described successful IDE integrations. Recently, accuracy in autocompletion prediction improved 12.8% from…

Software Engineering · Computer Science 2021-10-14 Wen Zhou, Seohyun Kim, Vijayaraghavan Murali, Gareth Ari Aye

Self-Supervised Pre-training with Combined Datasets for 3D Perception in Autonomous Driving

The significant achievements of pre-trained models leveraging large volumes of data in the field of NLP and 2D vision inspire us to explore the potential of extensive data pre-training for 3D perception in autonomous driving. Toward this…

Computer Vision and Pattern Recognition · Computer Science 2025-04-18 Shumin Wang, Zhuoran Yang, Lidian Wang, Zhipeng Tang, Heng Li, Lehan Pan, Sha Zhang, Jie Peng, Jianmin Ji, Yanyong Zhang

VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training

Pre-training video transformers on extra large-scale datasets is generally required to achieve premier performance on relatively small datasets. In this paper, we show that video masked autoencoders (VideoMAE) are data-efficient learners…

Computer Vision and Pattern Recognition · Computer Science 2022-10-19 Zhan Tong, Yibing Song, Jue Wang, Limin Wang

On the Value of Tokeniser Pretraining in Physics Foundation Models

We investigate the impact of tokeniser pretraining on the accuracy and efficiency of physics emulation. Modern high-resolution simulations produce vast volumes of data spanning diverse physical regimes and scales. Training foundation models…

Machine Learning · Computer Science 2026-03-13 Hadi Sotoudeh, Payel Mukhopadhyay, Ruben Ohana, Michael McCabe, Neil D. Lawrence, Shirley Ho, Miles Cranmer

On Effective Scheduling of Model-based Reinforcement Learning

Model-based reinforcement learning has attracted wide attention due to its superior sample efficiency. Despite its impressive success so far, it is still unclear how to appropriately schedule the important hyperparameters to achieve…

Machine Learning · Computer Science 2022-07-06 Hang Lai, Jian Shen, Weinan Zhang, Yimin Huang, Xing Zhang, Ruiming Tang, Yong Yu, Zhenguo Li

Rethinking Pre-Training in Tabular Data: A Neighborhood Embedding Perspective

Pre-training is prevalent in deep learning for vision and text data, leveraging knowledge from other datasets to enhance downstream tasks. However, for tabular data, the inherent heterogeneity in attribute and label spaces across datasets…

Machine Learning · Computer Science 2025-02-13 Han-Jia Ye, Qi-Le Zhou, Huai-Hong Yin, De-Chuan Zhan, Wei-Lun Chao

Practical tradeoffs between memory, compute, and performance in learned optimizers

Optimization plays a costly and crucial role in developing machine learning systems. In learned optimizers, the few hyperparameters of commonly used hand-designed optimizers, e.g. Adam or SGD, are replaced with flexible parametric…

Machine Learning · Computer Science 2022-07-19 Luke Metz, C. Daniel Freeman, James Harrison, Niru Maheswaranathan, Jascha Sohl-Dickstein

Better Language Models of Code through Self-Improvement

Pre-trained language models for code (PLMCs) have gained attention in recent research. These models are pre-trained on large-scale datasets using multi-modal objectives. However, fine-tuning them requires extensive supervision and is…

Computation and Language · Computer Science 2023-05-11 Hung Quoc To, Nghi D. Q. Bui, Jin Guo, Tien N. Nguyen