Related papers: Adaptive Computation with Elastic Input Sequence

AdaComp: Adaptive Residual Gradient Compression for Data-Parallel Distributed Training

Highly distributed training of Deep Neural Networks (DNNs) on future compute platforms (offering 100s of TeraOps/s of computational capacity) is expected to be severely communication constrained. To overcome this limitation, new gradient…

Machine Learning · Computer Science 2017-12-08 Chia-Yu Chen, Jungwook Choi, Daniel Brand, Ankur Agrawal, Wei Zhang, Kailash Gopalakrishnan

Adaptive Computation Time for Recurrent Neural Networks

This paper introduces Adaptive Computation Time (ACT), an algorithm that allows recurrent neural networks to learn how many computational steps to take between receiving an input and emitting an output. ACT requires minimal changes to the…

Neural and Evolutionary Computing · Computer Science 2017-02-22 Alex Graves

Differentiable Adaptive Computation Time for Visual Reasoning

This paper presents a novel attention-based algorithm for achieving adaptive computation called DACT, which, unlike existing ones, is end-to-end differentiable. Our method can be used in conjunction with many networks; in particular, we…

Artificial Intelligence · Computer Science 2020-05-25 Cristobal Eyzaguirre, Alvaro Soto

Learning Adaptive and Temporally Causal Video Tokenization in a 1D Latent Space

We propose AdapTok, an adaptive temporal causal video tokenizer that can flexibly allocate tokens for different frames based on video content. AdapTok is equipped with a block-wise masking strategy that randomly drops tail tokens of each…

Computer Vision and Pattern Recognition · Computer Science 2025-10-15 Yan Li, Changyao Tian, Renqiu Xia, Ning Liao, Weiwei Guo, Junchi Yan, Hongsheng Li, Jifeng Dai, Hao Li, Xue Yang

AdaCap: An Adaptive Contrastive Approach for Small-Data Neural Networks

Neural networks struggle on small tabular datasets, where tree-based models remain dominant. We introduce Adaptive Contrastive Approach (AdaCap), a training scheme that combines a permutation-based contrastive loss with a Tikhonov-based…

Machine Learning · Computer Science 2025-11-26 Bruno Belucci, Karim Lounici, Katia Meziani

Understanding Dynamic Compute Allocation in Recurrent Transformers

Token-level adaptive computation seeks to reduce inference cost by allocating more computation to harder tokens and less to easier ones. However, prior work is primarily evaluated on natural-language benchmarks using task-level metrics,…

Computation and Language · Computer Science 2026-02-10 Ibraheem Muhammad Moosa, Suhas Lohit, Ye Wang, Moitreya Chatterjee, Wenpeng Yin

Learning to Adaptively Scale Recurrent Neural Networks

Recent advancements in recurrent neural network (RNN) research have demonstrated the superiority of utilizing multiscale structures in learning temporal representations of time series. Currently, most multiscale RNNs use fixed scales,…

Machine Learning · Computer Science 2019-02-18 Hao Hu, Liqiang Wang, Guo-Jun Qi

AdaTIR: Adaptive Tool-Integrated Reasoning via Difficulty-Aware Policy Optimization

Tool-Integrated Reasoning (TIR) has significantly enhanced the capabilities of Large Language Models (LLMs), yet current agents tend to exhibit cognitive offloading, redundantly invoking external tools even for simple tasks. In this paper,…

Computation and Language · Computer Science 2026-01-22 Zhaiyu Fang, Ruipeng Sun

Layer Flexible Adaptive Computational Time

Deep recurrent neural networks perform well on sequence data and are the model of choice. However, it is a daunting task to decide the structure of the networks, i.e. the number of layers, especially considering different computational…

Machine Learning · Computer Science 2021-01-05 Lida Zhang, Abdolghani Ebrahimi, Diego Klabjan

RAP: Retrieval-Augmented Planner for Adaptive Procedure Planning in Instructional Videos

Procedure Planning in instructional videos entails generating a sequence of action steps based on visual observations of the initial and target states. Despite the rapid progress in this task, there remain several critical challenges to be…

Computer Vision and Pattern Recognition · Computer Science 2024-09-26 Ali Zare, Yulei Niu, Hammad Ayyubi, Shih-fu Chang

Adaptive Computation Modules: Granular Conditional Computation For Efficient Inference

While transformer models have been highly successful, they are computationally inefficient. We observe that for each layer, the full width of the layer may be needed only for a small subset of tokens inside a batch and that the "effective"…

Machine Learning · Computer Science 2024-12-19 Bartosz Wójcik, Alessio Devoto, Karol Pustelnik, Pasquale Minervini, Simone Scardapane

Adaptive Speech Emotion Representation Learning Based On Dynamic Graph

Graph representation learning has become a hot research topic due to its powerful nonlinear fitting capability in extracting representative node embeddings. However, for sequential data such as speech signals, most traditional methods…

Sound · Computer Science 2024-05-08 Yingxue Gao, Huan Zhao, Zixing Zhang

Training Adaptive Computation for Open-Domain Question Answering with Computational Constraints

Adaptive Computation (AC) has been shown to be effective in improving the efficiency of Open-Domain Question Answering (ODQA) systems. However, current AC approaches require tuning of all model parameters, and training state-of-the-art ODQA…

Computation and Language · Computer Science 2021-07-06 Yuxiang Wu, Pasquale Minervini, Pontus Stenetorp, Sebastian Riedel

AdaScale: Dynamic Context-aware DNN Scaling via Automated Adaptation Loop on Mobile Devices

Deep learning is reshaping mobile applications, with a growing trend of deploying deep neural networks (DNNs) directly to mobile and embedded devices to address real-time performance and privacy. To accommodate local resource limitations,…

Artificial Intelligence · Computer Science 2024-12-03 Yuzhan Wang, Sicong Liu, Bin Guo, Boqi Zhang, Ke Ma, Yasan Ding, Hao Luo, Yao Li, Zhiwen Yu

Deep Human Parsing with Active Template Regression

In this work, the human parsing task, namely decomposing a human image into semantic fashion/body regions, is formulated as an Active Template Regression (ATR) problem, where the normalized mask of each fashion/body item is expressed as the…

Computer Vision and Pattern Recognition · Computer Science 2016-11-15 Xiaodan Liang, Si Liu, Xiaohui Shen, Jianchao Yang, Luoqi Liu, Jian Dong, Liang Lin, Shuicheng Yan

Recurrent Attentive Neural Process for Sequential Data

Neural processes (NPs) learn stochastic processes and predict the distribution of target output adaptively conditioned on a context set of observed input-output pairs. Furthermore, Attentive Neural Process (ANP) improved the prediction…

Machine Learning · Computer Science 2019-10-22 Shenghao Qin, Jiacheng Zhu, Jimmy Qin, Wenshuo Wang, Ding Zhao

AdapLeR: Speeding up Inference by Adaptive Length Reduction

Pre-trained language models have shown stellar performance in various downstream tasks. But, this usually comes at the cost of high latency and computation, hindering their usage in resource-limited settings. In this work, we propose a…

Computation and Language · Computer Science 2022-03-18 Ali Modarressi, Hosein Mohebbi, Mohammad Taher Pilehvar

Adaptive Neural Compilation

This paper proposes an adaptive neural-compilation framework to address the problem of efficient program learning. Traditional code optimisation strategies used in compilers are based on applying a pre-specified set of transformations that…

Artificial Intelligence · Computer Science 2016-05-27 Rudy Bunel, Alban Desmaison, Pushmeet Kohli, Philip H. S. Torr, M. Pawan Kumar

An Adaptive Method Stabilizing Activations for Enhanced Generalization

We introduce AdaAct, a novel optimization algorithm that adjusts learning rates according to activation variance. Our method enhances the stability of neuron outputs by incorporating neuron-wise adaptivity during the training process, which…

Machine Learning · Computer Science 2025-06-11 Hyunseok Seung, Jaewoo Lee, Hyunsuk Ko

TAPS: Frustratingly Simple Test Time Active Learning for VLMs

Test-Time Optimization enables models to adapt to new data during inference by updating parameters on-the-fly. Recent advances in Vision-Language Models (VLMs) have explored learning prompts at test time to improve performance in downstream…

Computer Vision and Pattern Recognition · Computer Science 2025-07-29 Dhruv Sarkar, Aprameyo Chakrabartty, Bibhudatta Bhanja