Related papers: Vision Tiny Recursion Model (ViTRM): Parameter-Eff…

200 papers

Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) are two dominant models for image analysis. While CNNs excel at extracting multi-scale features and ViTs effectively capture global dependencies, both suffer from high…

Computer Vision and Pattern Recognition · Computer Science 2024-12-30 Shicheng Yin, Kaixuan Yin, Weixing Chen, Enbo Huang, Yang Liu

Recursive architectures such as Tiny Recursive Models (TRMs) perform implicit reasoning through iterative latent computation, yet the geometric structure of these reasoning trajectories remains poorly understood. We investigate the…

Machine Learning · Computer Science 2026-04-21 Ege Çakar, Ketan Ali Raghu, Lia Zheng

Hierarchical Reasoning Model (HRM) is a novel approach using two small neural networks recursing at different frequencies. This biologically inspired method beats large language models (LLMs) on hard puzzle tasks such as Sudoku, Maze, and…

Machine Learning · Computer Science 2025-10-07 Alexia Jolicoeur-Martineau

Although convolutional neural networks (CNNs) showed remarkable results in many vision tasks, they are still strained by simple yet challenging visual reasoning problems. Inspired by the recent success of the Transformer network in computer…

Computer Vision and Pattern Recognition · Computer Science 2021-11-30 Nicola Messina, Giuseppe Amato, Fabio Carrara, Claudio Gennaro, Fabrizio Falchi

The computational overhead of Vision Transformers in practice stems fundamentally from their deep architectures, yet existing acceleration strategies have primarily targeted algorithmic-level optimizations such as token pruning and…

Computer Vision and Pattern Recognition · Computer Science 2025-11-26 Chengwei Zhou, Vipin Chaudhary, Gourav Datta

Neural network controllers increasingly demand millions of parameters, and language model approaches push into the billions. For embedded aerospace systems with strict power and latency constraints, this scaling is prohibitive. We present…

Machine Learning · Computer Science 2025-12-19 Amit Jain, Richard Linares

Tiny Recursive Models (TRM) were proposed as a parameter-efficient alternative to large language models for solving Abstraction and Reasoning Corpus (ARC) style tasks. The original work reports strong performance and suggests that recursive…

Machine Learning · Computer Science 2026-01-12 Antonio Roye-Azar, Santiago Vargas-Naranjo, Dhruv Ghai, Nithin Balamurugan, Rayan Amir

Overparameterization is a common technique in deep learning to help models learn and generalize sufficiently to the given task; nonetheless, this often leads to enormous network structures and consumes considerable computing resources…

Computer Vision and Pattern Recognition · Computer Science 2022-04-26 Yuanchu Liang, Saeed Anwar, Yang Liu

Vision transformers (ViTs) and large-scale convolutional neural networks (CNNs) have reshaped computer vision through pretrained feature representations that enable strong transfer learning for diverse tasks. However, their efficiency as…

Computer Vision and Pattern Recognition · Computer Science 2025-10-07 Alon Kaya, Igal Bilik, Inna Stainvas

Tiny Recursive Models (TRM) solve complex reasoning tasks with a fraction of the parameters of modern large language models (LLMs) by iteratively refining a latent state and final answer. While powerful, their deterministic recursion can…

Artificial Intelligence · Computer Science 2026-05-20 Amin Sghaier, Ali Parviz, Alexia Jolicoeur-Martineau

Convolutional neural networks (CNNs) and Vision Transformers (ViTs) have achieved excellent performance in image restoration. While ViTs generally outperform CNNs by effectively capturing long-range dependencies and input-specific…

Computer Vision and Pattern Recognition · Computer Science 2025-06-16 Lingshun Kong, Jiangxin Dong, Jinhui Tang, Ming-Hsuan Yang, Jinshan Pan

The most recent year has witnessed the success of applying the Vision Transformer (ViT) for image classification. However, there is still evidence indicating that ViT often suffers from the following two aspects: i) the high computation and the…

Computer Vision and Pattern Recognition · Computer Science 2021-12-30 Xian Wei, Bin Wang, Mingsong Chen, Ji Yuan, Hai Lan, Jiehuang Shi, Xuan Tang, Bo Jin, Guozhang Chen, Dongping Yang

We present Large Inverse Rendering Model (LIRM), a transformer architecture that jointly reconstructs high-quality shape, materials, and radiance fields with view-dependent effects in less than a second. Our model builds upon the recent…

Computer Vision and Pattern Recognition · Computer Science 2025-04-29 Zhengqin Li, Dilin Wang, Ka Chen, Zhaoyang Lv, Thu Nguyen-Phuoc, Milim Lee, Jia-Bin Huang, Lei Xiao, Cheng Zhang, Yufeng Zhu, Carl S. Marshall, Yufeng Ren, Richard Newcombe, Zhao Dong

Tiny Recursive Models (TRMs) have recently demonstrated remarkable performance on ARC-AGI, showing that very small models can compete against large foundation models through a two-step refinement mechanism that updates an internal reasoning…

Machine Learning · Computer Science 2026-03-10 Paulius Rauba, Claudio Fanconi, Mihaela van der Schaar

In recent computer vision research, the advent of the Vision Transformer (ViT) has rapidly revolutionized various architectural design efforts: ViT achieved state-of-the-art image classification performance using self-attention found in…

Computer Vision and Pattern Recognition · Computer Science 2023-01-13 Yuki Tatsunami, Masato Taki

This study evaluates the trade-offs between convolutional and transformer-based architectures on both medical and general-purpose image classification benchmarks. We use ResNet-18 as our baseline and introduce a fine-tuning strategy applied…

Computer Vision and Pattern Recognition · Computer Science 2026-02-16 Aidar Amangeldi, Angsar Taigonyrov, Muhammad Huzaifa Jawad, Chinedu Emmanuel Mbonu

Neural reasoners such as Tiny Recursive Models (TRMs) solve complex problems by combining neural backbones with specialized inference schemes. Such inference schemes have been a central component of stochastic reasoning systems, where…

Machine Learning · Computer Science 2026-03-06 Mieszko Komisarczyk, Saurabh Mathur, Maurice Kraus, Sriraam Natarajan, Kristian Kersting

Vision Transformers (ViTs) have demonstrated remarkable success on large-scale datasets, but their performance on smaller datasets often falls short of convolutional neural networks (CNNs). This paper explores the design and optimization of…

Machine Learning · Computer Science 2025-01-14 Gent Wu

Vision transformers have recently made a breakthrough in computer vision showing excellent performance in terms of precision for numerous applications. However, their computational cost is very high compared to alternative approaches such…

Computer Vision and Pattern Recognition · Computer Science 2026-03-02 Martial Guidez, Stefan Duffner, Christophe Garcia

This paper proposes a working recipe of using Vision Transformer (ViT) in class incremental learning. Although this recipe only combines existing techniques, developing the combination is not trivial. Firstly, naive application of ViT to…

Computer Vision and Pattern Recognition · Computer Science 2022-04-19 Pei Yu, Yinpeng Chen, Ying Jin, Zicheng Liu