Related papers: EfficientMorph: Parameter-Efficient Transformer-Ba…

RefineFormer3D: Efficient 3D Medical Image Segmentation via Adaptive Multi-Scale Transformer with Cross Attention Fusion

Accurate and computationally efficient 3D medical image segmentation remains a critical challenge in clinical workflows. Transformer-based architectures often demonstrate superior global contextual modeling but at the expense of excessive…

Image and Video Processing · Electrical Eng. & Systems 2026-02-19 Kavyansh Tyagi, Vishwas Rathi, Puneet Goyal

UTSRMorph: A Unified Transformer and Superresolution Network for Unsupervised Medical Image Registration

Complicated image registration is a key issue in medical image analysis, and deep learning-based methods have achieved better results than traditional methods. The methods include ConvNet-based and Transformer-based methods. Although…

Computer Vision and Pattern Recognition · Computer Science 2024-10-29 Runshi Zhang, Hao Mo, Junchen Wang, Bimeng Jie, Yang He, Nenghao Jin, Liang Zhu

SegFormer3D: an Efficient Transformer for 3D Medical Image Segmentation

The adoption of Vision Transformers (ViTs) based architectures represents a significant advancement in 3D Medical Image (MI) segmentation, surpassing traditional Convolutional Neural Network (CNN) models by enhancing global contextual…

Computer Vision and Pattern Recognition · Computer Science 2024-04-25 Shehan Perera, Pouyan Navard, Alper Yilmaz

Primus: Enforcing Attention Usage for 3D Medical Image Segmentation

Transformers have achieved remarkable success across multiple fields, yet their impact on 3D medical image segmentation remains limited with convolutional networks still dominating major benchmarks. In this work, (A) we analyze current…

Computer Vision and Pattern Recognition · Computer Science 2026-05-01 Tassilo Wald, Saikat Roy, Fabian Isensee, Constantin Ulrich, Sebastian Ziegler, Dasha Trofimova, Raphael Stock, Michael Baumgartner, Gregor Köhler, Klaus Maier-Hein

WaveFormer: A 3D Transformer with Wavelet-Driven Feature Representation for Efficient Medical Image Segmentation

Transformer-based architectures have advanced medical image analysis by effectively modeling long-range dependencies, yet they often struggle in 3D settings due to substantial memory overhead and insufficient capture of fine-grained local…

Computer Vision and Pattern Recognition · Computer Science 2025-04-02 Md Mahfuz Al Hasan, Mahdi Zaman, Abdul Jawad, Alberto Santamaria-Pang, Ho Hin Lee, Ivan Tarapov, Kyle See, Md Shah Imran, Antika Roy, Yaser Pourmohammadi Fallah, Navid Asadizanjani, Reza Forghani

TransMorph: Transformer for unsupervised medical image registration

In the last decade, convolutional neural networks (ConvNets) have been a major focus of research in medical image analysis. However, the performances of ConvNets may be limited by a lack of explicit consideration of the long-range spatial…

Image and Video Processing · Electrical Eng. & Systems 2022-10-18 Junyu Chen, Eric C. Frey, Yufan He, William P. Segars, Ye Li, Yong Du

A Data-scalable Transformer for Medical Image Segmentation: Architecture, Model Efficiency, and Benchmark

Transformers have demonstrated remarkable performance in natural language processing and computer vision. However, existing vision Transformers struggle to learn from limited medical data and are unable to generalize on diverse medical…

Image and Video Processing · Electrical Eng. & Systems 2023-04-06 Yunhe Gao, Mu Zhou, Di Liu, Zhennan Yan, Shaoting Zhang, Dimitris N. Metaxas

NestedMorph: Enhancing Deformable Medical Image Registration with Nested Attention Mechanisms

Deformable image registration is crucial for aligning medical images in a nonlinear fashion across different modalities, allowing for precise spatial correspondence between varying anatomical structures. This paper presents NestedMorph, a…

Image and Video Processing · Electrical Eng. & Systems 2024-12-11 Gurucharan Marthi Krishna Kumar, Janine Mendola, Amir Shmuel

Dynamic Linear Transformer for 3D Biomedical Image Segmentation

Transformer-based neural networks have shown promising performance on many biomedical image segmentation tasks due to better global information modeling from the self-attention mechanism. However, most methods are still designed for…

Computer Vision and Pattern Recognition · Computer Science 2023-02-02 Zheyuan Zhang, Ulas Bagci

SegResMamba: An Efficient Architecture for 3D Medical Image Segmentation

The Transformer architecture has opened a new paradigm in the domain of deep learning with its ability to model long-range dependencies and capture global context and has outpaced the traditional Convolution Neural Networks (CNNs) in many…

Computer Vision and Pattern Recognition · Computer Science 2025-03-12 Badhan Kumar Das, Ajay Singh, Saahil Islam, Gengyan Zhao, Andreas Maier

DualFormer: Local-Global Stratified Transformer for Efficient Video Recognition

While transformers have shown great potential on video recognition with their strong capability of capturing long-range dependencies, they often suffer high computational costs induced by the self-attention to the huge number of 3D tokens.…

Computer Vision and Pattern Recognition · Computer Science 2022-11-23 Yuxuan Liang, Pan Zhou, Roger Zimmermann, Shuicheng Yan

Unsupervised Echocardiography Registration through Patch-based MLPs and Transformers

Image registration is an essential but challenging task in medical image computing, especially for echocardiography, where the anatomical structures are relatively noisy compared to other imaging modalities. Traditional (non-learning)…

Computer Vision and Pattern Recognition · Computer Science 2022-11-22 Zihao Wang, Yingyu Yang, Maxime Sermesant, Herve Delingette

Compute-Efficient Medical Image Classification with Softmax-Free Transformers and Sequence Normalization

The Transformer model has been pivotal in advancing fields such as natural language processing, speech recognition, and computer vision. However, a critical limitation of this model is its quadratic computational and memory complexity…

Computer Vision and Pattern Recognition · Computer Science 2024-06-04 Firas Khader, Omar S. M. El Nahhas, Tianyu Han, Gustav Müller-Franzes, Sven Nebelung, Jakob Nikolas Kather, Daniel Truhn

Token-UNet: A New Case for Transformers Integration in Efficient and Interpretable 3D UNets for Brain Imaging Segmentation

We present Token-UNet, adopting the TokenLearner and TokenFuser modules to encase Transformers into UNets. While Transformers have enabled global interactions among input elements in medical imaging, current computational challenges hinder…

Computer Vision and Pattern Recognition · Computer Science 2026-02-24 Louis Fabrice Tshimanga, Andrea Zanola, Federico Del Pup, Manfredo Atzori

A lightweight residual network for unsupervised deformable image registration

Accurate volumetric image registration is highly relevant for clinical routines and computer-aided medical diagnosis. Recently, researchers have begun to use transformers in learning-based methods for medical image registration, and have…

Computer Vision and Pattern Recognition · Computer Science 2024-06-17 Ahsan Raza Siyal, Astrid Ellen Grams, Markus Haltmeier

TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters

Transformers have become the predominant architecture in foundation models due to their excellent performance across various domains. However, the substantial cost of scaling these models remains a significant concern. This problem arises…

Machine Learning · Computer Science 2025-03-25 Haiyang Wang, Yue Fan, Muhammad Ferjad Naeem, Yongqin Xian, Jan Eric Lenssen, Liwei Wang, Federico Tombari, Bernt Schiele

TCSAFormer: Efficient Vision Transformer with Token Compression and Sparse Attention for Medical Image Segmentation

In recent years, transformer-based methods have achieved remarkable progress in medical image segmentation due to their superior ability to capture long-range dependencies. However, these methods typically suffer from two major limitations.…

Computer Vision and Pattern Recognition · Computer Science 2025-08-07 Zunhui Xia, Hongxing Li, Libin Lan

Video Mobile-Former: Video Recognition with Efficient Global Spatial-temporal Modeling

Transformer-based models have achieved top performance on major video recognition benchmarks. Benefiting from the self-attention mechanism, these models show stronger ability of modeling long-range dependencies compared to CNN-based models.…

Computer Vision and Pattern Recognition · Computer Science 2022-08-26 Rui Wang, Zuxuan Wu, Dongdong Chen, Yinpeng Chen, Xiyang Dai, Mengchen Liu, Luowei Zhou, Lu Yuan, Yu-Gang Jiang

ATTENTION2D: Communication Efficient Distributed Self-Attention Mechanism

Transformer-based models have emerged as a leading architecture for natural language processing, natural language generation, and image generation tasks. A fundamental element of the transformer architecture is self-attention, which allows…

Machine Learning · Computer Science 2025-07-01 Venmugil Elango

Multi-Objective Dual Simplex-Mesh Based Deformable Image Registration for 3D Medical Images -- Proof of Concept

Reliably and physically accurately transferring information between images through deformable image registration with large anatomical differences is an open challenge in medical image analysis. Most existing methods have two key…

Computer Vision and Pattern Recognition · Computer Science 2023-03-13 Georgios Andreadis, Peter A. N. Bosman, Tanja Alderliesten