Convolutional Transformer-Based Image Compression

Bouzid Arezki; Fangchen Feng; Anissa Mokraoui

doi:10.23919/SPA59660.2023.10274433

Image and Video Processing 2024-09-09 v1

Abstract

In this paper, we present a novel transformer-based architecture for end-to-end image compression. Our architecture incorporates blocks that effectively capture local dependencies between tokens, eliminating the need for positional encoding by integrating convolutional operations within the multi-head attention mechanism. We demonstrate through experiments that our proposed framework surpasses state-of-the-art CNN-based architectures in terms of the trade-off between bit-rate and distortion and achieves comparable results to transformer-based methods while maintaining lower computational complexity.
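
The abstract does not spell out the block design, so the following is only a minimal sketch of one common way to place convolutions inside multi-head attention: the query, key, and value projections are computed with a depthwise 3x3 convolution over the 2D token map instead of a per-token linear layer. The zero-padded convolution mixes each token with its spatial neighbors, which is one plausible reason no separate positional encoding is needed. The function names (`depthwise_conv3x3`, `conv_attention`), the 3x3 kernel size, and the depthwise choice are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def depthwise_conv3x3(x, kernel):
    """Depthwise 3x3 convolution with zero padding.

    x: (H, W, C) float token map; kernel: (3, 3, C), one filter per channel.
    """
    H, W, C = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(x)
    # Accumulate the nine shifted copies of the input, each weighted per channel.
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + H, dx:dx + W, :] * kernel[dy, dx, :]
    return out

def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def conv_attention(x, kq, kk, kv, num_heads):
    """Multi-head self-attention with convolutional Q/K/V projections.

    x: (H, W, C) token map; kq, kk, kv: (3, 3, C) depthwise kernels
    (hypothetical parameters). C must be divisible by num_heads.
    """
    H, W, C = x.shape
    d = C // num_heads

    def project(kernel):
        # Convolve on the 2D grid, then flatten to N = H * W tokens per head.
        t = depthwise_conv3x3(x, kernel).reshape(H * W, num_heads, d)
        return t.transpose(1, 0, 2)  # (heads, N, d)

    q, k, v = project(kq), project(kk), project(kv)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)  # (heads, N, N)
    attn = softmax(scores, axis=-1)
    out = attn @ v  # (heads, N, d)
    # Merge heads and restore the spatial layout.
    return out.transpose(1, 0, 2).reshape(H, W, C)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tokens = rng.standard_normal((4, 5, 8))
    kernels = [rng.standard_normal((3, 3, 8)) for _ in range(3)]
    print(conv_attention(tokens, *kernels, num_heads=2).shape)
```

Note that this sketch attends globally over all H * W tokens and omits the output projection, normalization, and any windowing that a practical compression model would likely use.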
