Convolutional Transformer-Based Image Compression

Image and Video Processing 2024-09-09 v1

Abstract

In this paper, we present a novel transformer-based architecture for end-to-end image compression. By integrating convolutional operations within the multi-head attention mechanism, the blocks of our architecture effectively capture local dependencies between tokens and eliminate the need for positional encoding. We demonstrate through experiments that the proposed framework surpasses state-of-the-art CNN-based architectures in terms of the rate-distortion trade-off, and that it achieves results comparable to transformer-based methods while maintaining lower computational complexity.
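
To make the idea of convolution inside attention concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes a single attention head, a 1D token sequence, and a depthwise convolution whose kernel is shared across channels; the function names `depthwise_conv1d` and `conv_attention` are hypothetical. Queries, keys, and values are formed by convolving over neighboring tokens rather than by a per-token linear projection, so each projection depends on token order even though no positional encoding is added.

```python
import math

def depthwise_conv1d(tokens, kernel):
    """Convolve along the token axis with zero padding.

    tokens: list of n tokens, each a list of d floats.
    kernel: list of k weights (k odd), shared by all channels for brevity.
    """
    n, d, half = len(tokens), len(tokens[0]), len(kernel) // 2
    out = []
    for i in range(n):
        row = [0.0] * d
        for j, w in enumerate(kernel):
            src = i + j - half
            if 0 <= src < n:  # positions outside the sequence count as zeros
                for c in range(d):
                    row[c] += w * tokens[src][c]
        out.append(row)
    return out

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def conv_attention(tokens, kernel_q, kernel_k, kernel_v):
    """Single-head scaled dot-product attention with convolutional projections."""
    q = depthwise_conv1d(tokens, kernel_q)
    k = depthwise_conv1d(tokens, kernel_k)
    v = depthwise_conv1d(tokens, kernel_v)
    d = len(tokens[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        weights = softmax(scores)
        out.append([sum(weights[j] * v[j][c] for j in range(len(v)))
                    for c in range(d)])
    return out

# Toy input: three 2-dimensional tokens, and the same tokens with the first two swapped.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
swapped = [tokens[1], tokens[0], tokens[2]]

identity = [0.0, 1.0, 0.0]   # kernel that reduces to plain, position-blind attention
local = [0.25, 0.5, 0.25]    # kernel that mixes each token with its neighbors

plain_a = conv_attention(tokens, identity, identity, identity)
plain_b = conv_attention(swapped, identity, identity, identity)
local_a = conv_attention(tokens, local, local, local)
local_b = conv_attention(swapped, local, local, local)
```

In this toy setting, the identity kernel gives outputs that are simply reordered when two input tokens are swapped, which is why plain attention normally needs positional encoding. With the neighborhood kernel the outputs change, because each projection now sees its local context.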

Cite

@article{arxiv.2409.04118,
  title  = {Convolutional Transformer-Based Image Compression},
  author = {Bouzid Arezki and Fangchen Feng and Anissa Mokraoui},
  journal= {arXiv preprint arXiv:2409.04118},
  year   = {2024}
}

Comments

Published in: IEEE Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA) 2023, Poznan, Poland
