English
Related papers

Related papers: Multi-Modality Deep Network for Extreme Learned Im…

200 papers

The explosion of data has resulted in more and more associated text being transmitted along with images. Inspired by from distributed source coding, many works utilize image side information to enhance image compression. However, existing…

Computer Vision and Pattern Recognition · Computer Science 2023-11-29 Shiyu Qin , Bin Chen , Yujun Huang , Baoyi An , Tao Dai , Shu-Tao Xia

In recent years, many convolutional neural network-based models are designed for JPEG artifacts reduction, and have achieved notable progress. However, few methods are suitable for extreme low-bitrate image compression artifacts reduction.…

Computer Vision and Pattern Recognition · Computer Science 2023-05-05 Xuhao Jiang , Weimin Tan , Qing Lin , Chenxi Ma , Bo Yan , Liquan Shen

Cross-modal retrieval between visual data and natural language description remains a long-standing challenge in multimedia. While recent image-text retrieval methods offer great promise by learning deep representations aligned across…

Recent advances in text-guided image compression have shown great potential to enhance the perceptual quality of reconstructed images. These methods, however, tend to have significantly degraded pixel-wise fidelity, limiting their…

Computer Vision and Pattern Recognition · Computer Science 2024-05-24 Hagyeong Lee , Minkyu Kim , Jun-Hyuk Kim , Seungeon Kim , Dokwan Oh , Jaeho Lee

Learned image compression methods generally optimize a rate-distortion loss, trading off improvements in visual distortion for added bitrate. Increasingly, however, compressed imagery is used as an input to deep learning networks for…

Image and Video Processing · Electrical Eng. & Systems 2022-02-02 Maxime Kawawa-Beaudan , Ryan Roggenkemper , Avideh Zakhor

Transferring large amount of high resolution images over limited bandwidth is an important but very challenging task. Compressing images using extremely low bitrates (<0.1 bpp) has been studied but it often results in low quality images of…

Image and Video Processing · Electrical Eng. & Systems 2022-11-16 Zhihong Pan , Xin Zhou , Hao Tian

Multi-modal approaches employ data from multiple input streams such as textual and visual domains. Deep neural networks have been successfully employed for these approaches. In this paper, we present a novel multi-modal approach that fuses…

Computer Vision and Pattern Recognition · Computer Science 2018-10-05 Ignazio Gallo , Alessandro Calefati , Shah Nawaz , Muhammad Kamran Janjua

Learning-based image compression methods have emerged as state-of-the-art, showcasing higher performance compared to conventional compression solutions. These data-driven approaches aim to learn the parameters of a neural network model…

Multimedia · Computer Science 2024-03-20 Shima Mohammadi , Yaojun Wu , João Ascenso

Supported by powerful generative models, low-bitrate learned image compression (LIC) models utilizing perceptual metrics have become feasible. Some of the most advanced models achieve high compression rates and superior perceptual quality…

Image and Video Processing · Electrical Eng. & Systems 2024-11-21 Shimon Murai , Heming Sun , Jiro Katto

Handling various objects with different colors is a significant challenge for image colorization techniques. Thus, for complex real-world scenes, the existing image colorization algorithms often fail to maintain color consistency. In this…

Computer Vision and Pattern Recognition · Computer Science 2023-04-26 Subhankar Ghosh , Saumik Bhattacharya , Prasun Roy , Umapada Pal , Michael Blumenstein

Image-text retrieval is a central problem for understanding the semantic relationship between vision and language, and serves as the basis for various visual and language tasks. Most previous works either simply learn coarse-grained…

Computer Vision and Pattern Recognition · Computer Science 2023-07-19 Chong Liu , Yuqi Zhang , Hongsong Wang , Weihua Chen , Fan Wang , Yan Huang , Yi-Dong Shen , Liang Wang

Multi-modal visual understanding of images with prompts involves using various visual and textual cues to enhance the semantic understanding of images. This approach combines both vision and language processing to generate more accurate…

Computer Vision and Pattern Recognition · Computer Science 2023-05-17 Yuzhou Peng

Advancements in text-to-image generative AI with large multimodal models are spreading into the field of image compression, creating high-quality representation of images at extremely low bit rates. This work introduces novel components to…

Image and Video Processing · Electrical Eng. & Systems 2025-06-02 Cheng-Lin Wu , Hyomin Choi , Ivan V. Bajić

Fine-grained text-to-image retrieval aims to retrieve a fine-grained target image with a given text query. Existing methods typically assume that each training image is accurately depicted by its textual descriptions. However, textual…

Computer Vision and Pattern Recognition · Computer Science 2025-04-11 Zehong Ma , Hao Chen , Wei Zeng , Limin Su , Shiliang Zhang

Self-Supervised learning from multimodal image and text data allows deep neural networks to learn powerful features with no need of human annotated data. Web and Social Media platforms provide a virtually unlimited amount of this multimodal…

Computer Vision and Pattern Recognition · Computer Science 2019-01-09 Raul Gomez , Lluis Gomez , Jaume Gibert , Dimosthenis Karatzas

Recently, more and more images are compressed and sent to the back-end devices for the machine analysis tasks~(\textit{e.g.,} object detection) instead of being purely watched by humans. However, most traditional or learned image codecs are…

Image and Video Processing · Electrical Eng. & Systems 2022-06-14 Guo Lu , Xingtong Ge , Tianxiong Zhong , Jing Geng , Qiang Hu

Learned image compression sits at the intersection of machine learning and image processing. With advances in deep learning, neural network-based compression methods have emerged. In this process, an encoder maps the image to a…

Computer Vision and Pattern Recognition · Computer Science 2025-09-15 Fabien Allemand , Attilio Fiandrotti , Sumanta Chaudhuri , Alaa Eddine Mazouz

With the advancement of large-scale language modeling techniques, large multimodal models combining visual encoders with large language models have demonstrated exceptional performance in various visual tasks. Most of the current…

Computer Vision and Pattern Recognition · Computer Science 2024-12-20 Yi Chen , Jian Xu , Xu-Yao Zhang , Wen-Zhuo Liu , Yang-Yang Liu , Cheng-Lin Liu

We consider the problem of composed image retrieval that takes an input query consisting of an image and a modification text indicating the desired changes to be made on the image and retrieves images that match these changes. Current…

Computer Vision and Pattern Recognition · Computer Science 2023-09-01 Prateksha Udhayanan , Srikrishna Karanam , Balaji Vasan Srinivasan

In recent years, image compression for high-level vision tasks has attracted considerable attention from researchers. Given that object information in images plays a far more crucial role in downstream tasks than background information,…

Computer Vision and Pattern Recognition · Computer Science 2025-04-18 Chengjie Dai , Tiantian Song , Hui Tang , Fangdong Chen , Bowei Yang , Guanghua Song
‹ Prev 1 2 3 10 Next ›