Related papers: Object-Based Image Coding: A Learning-Driven Revis…

Accuracy Improvement of Object Detection in VVC Coded Video Using YOLO-v7 Features

With advances in image recognition technology based on deep learning, automatic video analysis by Artificial Intelligence is becoming more widespread. As the amount of video used for image recognition increases, efficient compression…

Computer Vision and Pattern Recognition · Computer Science 2023-04-04 Takahiro Shindo, Taiju Watanabe, Kein Yamada, Hiroshi Watanabe

Data-Efficient Image Recognition with Contrastive Predictive Coding

Human observers can learn to recognize new categories of images from a handful of examples, yet doing so with artificial ones remains an open challenge. We hypothesize that data-efficient recognition is enabled by representations which make…

Computer Vision and Pattern Recognition · Computer Science 2020-07-02 Olivier J. Hénaff, Aravind Srinivas, Jeffrey De Fauw, Ali Razavi, Carl Doersch, S. M. Ali Eslami, Aaron van den Oord

Learning a Virtual Codec Based on Deep Convolutional Neural Network to Compress Image

Although deep convolutional neural network has been proved to efficiently eliminate coding artifacts caused by the coarse quantization of traditional codec, it's difficult to train any neural network in front of the encoder for gradient's…

Computer Vision and Pattern Recognition · Computer Science 2018-01-17 Lijun Zhao, Huihui Bai, Anhong Wang, Yao Zhao

PO-ELIC: Perception-Oriented Efficient Learned Image Coding

In the past years, learned image compression (LIC) has achieved remarkable performance. The recent LIC methods outperform VVC in both PSNR and MS-SSIM. However, the low bit-rate reconstructions of LIC suffer from artifacts such as blurring,…

Image and Video Processing · Electrical Eng. & Systems 2022-05-31 Dailan He, Ziming Yang, Hongjiu Yu, Tongda Xu, Jixiang Luo, Yuan Chen, Chenjian Gao, Xinjie Shi, Hongwei Qin, Yan Wang

High efficiency compression for object detection

Image and video compression has traditionally been tailored to human vision. However, modern applications such as visual analytics and surveillance rely on computers seeing and analyzing the images before (or instead of) humans. For these…

Image and Video Processing · Electrical Eng. & Systems 2018-02-19 Hyomin Choi, Ivan V. Bajic

VVC Extension Scheme for Object Detection Using Contrast Reduction

In recent years, video analysis using Artificial Intelligence (AI) has been widely used, due to the remarkable development of image recognition technology using deep learning. In 2019, the Moving Picture Experts Group (MPEG) has started…

Computer Vision and Pattern Recognition · Computer Science 2023-05-31 Takahiro Shindo, Taiju Watanabe, Kein Yamada, Hiroshi Watanabe

Fast Inference in Sparse Coding Algorithms with Applications to Object Recognition

Adaptive sparse coding methods learn a possibly overcomplete set of basis functions, such that natural image patches can be reconstructed by linearly combining a small subset of these bases. The applicability of these methods to visual…

Computer Vision and Pattern Recognition · Computer Science 2010-10-19 Koray Kavukcuoglu, Marc'Aurelio Ranzato, Yann LeCun

Image Coding for Machines with Object Region Learning

Compression technology is essential for efficient image transmission and storage. With the rapid advances in deep learning, images are beginning to be used for image recognition as well as for human vision. For this reason, research has…

Computer Vision and Pattern Recognition · Computer Science 2023-08-29 Takahiro Shindo, Taiju Watanabe, Kein Yamada, Hiroshi Watanabe

Object-wise Masked Autoencoders for Fast Pre-training

Self-supervised pre-training for images without labels has recently achieved promising performance in image classification. The success of transformer-based methods, ViT and MAE, draws the community's attention to the design of backbone…

Computer Vision and Pattern Recognition · Computer Science 2022-05-31 Jiantao Wu, Shentong Mo

Extremely low-bitrate Image Compression Semantically Disentangled by LMMs from a Human Perception Perspective

It remains a significant challenge to compress images at extremely low bitrate while achieving both semantic consistency and high perceptual quality. Inspired by human progressive perception mechanism, we propose a Semantically Disentangled…

Computer Vision and Pattern Recognition · Computer Science 2025-10-15 Juan Song, Lijie Yang, Mingtao Feng

Are We Done with Object-Centric Learning?

Object-centric learning (OCL) seeks to learn representations that only encode an object, isolated from other objects or background cues in a scene. This approach underpins various aims, including out-of-distribution (OOD) generalization,…

Computer Vision and Pattern Recognition · Computer Science 2025-04-14 Alexander Rubinstein, Ameya Prabhu, Matthias Bethge, Seong Joon Oh

From Image-level to Pixel-level Labeling with Convolutional Networks

We are interested in inferring object segmentation by leveraging only object class information, and by considering only minimal priors on the object segmentation task. This problem could be viewed as a kind of weakly supervised segmentation…

Computer Vision and Pattern Recognition · Computer Science 2015-04-27 Pedro O. Pinheiro, Ronan Collobert

Pixel Objectness: Learning to Segment Generic Objects Automatically in Images and Videos

We propose an end-to-end learning framework for segmenting generic objects in both images and videos. Given a novel image or video, our approach produces a pixel-level mask for all "object-like" regions, even for object categories never…

Computer Vision and Pattern Recognition · Computer Science 2018-12-19 Bo Xiong, Suyog Dutt Jain, Kristen Grauman

Optical Context Compression Is Just (Bad) Autoencoding

DeepSeek-OCR shows that rendered text can be reconstructed from a small number of vision tokens, sparking excitement about using vision as a compression medium for long textual contexts. But this pipeline requires rendering token embeddings…

Computer Vision and Pattern Recognition · Computer Science 2026-04-07 Ivan Yee Lee, Cheng Yang, Taylor Berg-Kirkpatrick

Efficient Masked Image Compression with Position-Indexed Self-Attention

In recent years, image compression for high-level vision tasks has attracted considerable attention from researchers. Given that object information in images plays a far more crucial role in downstream tasks than background information,…

Computer Vision and Pattern Recognition · Computer Science 2025-04-18 Chengjie Dai, Tiantian Song, Hui Tang, Fangdong Chen, Bowei Yang, Guanghua Song

Weakly-Supervised Semantic Segmentation by Iteratively Mining Common Object Features

Weakly-supervised semantic segmentation under image tags supervision is a challenging task as it directly associates high-level semantic to low-level appearance. To bridge this gap, in this paper, we propose an iterative bottom-up and…

Computer Vision and Pattern Recognition · Computer Science 2018-06-13 Xiang Wang, Shaodi You, Xi Li, Huimin Ma

Predictive Coding For Animation-Based Video Compression

We address the problem of efficiently compressing video for conferencing-type applications. We build on recent approaches based on image animation, which can achieve good reconstruction quality at very low bitrate by representing face…

Computer Vision and Pattern Recognition · Computer Science 2023-07-11 Goluck Konuko, Stéphane Lathuilière, Giuseppe Valenzise

Deep Patch Learning for Weakly Supervised Object Classification and Discovery

Patch-level image representation is very important for object classification and detection, since it is robust to spatial transformation, scale variation, and cluttered background. Many existing methods usually require fine-grained…

Computer Vision and Pattern Recognition · Computer Science 2017-05-09 Peng Tang, Xinggang Wang, Zilong Huang, Xiang Bai, Wenyu Liu

Object Boundary Guided Semantic Segmentation

Semantic segmentation is critical to image content understanding and object localization. Recent development in fully-convolutional neural network (FCN) has enabled accurate pixel-level labeling. One issue in previous works is that the FCN…

Computer Vision and Pattern Recognition · Computer Science 2016-07-08 Qin Huang, Chunyang Xia, Wenchao Zheng, Yuhang Song, Hao Xu, C.-C. Jay Kuo

Exploring Open-Vocabulary Object Recognition in Images using CLIP

To address the limitations of existing open-vocabulary object recognition methods, specifically high system complexity, substantial training costs, and limited generalization, this paper proposes a novel Open-Vocabulary Object Recognition…

Computer Vision and Pattern Recognition · Computer Science 2026-03-09 Wei Yu Chen, Ying Dai