Related papers: Multi-Task Learning for Screen Content Image Codin…
In recent years, it has been found that screen content images (SCI) can be effectively compressed based on appropriate probability modelling and suitable entropy coding methods such as arithmetic coding. The key objective is determining the…
At present, and increasingly so in the future, much of the captured visual content will not be seen by humans. Instead, it will be used for automated machine vision analytics and may require occasional human viewing. Examples of such…
As an increasing amount of image and video content will be analyzed by machines, there is demand for a new codec paradigm that is capable of compressing visual input primarily for the purpose of computer vision inference, while secondarily…
An increasing share of image and video content is analyzed by machines rather than viewed by humans, and therefore it becomes relevant to optimize codecs for such applications where the analysis is performed remotely. Unfortunately,…
Despite significant progress toward super resolving more realistic images by deeper convolutional neural networks (CNNs), reconstructing fine and natural textures still remains a challenging problem. Recent works on single image super…
Semantic segmentation of outdoor scenes is problematic when there are variations in imaging conditions. It is known that albedo (reflectance) is invariant to all kinds of illumination effects. Thus, using reflectance images for semantic…
As learned image codecs (LICs) become more prevalent, their low coding efficiency for out-of-distribution data becomes a bottleneck for some applications. To improve the performance of LICs for screen content (SC) images without breaking…
Compression technology is essential for efficient image transmission and storage. With the rapid advances in deep learning, images are beginning to be used for image recognition as well as for human vision. For this reason, research has…
Autoencoder-based image codecs achieve state-of-the-art compression performance but often incur high computational complexity, particularly at decoding time. This work introduces a low-complexity learned image compression framework based on…
The learned image compression (LIC) methods have already surpassed traditional techniques in compressing natural scene (NS) images. However, directly applying these methods to screen content (SC) images, which possess distinct…
An increasing share of captured images and videos are transmitted for storage and remote analysis by computer vision algorithms, rather than to be viewed by humans. Contrary to traditional standard codecs with engineered tools, neural…
Snapshot compressive imaging (SCI) refers to compressive imaging systems where multiple frames are mapped into a single measurement, with video compressive imaging and hyperspectral compressive imaging as two representative applications.…
Semantic segmentation is the task of classifying each pixel in an image. Training a segmentation model achieves best results using annotated images, where each pixel is annotated with the corresponding class. When obtaining fine annotations…
Over recent years, deep learning-based computer vision systems have been applied to images at an ever-increasing pace, oftentimes representing the only type of consumption for those images. Given the dramatic explosion in the number of…
Video coding has traditionally been developed to support services such as video streaming, videoconferencing, digital TV, and so on. The main intent was to enable human viewing of the encoded content. However, with the advances in deep…
Recent advancements in deep learning-based image compression are notable. However, prevalent schemes that employ a serial context-adaptive entropy model to enhance rate-distortion (R-D) performance are markedly slow. Furthermore, the…
Optimal control and sequential decision making are widely used in many complex tasks. Optimal control over a sequence of natural images is a first step towards understanding the role of vision in control. Here, we formalize this problem as…
In recent years self-supervised learning has emerged as a promising candidate for unsupervised representation learning. In the visual domain its applications are mostly studied in the context of images of natural scenes. However, its…
Learning visual feature representations for video analysis is a daunting task that requires a large amount of training samples and a proper generalization framework. Many of the current state of the art methods for video captioning and…
Often in medical imaging, it is prohibitively challenging to produce enough boundary annotations to train deep neural networks for accurate tumor segmentation. We propose the use of weak labels about whether an image presents tumor or…