Related papers: Task-driven Semantic Coding via Reinforcement Lear…

Reinforced Bit Allocation under Task-Driven Semantic Distortion Metrics

Rapidly growing intelligent applications require optimized bit allocation in image/video coding to support specific task-driven scenarios such as detection, classification, and segmentation. Some learning-based frameworks have been proposed…

Image and Video Processing · Electrical Eng. & Systems 2020-02-11 Jun Shi, Zhibo Chen

Hierarchical Reinforcement Learning Based Video Semantic Coding for Segmentation

The rapid development of intelligent tasks, e.g., segmentation, detection, and classification, has brought an urgent need for semantic compression, which aims to reduce the compression cost while maintaining the original semantic…

Image and Video Processing · Electrical Eng. & Systems 2022-08-25 Guangqi Xie, Xin Li, Shiqi Lin, Li Zhang, Kai Zhang, Yue Li, Zhibo Chen

RL-RC-DoT: A Block-level RL agent for Task-Aware Video Compression

Video encoders optimize compression for human perception by minimizing reconstruction error under bit-rate constraints. In many modern applications such as autonomous driving, an overwhelming majority of videos serve as input for AI systems…

Machine Learning · Computer Science 2025-03-26 Uri Gadot, Assaf Shocher, Shie Mannor, Gal Chechik, Assaf Hallak

Exploiting the Semantic Knowledge of Pre-trained Text-Encoders for Continual Learning

Deep neural networks (DNNs) excel on fixed datasets but struggle with incremental and shifting data in real-world scenarios. Continual learning addresses this challenge by allowing models to learn from new data while retaining previously…

Computer Vision and Pattern Recognition · Computer Science 2025-06-10 Lu Yu, Zhe Tao, Dipam Goswami, Hantao Yao, Bartłomiej Twardowski, Joost Van de Weijer, Changsheng Xu

Task-driven Compression for Collision Encoding based on Depth Images

This paper contributes a novel learning-based method for aggressive task-driven compression of depth images and their encoding as images tailored to collision prediction for robotic systems. A novel 3D image processing methodology is…

Computer Vision and Pattern Recognition · Computer Science 2023-09-12 Mihir Kulkarni, Kostas Alexis

A Threefold Review on Deep Semantic Segmentation: Efficiency-oriented, Temporal and Depth-aware design

Semantic image and video segmentation stand among the most important tasks in computer vision nowadays, since they provide a complete and meaningful representation of the environment by means of a dense classification of the pixels in a…

Computer Vision and Pattern Recognition · Computer Science 2023-03-09 Felipe Manfio Barbosa, Fernando Santos Osório

Task Oriented Video Coding: A Survey

Video coding technology has been continuously improved for higher compression ratio with higher resolution. However, the state-of-the-art video coding standards, such as H.265/HEVC and Versatile Video Coding, are still designed with the…

Image and Video Processing · Electrical Eng. & Systems 2022-11-22 Daniel Wood

Task-driven real-world super-resolution of document scans

Single-image super-resolution refers to the reconstruction of a high-resolution image from a single low-resolution observation. Although recent deep learning-based methods have demonstrated notable success on simulated datasets -- with…

Computer Vision and Pattern Recognition · Computer Science 2025-06-10 Maciej Zyrek, Tomasz Tarasiewicz, Jakub Sadel, Aleksandra Krzywon, Michal Kawulok

Towards Semantic Communications: Deep Learning-Based Image Semantic Coding

Semantic communications has received growing interest since it can remarkably reduce the amount of data to be transmitted without missing critical information. Most existing works explore the semantic encoding and transmission for text and…

Computer Vision and Pattern Recognition · Computer Science 2022-08-09 Danlan Huang, Feifei Gao, Xiaoming Tao, Qiyuan Du, Jianhua Lu

A Coding Framework and Benchmark towards Low-Bitrate Video Understanding

Video compression is indispensable to most video analysis systems. Despite saving transportation bandwidth, it also deteriorates downstream video understanding tasks, especially at low-bitrate settings. To systematically investigate this…

Image and Video Processing · Electrical Eng. & Systems 2024-09-24 Yuan Tian, Guo Lu, Yichao Yan, Guangtao Zhai, Li Chen, Zhiyong Gao

Deep Joint Source-Channel Coding Based on Semantics of Pixels

The semantic information of the image for intelligent tasks is hidden behind the pixels, and slight changes in the pixels will affect the performance of intelligent tasks. In order to preserve semantic information behind pixels for…

Image and Video Processing · Electrical Eng. & Systems 2022-08-25 Qizheng Sun, Caili Guo, Yang Yang, Jiujiu Chen, Rui Tang, Chuanhong Liu

Multichannel Semantic Segmentation with Unsupervised Domain Adaptation

Most contemporary robots have depth sensors, and research on semantic segmentation with RGBD images has shown that depth images boost the accuracy of segmentation. Since it is time-consuming to annotate images with semantic labels per…

Computer Vision and Pattern Recognition · Computer Science 2018-12-12 Kohei Watanabe, Kuniaki Saito, Yoshitaka Ushiku, Tatsuya Harada

Constructing and Interpreting Digital Twin Representations for Visual Reasoning via Reinforcement Learning

Visual reasoning may require models to interpret images and videos and respond to implicit text queries across diverse output formats, from pixel-level segmentation masks to natural language descriptions. Existing approaches rely on…

Computer Vision and Pattern Recognition · Computer Science 2025-11-18 Yiqing Shen, Mathias Unberath

Bit Allocation Transfer for Perceptual Quality Enhancement of VVC Intra Coding

Mainstream image and video coding standards -- including state-of-the-art codecs like H.266/VVC, AVS3, and AV1 -- adopt a block-based hybrid coding framework. While this framework facilitates straightforward optimization for Peak…

Image and Video Processing · Electrical Eng. & Systems 2025-10-17 Runyu Yang, Ivan V. Bajić

Reinforcement Learning for Semantic Segmentation in Indoor Scenes

Future advancements in robot autonomy and sophistication of robotics tasks rest on robust, efficient, and task-dependent semantic understanding of the environment. Semantic segmentation is the problem of simultaneous segmentation and…

Computer Vision and Pattern Recognition · Computer Science 2016-06-06 Md. Alimoor Reza, Jana Kosecka

SFD2: Semantic-guided Feature Detection and Description

Visual localization is a fundamental task for various applications including autonomous driving and robotics. Prior methods focus on extracting large amounts of often redundant locally reliable features, resulting in limited efficiency and…

Computer Vision and Pattern Recognition · Computer Science 2023-06-13 Fei Xue, Ignas Budvytis, Roberto Cipolla

CLIP-SR: Collaborative Linguistic and Image Processing for Super-Resolution

Convolutional Neural Networks (CNNs) have significantly advanced Image Super-Resolution (SR), yet most CNN-based methods rely solely on pixel-based transformations, often leading to artifacts and blurring, particularly under severe…

Computer Vision and Pattern Recognition · Computer Science 2025-04-15 Bingwen Hu, Heng Liu, Zhedong Zheng, Ping Liu

Learning Context-aware Task Reasoning for Efficient Meta-reinforcement Learning

Despite recent success of deep network-based Reinforcement Learning (RL), it remains elusive to achieve human-level efficiency in learning novel tasks. While previous efforts attempt to address this challenge using meta-learning strategies,…

Machine Learning · Computer Science 2022-05-03 Haozhe Wang, Jiale Zhou, Xuming He

Learning Program Semantics with Code Representations: An Empirical Study

Program semantics learning is core and fundamental to various code intelligence tasks, e.g., vulnerability detection and clone detection. A considerable number of existing works propose diverse approaches to learn the program semantics for…

Software Engineering · Computer Science 2022-03-23 Jing Kai Siow, Shangqing Liu, Xiaofei Xie, Guozhu Meng, Yang Liu

Cross Modal Compression: Towards Human-comprehensible Semantic Compression

Traditional image/video compression aims to reduce the transmission/storage cost with signal fidelity as high as possible. However, with the increasing demand for machine analysis and semantic monitoring in recent years, semantic fidelity…

Image and Video Processing · Electrical Eng. & Systems 2022-09-07 Jiguo Li, Chuanmin Jia, Xinfeng Zhang, Siwei Ma, Wen Gao