Related papers: IVCA: Inter-Relation-Aware Video Complexity Analyz…

Green Video Complexity Analysis for Efficient Encoding in Adaptive Video Streaming

For adaptive streaming applications, low-complexity and accurate video complexity features are necessary to analyze the video content in real time, which ensures fast and compression-efficient video streaming without disruptions.…

Multimedia · Computer Science 2023-04-26 Vignesh V Menon, Christian Feldmann, Klaus Schoeffmann, Mohammad Ghanbari, Christian Timmerer

Relation-aware Hierarchical Attention Framework for Video Question Answering

Video Question Answering (VideoQA) is a challenging video understanding task since it requires a deep understanding of both question and video. Previous studies mainly focus on extracting sophisticated visual and language embeddings, fusing…

Computer Vision and Pattern Recognition · Computer Science 2021-05-17 Fangtao Li, Ting Bai, Chenyu Cao, Zihe Liu, Chenghao Yan, Bin Wu

Enhancing Blind Video Quality Assessment with Rich Quality-aware Features

Blind video quality assessment (BVQA) is a highly challenging task due to the intrinsic complexity of video content and visual distortions, especially given the high popularity of social media videos, which originate from a wide range of…

Image and Video Processing · Electrical Eng. & Systems 2026-01-06 Wei Sun, Linhan Cao, Jun Jia, Zhichao Zhang, Zicheng Zhang, Xiongkuo Min, Guangtao Zhai

LLMs Meet Long Video: Advancing Long Video Question Answering with An Interactive Visual Adapter in LLMs

Long video understanding is a significant and ongoing challenge in the intersection of multimedia and artificial intelligence. Employing large language models (LLMs) for comprehending video becomes an emerging and promising method. However,…

Computation and Language · Computer Science 2024-08-27 Yunxin Li, Xinyu Chen, Baotain Hu, Min Zhang

DIVA-VQA: Detecting Inter-frame Variations in UGC Video Quality

The rapid growth of user-generated (video) content (UGC) has driven increased demand for research on no-reference (NR) perceptual video quality assessment (VQA). NR-VQA is a key component for large-scale video quality monitoring in social…

Image and Video Processing · Electrical Eng. & Systems 2025-08-15 Xinyi Wang, Angeliki Katsenou, David Bull

Comparison Drives Preference: Reference-Aware Modeling for AI-Generated Video Quality Assessment

The rapid advancement of generative models has led to a growing volume of AI-generated videos, making the automatic quality assessment of such videos increasingly important. Existing AI-generated content video quality assessment (AIGC-VQA)…

Computer Vision and Pattern Recognition · Computer Science 2026-04-21 Minghao Zou, Gen Liu, Guanghui Yue, Baoquan Zhao, Zhihua Wang, Paul L. Rosin, Hantao Liu, Wei Zhou

Collaborative Weakly Supervised Video Correlation Learning for Procedure-Aware Instructional Video Analysis

Video Correlation Learning (VCL), which aims to analyze the relationships between videos, has been widely studied and applied in various general video tasks. However, applying VCL to instructional videos is still quite challenging due to…

Computer Vision and Pattern Recognition · Computer Science 2023-12-19 Tianyao He, Huabin Liu, Yuxi Li, Xiao Ma, Cheng Zhong, Yang Zhang, Weiyao Lin

VCA: Video Curious Agent for Long Video Understanding

Long video understanding poses unique challenges due to the temporal complexity and low information density of long videos. Recent works address this task by sampling numerous frames or incorporating auxiliary tools using LLMs, both of which result in…

Computer Vision and Pattern Recognition · Computer Science 2025-03-11 Zeyuan Yang, Delin Chen, Xueyang Yu, Maohao Shen, Chuang Gan

Motion-Aware Video Frame Interpolation

Video frame interpolation methodologies endeavor to create novel frames betwixt extant ones, with the intent of augmenting the video's frame frequency. However, current methods are prone to image blurring and spurious artifacts in…

Computer Vision and Pattern Recognition · Computer Science 2024-02-06 Pengfei Han, Fuhua Zhang, Bin Zhao, Xuelong Li

FAVER: Blind Quality Prediction of Variable Frame Rate Videos

Video quality assessment (VQA) remains an important and challenging problem that affects many applications at the widest scales. Recent advances in mobile devices and cloud computing techniques have made it possible to capture, process, and…

Image and Video Processing · Electrical Eng. & Systems 2022-01-06 Qi Zheng, Zhengzhong Tu, Pavan C. Madhusudana, Xiaoyang Zeng, Alan C. Bovik, Yibo Fan

Image Complexity-Aware Adaptive Retrieval for Efficient Vision-Language Models

Vision transformers in vision-language models typically use the same amount of compute for every image, regardless of whether it is simple or complex. We propose ICAR (Image Complexity-Aware Retrieval), an adaptive computation approach that…

Information Retrieval · Computer Science 2026-01-16 Mikel Williams-Lekuona, Georgina Cosma

Transforming Multi-Concept Attention into Video Summarization

Video summarization is among the challenging tasks in computer vision; it aims to identify highlight frames or shots in a lengthy video input. In this paper, we propose a novel attention-based framework for video summarization with…

Computer Vision and Pattern Recognition · Computer Science 2020-06-04 Yen-Ting Liu, Yu-Jhe Li, Yu-Chiang Frank Wang

Inconsistency-Aware Cross-Attention for Audio-Visual Fusion in Dimensional Emotion Recognition

Leveraging complementary relationships across modalities has recently drawn a lot of attention in multimodal emotion recognition. Most of the existing approaches explored cross-attention to capture the complementary relationships across the…

Computer Vision and Pattern Recognition · Computer Science 2024-07-02 G Rajasekhar, Jahangir Alam

Video Quality Assessment: A Comprehensive Survey

Video quality assessment (VQA) is an important processing task, aiming at predicting the quality of videos in a manner highly consistent with human judgments of perceived quality. Traditional VQA models based on natural image and/or video…

Image and Video Processing · Electrical Eng. & Systems 2024-12-12 Qi Zheng, Yibo Fan, Leilei Huang, Tianyu Zhu, Jiaming Liu, Zhijian Hao, Shuo Xing, Chia-Ju Chen, Xiongkuo Min, Alan C. Bovik, Zhengzhong Tu

SVCCA: Singular Vector Canonical Correlation Analysis for Deep Learning Dynamics and Interpretability

We propose a new technique, Singular Vector Canonical Correlation Analysis (SVCCA), a tool for quickly comparing two representations in a way that is both invariant to affine transform (allowing comparison between different layers and…

Machine Learning · Statistics 2017-11-09 Maithra Raghu, Justin Gilmer, Jason Yosinski, Jascha Sohl-Dickstein

Temporal Context Aggregation for Video Retrieval with Contrastive Learning

The current research focus on Content-Based Video Retrieval requires higher-level video representation describing the long-range semantic dependencies of relevant incidents, events, etc. However, existing methods commonly process the frames…

Computer Vision and Pattern Recognition · Computer Science 2020-10-01 Jie Shao, Xin Wen, Bingchen Zhao, Xiangyang Xue

REVEAL: Relation-based Video Representation Learning for Video-Question-Answering

Video-Question-Answering (VideoQA) comprises the capturing of complex visual relation changes over time, remaining a challenge even for advanced Video Language Models (VLM), i.a., because of the need to represent the visual content to a…

Computer Vision and Pattern Recognition · Computer Science 2025-04-09 Sofian Chaybouti, Walid Bousselham, Moritz Wolter, Hilde Kuehne

Extending Information Bottleneck Attribution to Video Sequences

We introduce VIBA, a novel approach for explainable video classification by adapting Information Bottlenecks for Attribution (IBA) to video sequences. While most traditional explainability methods are designed for image models, our IBA…

Computer Vision and Pattern Recognition · Computer Science 2025-01-29 Veronika Solopova, Lucas Schmidt, Dorothea Kolossa

Blindly Assess Quality of In-the-Wild Videos via Quality-aware Pre-training and Motion Perception

Perceptual quality assessment of videos acquired in the wild is of vital importance for quality assurance of video services. The inaccessibility of reference videos with pristine quality and the complexity of authentic distortions pose…

Image and Video Processing · Electrical Eng. & Systems 2022-04-06 Bowen Li, Weixia Zhang, Meng Tian, Guangtao Zhai, Xianpei Wang

Preparing VVC for Streaming: A Fast Multi-Rate Encoding Approach

The integration of advanced video codecs into the streaming pipeline is growing in response to the increasing demand for high quality video content. However, the significant computational demand for advanced codecs like Versatile Video…

Multimedia · Computer Science 2023-12-14 Yiqun Liu, Hadi Amirpour, Mohsen Abdoli, Christian Timmerer, Thomas Guionnet