Related papers: Classifying Video based on Automatic Content Detec…

Deep Learning for Video Classification and Captioning

Accelerated by the tremendous increase in Internet bandwidth and storage space, video data has been generated, published and spread explosively, becoming an indispensable part of today's big data. In this paper, we focus on reviewing two…

Computer Vision and Pattern Recognition · Computer Science 2018-02-23 Zuxuan Wu, Ting Yao, Yanwei Fu, Yu-Gang Jiang

Deep Architectures for Content Moderation and Movie Content Rating

Rating a video based on its content is an important step for classifying video age categories. Movie content rating and TV show rating are the two most common rating systems established by professional committees. However, manually…

Computer Vision and Pattern Recognition · Computer Science 2022-12-13 Fatih Cagatay Akyon, Alptekin Temizel

Open Vocabulary Multi-Label Video Classification

Pre-trained vision-language models (VLMs) have enabled significant progress in open vocabulary computer vision tasks such as image classification, object detection and image segmentation. Some recent works have focused on extending VLMs to…

Computer Vision and Pattern Recognition · Computer Science 2025-10-14 Rohit Gupta, Mamshad Nayeem Rizve, Jayakrishnan Unnikrishnan, Ashish Tawari, Son Tran, Mubarak Shah, Benjamin Yao, Trishul Chilimbi

Active Learning for Video Classification with Frame Level Queries

Deep learning algorithms have pushed the boundaries of computer vision research and have depicted commendable performance in a variety of applications. However, training a robust deep neural network necessitates a large amount of labeled…

Computer Vision and Pattern Recognition · Computer Science 2023-07-13 Debanjan Goswami, Shayok Chakraborty

End-to-End Video Classification with Knowledge Graphs

Video understanding has attracted much research attention especially since the recent availability of large-scale video benchmarks. In this paper, we address the problem of multi-label video classification. We first observe that there…

Computer Vision and Pattern Recognition · Computer Science 2017-11-07 Fang Yuan, Zhe Wang, Jie Lin, Luis Fernando D'Haro, Kim Jung Jae, Zeng Zeng, Vijay Chandrasekhar

Content-Based Video Browsing by Text Region Localization and Classification

The amount of digital video data is increasing over the world. It highlights the need for efficient algorithms that can index, retrieve and browse this data by content. This can be achieved by identifying semantic description captured…

Multimedia · Computer Science 2013-01-11 Bassem Bouaziz, Walid Mahdi, Tarek Zlitni, Abdelmajid ben Hamadou

Video Content Classification using Deep Learning

Video content classification is an important research content in computer vision, which is widely used in many fields, such as image and video retrieval, computer vision. This paper presents a model that is a combination of Convolutional…

Computer Vision and Pattern Recognition · Computer Science 2021-11-30 Pradyumn Patil, Vishwajeet Pawar, Yashraj Pawar, Shruti Pisal

Video Data Visualization System: Semantic Classification And Personalization

We present in this paper an intelligent video data visualization tool, based on semantic classification, for retrieving and exploring a large scale corpus of videos. Our work is based on semantic classification resulting from semantic…

Information Retrieval · Computer Science 2012-09-07 Jamel Slimi, Anis Ben Ammar, Adel M. Alimi

Contrastive Graph Multimodal Model for Text Classification in Videos

The extraction of text information in videos serves as a critical step towards semantic understanding of videos. It usually involves two steps: (1) text recognition and (2) text classification. To localize texts in videos, we can resort…

Computer Vision and Pattern Recognition · Computer Science 2022-06-07 Ye Liu, Changchong Lu, Chen Lin, Di Yin, Bo Ren

Action Selection Learning for Multi-label Multi-view Action Recognition

Multi-label multi-view action recognition aims to recognize multiple concurrent or sequential actions from untrimmed videos captured by multiple cameras. Existing work has focused on multi-view action recognition in a narrow area with…

Computer Vision and Pattern Recognition · Computer Science 2024-10-21 Trung Thanh Nguyen, Yasutomo Kawanishi, Takahiro Komamizu, Ichiro Ide

Cross-Modality Attention with Semantic Graph Embedding for Multi-Label Classification

Multi-label image and video classification are fundamental yet challenging tasks in computer vision. The main challenges lie in capturing spatial or temporal dependencies between labels and discovering the locations of discriminative…

Computer Vision and Pattern Recognition · Computer Science 2020-03-30 Renchun You, Zhiyao Guo, Lei Cui, Xiang Long, Yingze Bao, Shilei Wen

A survey of image labelling for computer vision applications

Supervised machine learning methods for image analysis require large amounts of labelled training data to solve computer vision problems. The recent rise of deep learning algorithms for recognising image content has led to the emergence of…

Computer Vision and Pattern Recognition · Computer Science 2021-04-20 Christoph Sager, Christian Janiesch, Patrick Zschech

VideoMCC: a New Benchmark for Video Comprehension

While there is overall agreement that future technology for organizing, browsing and searching videos hinges on the development of methods for high-level semantic understanding of video, so far no consensus has been reached on the best way…

Computer Vision and Pattern Recognition · Computer Science 2017-06-20 Du Tran, Maksim Bolonkin, Manohar Paluri, Lorenzo Torresani

Classroom Video Assessment and Retrieval via Multiple Instance Learning

We propose a multiple instance learning approach to content-based retrieval of classroom video for the purpose of supporting human assessment of the learning environment. The key element of our approach is a mapping between the semantic…

Information Retrieval · Computer Science 2014-03-26 Qifeng Qiao, Peter A. Beling

Knowledge-enhanced Multi-perspective Video Representation Learning for Scene Recognition

With the explosive growth of video data in real-world applications, a comprehensive representation of videos becomes increasingly important. In this paper, we address the problem of video scene recognition, whose goal is to learn a…

Computer Vision and Pattern Recognition · Computer Science 2025-05-20 Xuzheng Yu, Chen Jiang, Wei Zhang, Tian Gan, Linlin Chao, Jianan Zhao, Yuan Cheng, Qingpei Guo, Wei Chu

A Survey on Deep Learning Technique for Video Segmentation

Video segmentation -- partitioning video frames into multiple segments or objects -- plays a critical role in a broad range of practical applications, from enhancing visual effects in movie, to understanding scenes in autonomous driving, to…

Computer Vision and Pattern Recognition · Computer Science 2022-11-30 Tianfei Zhou, Fatih Porikli, David Crandall, Luc Van Gool, Wenguan Wang

Audio-Visual Fusion Layers for Event Type Aware Video Recognition

The human brain is continuously inundated with multisensory information and its complex interactions coming from the outside world at any given moment. Such information is automatically analyzed by binding or segregating in our brain.…

Computer Vision and Pattern Recognition · Computer Science 2022-02-15 Arda Senocak, Junsik Kim, Tae-Hyun Oh, Hyeonggon Ryu, Dingzeyu Li, In So Kweon

Encoding Video and Label Priors for Multi-label Video Classification on YouTube-8M dataset

YouTube-8M is the largest video dataset for multi-label video classification. In order to tackle the multi-label classification on this challenging dataset, it is necessary to solve several issues such as temporal modeling of videos, label…

Computer Vision and Pattern Recognition · Computer Science 2017-07-13 Seil Na, Youngjae Yu, Sangho Lee, Jisung Kim, Gunhee Kim

Multimodal Multilabel Classification by CLIP

Multimodal multilabel classification (MMC) is a challenging task that aims to design a learning algorithm to handle two data sources, the image and text, and learn a comprehensive semantic feature presentation across the modalities. In this…

Computer Vision and Pattern Recognition · Computer Science 2024-06-25 Yanming Guo

A Survey on Machine Learning Techniques for Auto Labeling of Video, Audio, and Text Data

Machine learning has been utilized to perform tasks in many different domains such as classification, object detection, image segmentation and natural language analysis. Data labeling has always been one of the most important tasks in…

Machine Learning · Computer Science 2021-09-09 Shikun Zhang, Omid Jafari, Parth Nagarkar