Related papers: Deep Learning Methods for Efficient Large Scale Vi…
This paper describes our solution for the video recognition task of the Google Cloud and YouTube-8M Video Understanding Challenge that ranked the 3rd place. Because the challenge provides pre-extracted visual and audio features instead of…
YouTube-8M is the largest video dataset for multi-label video classification. In order to tackle the multi-label classification on this challenging dataset, it is necessary to solve several issues such as temporal modeling of videos, label…
We took part in the YouTube-8M Video Understanding Challenge hosted on Kaggle, and achieved the 10th place within less than one month's time. In this paper, we present an extensive analysis and solution to the underlying machine-learning…
This paper introduces the system we developed for the Google Cloud & YouTube-8M Video Understanding Challenge, which can be considered as a multi-label classification problem defined on top of the large scale YouTube-8M Dataset. We employ a…
Large-scale datasets have played a significant role in progress of neural network and deep learning areas. YouTube-8M is such a benchmark dataset for general multi-label video classification. It was created from over 7 million YouTube…
Video classification problem has been studied many years. The success of Convolutional Neural Networks (CNN) in image recognition tasks gives a powerful incentive for researchers to create more advanced video classification approaches. As…
Temporal localization remains an important challenge in video understanding. In this work, we present our solution to the 3rd YouTube-8M Video Understanding Challenge organized by Google Research. Participants were required to build a…
Despite recent advances in computer vision based on various convolutional architectures, video understanding remains an important challenge. In this work, we present and discuss a top solution for the large-scale video classification…
This article describes the final solution of team monkeytyping, who finished in second place in the YouTube-8M video understanding challenge. The dataset used in this challenge is a large-scale benchmark for multi-label video…
This paper describes our solution for the 2$^\text{nd}$ YouTube-8M video understanding challenge organized by Google AI. Unlike the video recognition benchmarks, such as Kinetics and Moments, the YouTube-8M challenge provides pre-extracted…
Many recent advancements in Computer Vision are attributed to large datasets. Open-source software packages for Machine Learning and inexpensive commodity hardware have reduced the barrier of entry for exploring novel approaches at scale.…
This paper introduces the YouTube-8M Video Understanding Challenge hosted as a Kaggle competition and also describes my approach to experimenting with various models. For each of my experiments, I provide the score result as well as…
We report on CMU Informedia Lab's system used in Google's YouTube 8 Million Video Understanding Challenge. In this multi-label video classification task, our pipeline achieved 84.675% and 84.662% GAP on our evaluation split and the official…
In this paper, we present our solution to Google YouTube-8M Video Classification Challenge 2017. We leveraged both video-level and frame-level features in the submission. For video-level classification, we simply used a 200-mixture Mixture…
Youtube-8M dataset enhances the development of large-scale video recognition technology as ImageNet dataset has encouraged image classification, recognition and detection of artificial intelligence fields. For this large video dataset, it…
We address temporal localization of events in large-scale video data, in the context of the Youtube-8M Segments dataset. This emerging field within video recognition can enable applications to identify the precise time a specified event…
This paper introduces the system we developed for the Youtube-8M Video Understanding Challenge, in which a large-scale benchmark dataset was used for multi-label video classification. The proposed framework contains hierarchical deep…
This paper presents the Axon AI's solution to the 2nd YouTube-8M Video Understanding Challenge, achieving the final global average precision (GAP) of 88.733% on the private test set (ranked 3rd among 394 teams, not considering the model…
Videos have become ubiquitous on the Internet. And video analysis can provide lots of information for detecting and recognizing objects as well as help people understand human actions and interactions with the real world. However, facing…
We investigate factors controlling DNN diversity in the context of the Google Cloud and YouTube-8M Video Understanding Challenge. While it is well-known that ensemble methods improve prediction performance, and that combining accurate but…