Related papers: Physical Representation-based Predicate Optimizati…

Efficient Large Scale Video Classification

Video classification has advanced tremendously over the recent years. A large part of the improvements in video classification had to do with the work done by the image classification community and the use of deep convolutional networks…

Computer Vision and Pattern Recognition · Computer Science 2015-05-26 Balakrishnan Varadarajan, George Toderici, Sudheendra Vijayanarasimhan, Apostol Natsev

Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification

Despite the steady progress in video analysis led by the adoption of convolutional neural networks (CNNs), the relative improvement has been less drastic than that in 2D static image classification. Three main challenges exist including…

Computer Vision and Pattern Recognition · Computer Science 2018-07-30 Saining Xie, Chen Sun, Jonathan Huang, Zhuowen Tu, Kevin Murphy

Beyond Short Snippets: Deep Networks for Video Classification

Convolutional neural networks (CNNs) have been extensively applied for image recognition problems giving state-of-the-art results on recognition, detection, segmentation and retrieval. In this work we propose and evaluate several deep…

Computer Vision and Pattern Recognition · Computer Science 2015-04-14 Joe Yue-Hei Ng, Matthew Hausknecht, Sudheendra Vijayanarasimhan, Oriol Vinyals, Rajat Monga, George Toderici

NoScope: Optimizing Neural Network Queries over Video at Scale

Recent advances in computer vision, in the form of deep neural networks, have made it possible to query increasing volumes of video data with high accuracy. However, neural network inference is computationally expensive at scale: applying a…

Databases · Computer Science 2017-08-10 Daniel Kang, John Emmons, Firas Abuzaid, Peter Bailis, Matei Zaharia

Efficient Classification of Very Large Images with Tiny Objects

An increasing number of applications in computer vision, especially in medical imaging and remote sensing, become challenging when the goal is to classify very large images with tiny informative objects. Specifically, these classification…

Computer Vision and Pattern Recognition · Computer Science 2021-12-07 Fanjie Kong, Ricardo Henao

Efficient Image Evidence Analysis of CNN Classification Results

Convolutional neural networks (CNNs) define the current state-of-the-art for image recognition. With their emerging popularity, especially for critical applications like medical image analysis or self-driving cars, confirmability is…

Computer Vision and Pattern Recognition · Computer Science 2018-01-08 Keyang Zhou, Bernhard Kainz

Exploiting Local Indexing and Deep Feature Confidence Scores for Fast Image-to-Video Search

The cost-effective visual representation and fast query-by-example search are two challenging goals that should be maintained for web-scale visual retrieval tasks on moderate hardware. This paper introduces a fast and robust method that…

Computer Vision and Pattern Recognition · Computer Science 2020-12-15 Savas Ozkan, Gozde Bozdagi Akar

What Is the Best Practice for CNNs Applied to Visual Instance Retrieval?

Previous work has shown that feature maps of deep convolutional neural networks (CNNs) can be interpreted as feature representation of a particular image region. Features aggregated from these feature maps have been exploited for image…

Computer Vision and Pattern Recognition · Computer Science 2016-11-08 Jiedong Hao, Jing Dong, Wei Wang, Tieniu Tan

Rate-Accuracy Trade-Off In Video Classification With Deep Convolutional Neural Networks

Advanced video classification systems decode video frames to derive the necessary texture and motion representations for ingestion and analysis by spatio-temporal deep convolutional neural networks (CNNs). However, when considering visual…

Computer Vision and Pattern Recognition · Computer Science 2019-01-03 Mohammad Jubran, Alhabib Abbas, Aaron Chadha, Yiannis Andreopoulos

A Discriminative CNN Video Representation for Event Detection

In this paper, we propose a discriminative video representation for event detection over a large scale video dataset when only limited hardware resources are available. The focus of this paper is to effectively leverage deep Convolutional…

Computer Vision and Pattern Recognition · Computer Science 2014-11-17 Zhongwen Xu, Yi Yang, Alexander G. Hauptmann

Efficient Document Image Classification Using Region-Based Graph Neural Network

Document image classification remains a popular research area because it can be commercialized in many enterprise applications across different industries. Recent advancements in large pre-trained computer vision and language models and…

Computer Vision and Pattern Recognition · Computer Science 2021-06-28 Jaya Krishna Mandivarapu, Eric Bunch, Qian You, Glenn Fung

Indexing of CNN Features for Large Scale Image Search

The convolutional neural network (CNN) features can give a good description of image content, which usually represent images with unique global vectors. Although they are compact compared to local descriptors, they still cannot efficiently…

Computer Vision and Pattern Recognition · Computer Science 2018-02-02 Ruoyu Liu, Yao Zhao, Shikui Wei, Yi Yang

A Picture May Be Worth a Hundred Words for Visual Question Answering

How far can we go with textual representations for understanding pictures? In image understanding, it is essential to use concise but detailed image representations. Deep visual features extracted by vision models, such as Faster R-CNN, are…

Computer Vision and Pattern Recognition · Computer Science 2021-06-28 Yusuke Hirota, Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima, Ittetsu Taniguchi, Takao Onoye

LRCP: Low-Rank Compressibility Guided Visual Token Pruning for Efficient LVLMs

Large vision-language models (LVLMs) achieve strong multimodal understanding, but their inference cost grows rapidly with the number of visual tokens, especially for high-resolution images and long videos. Existing attention-based methods…

Computer Vision and Pattern Recognition · Computer Science 2026-05-18 Hongyu Lu, Feng Zhang, Wenwei Jin, Huanling Hu, Tianjun Shi, Shikai Jiang, Yao Hu, Jiawei Li

Hierarchical Video Frame Sequence Representation with Deep Convolutional Graph Network

High accuracy video label prediction (classification) models are attributed to large scale data. These data could be frame feature sequences extracted by a pre-trained convolutional-neural-network, which promote the efficiency for creating…

Computer Vision and Pattern Recognition · Computer Science 2019-06-04 Feng Mao, Xiang Wu, Hui Xue, Rong Zhang

Real-Time Document Image Classification using Deep CNN and Extreme Learning Machines

This paper presents an approach for real-time training and testing for document image classification. In production environments, it is crucial to perform accurate and (time-)efficient training. Existing deep learning approaches for…

Computer Vision and Pattern Recognition · Computer Science 2018-03-28 Andreas Kölsch, Muhammad Zeshan Afzal, Markus Ebbecke, Marcus Liwicki

Hierarchical Recurrent Neural Encoder for Video Representation with Application to Captioning

Recently, deep learning approaches, especially deep Convolutional Neural Networks (ConvNets), have achieved overwhelming accuracy with fast processing speed for image classification. Incorporating temporal structure with deep ConvNets for…

Computer Vision and Pattern Recognition · Computer Science 2015-11-12 Pingbo Pan, Zhongwen Xu, Yi Yang, Fei Wu, Yueting Zhuang

Fast Graph Neural Network for Image Classification

The rapid progress in image classification has been largely driven by the adoption of Graph Convolutional Networks (GCNs), which offer a robust framework for handling complex data structures. This study introduces a novel approach that…

Computer Vision and Pattern Recognition · Computer Science 2025-08-22 Mustafa Mohammadi Gharasuie, Luis Rueda

Is a Video worth $n\times n$ Images? A Highly Efficient Approach to Transformer-based Video Question Answering

Conventional Transformer-based Video Question Answering (VideoQA) approaches generally encode frames independently through one or more image encoders followed by interaction between frames and question. However, such schema would incur…

Computer Vision and Pattern Recognition · Computer Science 2023-05-17 Chenyang Lyu, Tianbo Ji, Yvette Graham, Jennifer Foster

Context Aware Query Image Representation for Particular Object Retrieval

The current models of image representation based on Convolutional Neural Networks (CNN) have shown tremendous performance in image retrieval. Such models are inspired by the information flow along the visual pathway in the human visual…

Computer Vision and Pattern Recognition · Computer Science 2017-03-06 Zakaria Laskar, Juho Kannala