Related papers: Hierarchical Memory Decoding for Video Captioning

200 papers

The use of Recurrent Neural Networks for video captioning has recently gained a lot of attention, since they can be used both to encode the input video and to generate the corresponding description. In this paper, we present a recurrent…

Computer Vision and Pattern Recognition · Computer Science 2018-11-26 Lorenzo Baraldi, Costantino Grana, Rita Cucchiara

Building correspondences across different modalities, such as video and language, has recently become critical in many visual recognition applications, such as video captioning. Inspired by machine translation, recent models tackle this…

Computer Vision and Pattern Recognition · Computer Science 2019-11-11 Silvio Olivastri, Gurkirt Singh, Fabio Cuzzolin

Recently, deep learning approaches, especially deep Convolutional Neural Networks (ConvNets), have achieved overwhelming accuracy with fast processing speed for image classification. Incorporating temporal structure with deep ConvNets for…

Computer Vision and Pattern Recognition · Computer Science 2015-11-12 Pingbo Pan, Zhongwen Xu, Yi Yang, Fei Wu, Yueting Zhuang

Typical techniques for video captioning follow the encoder-decoder framework, which can only focus on the one source video being processed. A potential disadvantage of such a design is that it cannot capture the multiple visual context…

Computer Vision and Pattern Recognition · Computer Science 2019-05-13 Wenjie Pei, Jiyuan Zhang, Xiangrong Wang, Lei Ke, Xiaoyong Shen, Yu-Wing Tai

In this paper, the problem of describing visual contents of a video sequence with natural language is addressed. Unlike previous video captioning work mainly exploiting the cues of video contents to make a language description, we propose a…

Computer Vision and Pattern Recognition · Computer Science 2018-04-02 Bairui Wang, Lin Ma, Wei Zhang, Wei Liu

Recently, much advance has been made in image captioning, and an encoder-decoder framework has been adopted by all the state-of-the-art models. Under this framework, an input image is encoded by a convolutional neural network (CNN) and then…

Computer Vision and Pattern Recognition · Computer Science 2018-08-01 Wenhao Jiang, Lin Ma, Yu-Gang Jiang, Wei Liu, Tong Zhang

Image captioning is a challenging task that combines the fields of computer vision and natural language processing. A variety of approaches have been proposed to achieve the goal of automatically describing an image, and recurrent neural…

Computer Vision and Pattern Recognition · Computer Science 2018-05-24 Qingzhong Wang, Antoni B. Chan

In this paper, the problem of describing visual contents of a video sequence with natural language is addressed. Unlike previous video captioning work mainly exploiting the cues of video contents to make a language description, we propose a…

Computer Vision and Pattern Recognition · Computer Science 2019-06-05 Wei Zhang, Bairui Wang, Lin Ma, Wei Liu

State-of-the-art image captioning methods mostly focus on improving visual features; less attention has been paid to utilizing the inherent properties of language to boost captioning performance. In this paper, we show that vocabulary…

Computer Vision and Pattern Recognition · Computer Science 2019-09-02 Lei Ke, Wenjie Pei, Ruiyu Li, Xiaoyong Shen, Yu-Wing Tai

Video captioning, which automatically translates video clips into natural language sentences, is a very important task in computer vision. By virtue of recent deep learning technologies, e.g., convolutional neural networks (CNNs) and…

Computer Vision and Pattern Recognition · Computer Science 2016-11-18 Junbo Wang, Wei Wang, Yan Huang, Liang Wang, Tieniu Tan

It is well believed that video captioning is a fundamental but challenging task in both computer vision and artificial intelligence fields. The prevalent approach is to map an input video to a variable-length output sentence in a sequence…

Computer Vision and Pattern Recognition · Computer Science 2019-05-06 Jingwen Chen, Yingwei Pan, Yehao Li, Ting Yao, Hongyang Chao, Tao Mei

With the rapid growth of video data and the increasing demands of various applications such as intelligent video search and assistance for visually impaired people, the video captioning task has received a lot of attention recently in…

Computer Vision and Pattern Recognition · Computer Science 2019-07-31 Xiangxi Shi, Jianfei Cai, Shafiq Joty, Jiuxiang Gu

Convolutional neural networks (CNNs) have been extensively applied for image recognition problems giving state-of-the-art results on recognition, detection, segmentation and retrieval. In this work we propose and evaluate several deep…

Computer Vision and Pattern Recognition · Computer Science 2015-04-14 Joe Yue-Hei Ng, Matthew Hausknecht, Sudheendra Vijayanarasimhan, Oriol Vinyals, Rajat Monga, George Toderici

The recent advances of deep learning in both computer vision (CV) and natural language processing (NLP) provide us with a new way of understanding semantics, by which we can deal with more challenging tasks such as automatic description…

Computer Vision and Pattern Recognition · Computer Science 2019-02-12 Daouda Sow, Zengchang Qin, Mouhamed Niasse, Tao Wan

The explosion of video data on the internet requires effective and efficient technology to generate captions automatically for people who are not able to watch the videos. Despite the great progress of video captioning research,…

Computer Vision and Pattern Recognition · Computer Science 2018-07-11 Xiangxi Shi, Jianfei Cai, Jiuxiang Gu, Shafiq Joty

Countless learning tasks require dealing with sequential data. Image captioning, speech synthesis, and music generation all require that a model produce outputs that are sequences. In other domains, such as time series prediction, video…

Machine Learning · Computer Science 2015-10-20 Zachary C. Lipton, John Berkowitz, Charles Elkan

As a novel video representation method, Neural Representations for Videos (NeRV) has shown great potential in the fields of video compression, video restoration, and video interpolation. In the process of representing videos using NeRV,…

Computer Vision and Pattern Recognition · Computer Science 2024-07-11 Qingling Chang, Haohui Yu, Shuxuan Fu, Zhiqiang Zeng, Chuangquan Chen

Image Captioning, or the automatic generation of descriptions for images, is one of the core problems in Computer Vision and has seen considerable progress using Deep Learning Techniques. We propose to use Inception-ResNet Convolutional…

Computer Vision and Pattern Recognition · Computer Science 2021-02-23 Sulabh Katiyar, Samir Kumar Borgohain

Recently Convolutional Neural Networks have been proposed for Sequence Modelling tasks such as Image Caption Generation. However, unlike Recurrent Neural Networks, the performance of Convolutional Neural Networks as Decoders for Image…

Computer Vision and Pattern Recognition · Computer Science 2021-03-09 Sulabh Katiyar, Samir Kumar Borgohain

The Convolutional Neural Network (CNN) has been the dominant image feature extractor in computer vision for years. However, it fails to get the relationship between images/objects and their hierarchical interactions which can be helpful for…

Computer Vision and Pattern Recognition · Computer Science 2019-12-05 Zheng-cong Fei