Related papers: Video prediction using score-based conditional den…

An Uncertain Future: Forecasting from Static Images using Variational Autoencoders

In a given scene, humans can often easily predict a set of immediate future events that might happen. However, generalized pixel-level anticipation in computer vision systems is difficult because machine learning struggles with the…

Computer Vision and Pattern Recognition · Computer Science 2016-06-28 Jacob Walker , Carl Doersch , Abhinav Gupta , Martial Hebert

Feature-guided score diffusion for sampling conditional densities

Score diffusion methods can learn probability densities from samples. The score of the noise-corrupted density is estimated using a deep neural network, which is then used to iteratively transport a Gaussian white noise density to a target…

Computer Vision and Pattern Recognition · Computer Science 2024-10-16 Zahra Kadkhodaie , Stéphane Mallat , Eero P. Simoncelli

Video Scene Parsing with Predictive Feature Learning

In this work, we address the challenging video scene parsing problem by developing effective representation learning methods given limited parsing annotations. In particular, we contribute two novel methods that constitute a unified parsing…

Computer Vision and Pattern Recognition · Computer Science 2016-12-14 Xiaojie Jin , Xin Li , Huaxin Xiao , Xiaohui Shen , Zhe Lin , Jimei Yang , Yunpeng Chen , Jian Dong , Luoqi Liu , Zequn Jie , Jiashi Feng , Shuicheng Yan

Predictive-Corrective Networks for Action Detection

While deep feature learning has revolutionized techniques for static-image understanding, the same does not quite hold for video processing. Architectures and optimization techniques used for video are largely based off those for static…

Computer Vision and Pattern Recognition · Computer Science 2017-12-13 Achal Dave , Olga Russakovsky , Deva Ramanan

Deep Predictive Coding Networks for Video Prediction and Unsupervised Learning

While great strides have been made in using deep learning algorithms to solve supervised learning tasks, the problem of unsupervised learning - leveraging unlabeled examples to learn about the structure of a domain - remains a difficult…

Machine Learning · Computer Science 2017-03-02 William Lotter , Gabriel Kreiman , David Cox

Fourier-based Video Prediction through Relational Object Motion

The ability to predict future outcomes conditioned on observed video frames is crucial for intelligent decision-making in autonomous systems. Recently, deep recurrent architectures have been applied to the task of video prediction. However,…

Computer Vision and Pattern Recognition · Computer Science 2021-10-13 Malte Mosbach , Sven Behnke

Photo-Realistic Video Prediction on Natural Videos of Largely Changing Frames

Recent advances in deep learning have significantly improved performance of video prediction. However, state-of-the-art methods still suffer from blurriness and distortions in their future predictions, especially when there are large…

Computer Vision and Pattern Recognition · Computer Science 2020-03-20 Osamu Shouno

On the difficulty of learning and predicting the long-term dynamics of bouncing objects

The ability to accurately predict the surrounding environment is a foundational principle of intelligence in biological and artificial agents. In recent years, a variety of approaches have been proposed for learning to predict the physical…

Computer Vision and Pattern Recognition · Computer Science 2019-08-01 Alberto Cenzato , Alberto Testolin , Marco Zorzi

Efficient Video Segmentation Models with Per-frame Inference

Most existing real-time deep models trained with each frame independently may produce inconsistent results across the temporal axis when tested on a video sequence. A few methods take the correlations in the video sequence into…

Computer Vision and Pattern Recognition · Computer Science 2022-02-28 Yifan Liu , Chunhua Shen , Changqian Yu , Jingdong Wang

Predicting Long-horizon Futures by Conditioning on Geometry and Time

Our work explores the task of generating future sensor observations conditioned on the past. We are motivated by `predictive coding' concepts from neuroscience as well as robotic applications such as self-driving vehicles. Predictive video…

Computer Vision and Pattern Recognition · Computer Science 2024-04-18 Tarasha Khurana , Deva Ramanan

A Review on Deep Learning Techniques for Video Prediction

The ability to predict, anticipate and reason about future outcomes is a key component of intelligent decision-making systems. In light of the success of deep learning in computer vision, deep-learning-based video prediction emerged as a…

Computer Vision and Pattern Recognition · Computer Science 2021-04-22 Sergiu Oprea , Pablo Martinez-Gonzalez , Alberto Garcia-Garcia , John Alejandro Castro-Vargas , Sergio Orts-Escolano , Jose Garcia-Rodriguez , Antonis Argyros

Streamlined Dense Video Captioning

Dense video captioning is an extremely challenging task since accurate and coherent description of events in a video requires holistic understanding of video contents as well as contextual reasoning of individual events. Most existing…

Computer Vision and Pattern Recognition · Computer Science 2019-04-09 Jonghwan Mun , Linjie Yang , Zhou Ren , Ning Xu , Bohyung Han

Compositional Scene Representation Learning via Reconstruction: A Survey

Visual scenes are composed of visual concepts and have the property of combinatorial explosion. An important reason for humans to efficiently learn from diverse visual scenes is the ability of compositional perception, and it is desirable…

Machine Learning · Computer Science 2023-06-16 Jinyang Yuan , Tonglin Chen , Bin Li , Xiangyang Xue

Pixel Recurrent Neural Networks

Modeling the distribution of natural images is a landmark problem in unsupervised learning. This task requires an image model that is at once expressive, tractable and scalable. We present a deep neural network that sequentially predicts…

Computer Vision and Pattern Recognition · Computer Science 2016-08-22 Aaron van den Oord , Nal Kalchbrenner , Koray Kavukcuoglu

Visual Representation Learning with Stochastic Frame Prediction

Self-supervised learning of image representations by predicting future frames is a promising direction but still remains a challenge. This is because of the under-determined nature of frame prediction; multiple potential futures can arise…

Computer Vision and Pattern Recognition · Computer Science 2024-08-12 Huiwon Jang , Dongyoung Kim , Junsu Kim , Jinwoo Shin , Pieter Abbeel , Younggyo Seo

Consistent Generative Query Networks

Stochastic video prediction models take in a sequence of image frames, and generate a sequence of consecutive future image frames. These models typically generate future frames in an autoregressive fashion, which is slow and requires the…

Computer Vision and Pattern Recognition · Computer Science 2019-04-23 Ananya Kumar , S. M. Ali Eslami , Danilo J. Rezende , Marta Garnelo , Fabio Viola , Edward Lockhart , Murray Shanahan

Long-Term Image Boundary Prediction

Boundary estimation in images and videos has been a very active topic of research, and organizing visual information into boundaries and segments is believed to be a corner stone of visual perception. While prior work has focused on…

Computer Vision and Pattern Recognition · Computer Science 2017-11-27 Apratim Bhattacharyya , Mateusz Malinowski , Bernt Schiele , Mario Fritz

Learning Blind Video Temporal Consistency

Applying image processing algorithms independently to each frame of a video often leads to undesired inconsistent results over time. Developing temporally consistent video-based extensions, however, requires domain knowledge for individual…

Computer Vision and Pattern Recognition · Computer Science 2018-08-02 Wei-Sheng Lai , Jia-Bin Huang , Oliver Wang , Eli Shechtman , Ersin Yumer , Ming-Hsuan Yang

Learning Robust Object Recognition Using Composed Scenes from Generative Models

Recurrent feedback connections in the mammalian visual system have been hypothesized to play a role in synthesizing input in the theoretical framework of analysis by synthesis. The comparison of internally synthesized representation with…

Computer Vision and Pattern Recognition · Computer Science 2017-05-23 Hao Wang , Xingyu Lin , Yimeng Zhang , Tai Sing Lee

Density Ratio Estimation with Conditional Probability Paths

Density ratio estimation in high dimensions can be reframed as integrating a certain quantity, the time score, over probability paths which interpolate between the two densities. In practice, the time score has to be estimated based on…

Machine Learning · Computer Science 2025-06-13 Hanlin Yu , Arto Klami , Aapo Hyvärinen , Anna Korba , Omar Chehab