Related papers: Temporal Convolution Networks with Positional Enco…
In this paper we describe a solution to our entry for the emotion recognition challenge EmotiW 2017. We propose an ensemble of several models, which capture spatial and audio features from videos. Spatial features are captured by…
Human emotion recognition holds a pivotal role in facilitating seamless human-computer interaction. This paper delineates our methodology in tackling the Valence-Arousal (VA) Estimation Challenge, Expression (Expr) Classification Challenge,…
Facial expressions are one of the most powerful ways for depicting specific patterns in human behavior and describing human emotional state. Despite the impressive advances of affective computing over the last decade, automatic video-based…
This study reports an unintuitive finding that positional encoding enhances learning of recurrent neural networks (RNNs). Positional encoding is a high-dimensional representation of time indices on input data. Most famously, positional…
We propose an off-line approach to explicitly encode temporal patterns spatially as different types of images, namely, Gramian Angular Fields and Markov Transition Fields. This enables the use of techniques from computer vision for feature…
Videos can evoke a range of affective responses in viewers. The ability to predict evoked affect from a video, before viewers watch the video, can help in content creation and video recommendation. We introduce the Evoked Expressions from…
It is well believed that video captioning is a fundamental but challenging task in both computer vision and artificial intelligence fields. The prevalent approach is to map an input video to a variable-length output sentence in a sequence…
Recent development in deep learning techniques has attracted attention in decoding and classification in EEG signals. Despite several efforts utilizing different features of EEG signals, a significant research challenge is to use…
Both the temporal dynamics and spatial correlations of Electroencephalogram (EEG), which contain discriminative emotion information, are essential for the emotion recognition. However, some redundant information within the EEG signals would…
Recognising previously visited locations is an important, but unsolved, task in autonomous navigation. Current visual place recognition (VPR) benchmarks typically challenge models to recover the position of a query image (or images) from…
Recently, deep learning approach, especially deep Convolutional Neural Networks (ConvNets), have achieved overwhelming accuracy with fast processing speed for image classification. Incorporating temporal structure with deep ConvNets for…
We consider the task of dimensional emotion recognition on video data using deep learning. While several previous methods have shown the benefits of training temporal neural network models such as recurrent neural networks (RNNs) on…
One of the challenges in modeling cognitive events from electroencephalogram (EEG) data is finding representations that are invariant to inter- and intra-subject differences, as well as to inherent noise associated with such data. Herein,…
In this work we tackle the task of video-based visual emotion recognition in the wild. Standard methodologies that rely solely on the extraction of bodily and facial features often fall short of accurate emotion prediction in cases where…
We propose a novel recurrent encoder-decoder network model for real-time video-based face alignment. Our proposed model predicts 2D facial point maps regularized by a regression loss, while uniquely exploiting recurrent learning at both…
In this paper, we address the intricate challenge of gaze vector prediction, a pivotal task with applications ranging from human-computer interaction to driver monitoring systems. Our innovative approach is designed for the demanding…
Deep neural networks have achieved remarkable success for video-based action recognition. However, most of existing approaches cannot be deployed in practice due to the high computational cost. To address this challenge, we propose a new…
Representation learning on static graph-structured data has shown a significant impact on many real-world applications. However, less attention has been paid to the evolving nature of temporal networks, in which the edges are often changing…
Anticipating human actions is an important task that needs to be addressed for the development of reliable intelligent agents, such as self-driving cars or robot assistants. While the ability to make future predictions with high accuracy is…
Multivariate time series forecasting enables the prediction of future states by leveraging historical data, thereby facilitating decision-making processes. Each data node in a multivariate time series encompasses a sequence of multiple…