Related papers: Multiresolution Convolutional Autoencoders
In this paper, we present a novel neuroevolutionary method to identify the architecture and hyperparameters of convolutional autoencoders. Remarkably, we used a hypervolume indicator in the context of neural architecture search for…
A data-driven framework is proposed towards the end of predictive modeling of complex spatio-temporal dynamics, leveraging nested non-linear manifolds. Three levels of neural networks are used, with the goal of predicting the future state…
Building scalable models to learn from diverse, multimodal data remains an open challenge. For vision-language data, the dominant approaches are based on contrastive learning objectives that train a separate encoder for each modality. While…
Temporal modeling in complex systems requires capturing dependencies across multiple time scales while managing inherent uncertainties. We propose HierCVAE, a novel architecture that integrates hierarchical attention mechanisms with…
In this paper, we propose Multiresolution Equivariant Graph Variational Autoencoders (MGVAE), the first hierarchical generative model to learn and generate graphs in a multiresolution and equivariant manner. At each resolution level, MGVAE…
Broadcast and media organizations increasingly rely on artificial intelligence to automate the labor-intensive processes of content indexing, tagging, and metadata generation. However, existing AI systems typically operate on a single…
We propose the Motion Capsule Autoencoder (MCAE), which addresses a key challenge in the unsupervised learning of motion representations: transformation invariance. MCAE models motion in a two-level hierarchy. In the lower level, a…
Masked Autoencoders (MAE) have shown great potentials in self-supervised pre-training for language and 2D image transformers. However, it still remains an open question on how to exploit masked autoencoding for learning 3D representations…
Most of the existing multi-relational network embedding methods, e.g., TransE, are formulated to preserve pair-wise connectivity structures in the networks. With the observations that significant triangular connectivity structures and…
Medical images are acquired at high resolutions with large fields of view in order to capture fine-grained features necessary for clinical decision-making. Consequently, training deep learning models on medical images can incur large…
This paper shows that masked autoencoders (MAE) are scalable self-supervised learners for computer vision. Our MAE approach is simple: we mask random patches of the input image and reconstruct the missing pixels. It is based on two core…
Autoencoders have emerged as powerful models for visualization and dimensionality reduction based on the fundamental assumption that high-dimensional data is generated from a low-dimensional manifold. A critical challenge in autoencoder…
Videos captured from multiple viewpoints can help in perceiving the 3D structure of the world and benefit computer vision tasks such as action recognition, tracking, etc. In this paper, we present a method for self-supervised learning from…
Recurrent Neural Networks (RNN) received a vast amount of attention last decade. Recently, the architectures of Recurrent AutoEncoders (RAE) found many applications in practice. RAE can extract the semantically valuable information, called…
We demonstrate the efficiencies and explanatory abilities of extensions to the common tools of Autoencoders and LLM interpreters, in the novel context of comparing different cultural approaches to the same international news event. We…
We propose a MultiScale AutoEncoder(MSAE) based extreme image compression framework to offer visually pleasing reconstruction at a very low bitrate. Our method leverages the "priors" at different resolution scale to improve the compression…
In this paper we explore the effect of architectural choices on learning a Variational Autoencoder (VAE) for text generation. In contrast to the previously introduced VAE model for text where both the encoder and decoder are RNNs, we…
In this study, we develop a novel multi-fidelity deep learning approach that transforms low-fidelity solution maps into high-fidelity ones by incorporating parametric space information into a standard autoencoder architecture. This method's…
Image compression has been investigated as a fundamental research topic for many decades. Recently, deep learning has achieved great success in many computer vision tasks, and is gradually being used in image compression. In this paper, we…
The omnipresence of deep learning architectures such as deep convolutional neural networks (CNN)s is fueled by the synergistic combination of ever-increasing labeled datasets and specialized hardware. Despite the indisputable success, the…