Related papers: Decomposed Vector-Quantized Variational Autoencode…

Human Grasp Generation for Rigid and Deformable Objects with Decomposed VQ-VAE

Generating realistic human grasps is crucial yet challenging for object manipulation in computer graphics and robotics. Current methods often struggle to generate detailed and realistic grasps with full finger-object interaction, as they…

Robotics · Computer Science 2025-01-13 Mengshi Qi, Zhe Zhao, Huadong Ma

Semi-supervised Grasp Detection by Representation Learning in a Vector Quantized Latent Space

For a robot to perform complex manipulation tasks, it is necessary for it to have a good grasping ability. However, vision based robotic grasp detection is hindered by the unavailability of sufficient labelled data. Furthermore, the…

Machine Learning · Computer Science 2020-01-31 Mridul Mahajan, Tryambak Bhattacharjee, Arya Krishnan, Priya Shukla, G C Nandi

Disentangling Latent Hands for Image Synthesis and Pose Estimation

Hand image synthesis and pose estimation from RGB images are both highly challenging tasks due to the large discrepancy between factors of variation ranging from image background content to camera viewpoint. To better analyze these factors…

Computer Vision and Pattern Recognition · Computer Science 2019-04-29 Linlin Yang, Angela Yao

Guided Variational Autoencoder for Disentanglement Learning

We propose an algorithm, guided variational autoencoder (Guided-VAE), that is able to learn a controllable generative model by performing latent representation disentanglement learning. The learning objective is achieved by providing…

Computer Vision and Pattern Recognition · Computer Science 2020-04-06 Zheng Ding, Yifan Xu, Weijian Xu, Gaurav Parmar, Yang Yang, Max Welling, Zhuowen Tu

Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization

Existing vector quantization (VQ) based autoregressive models follow a two-stage generation paradigm that first learns a codebook to encode images as discrete codes, and then completes generation based on the learned codebook. However, they…

Computer Vision and Pattern Recognition · Computer Science 2023-05-22 Mengqi Huang, Zhendong Mao, Zhuowei Chen, Yongdong Zhang

Gated Variational AutoEncoders: Incorporating Weak Supervision to Encourage Disentanglement

Variational AutoEncoders (VAEs) provide a means to generate representational latent embeddings. Previous research has highlighted the benefits of achieving representations that are disentangled, particularly for downstream tasks. However,…

Computer Vision and Pattern Recognition · Computer Science 2019-11-18 Matthew J. Vowels, Necati Cihan Camgoz, Richard Bowden

Quantum Down Sampling Filter for Variational Auto-encoder

Variational autoencoders (VAEs) are fundamental for generative modeling and image reconstruction, yet their performance often struggles to maintain high fidelity in reconstructions. This study introduces a hybrid model, quantum variational…

Computer Vision and Pattern Recognition · Computer Science 2025-03-10 Farina Riaz, Fakhar Zaman, Hajime Suzuki, Sharif Abuadbba, David Nguyen

Vector Quantized Wasserstein Auto-Encoder

Learning deep discrete latent representations offers a promise of better symbolic and summarized abstractions that are more useful to subsequent downstream tasks. Inspired by the seminal Vector Quantized Variational Auto-Encoder (VQ-VAE),…

Machine Learning · Computer Science 2023-06-21 Tung-Long Vuong, Trung Le, He Zhao, Chuanxia Zheng, Mehrtash Harandi, Jianfei Cai, Dinh Phung

DeCo-VAE: Learning Compact Latents for Video Reconstruction via Decoupled Representation

Existing video Variational Autoencoders (VAEs) generally overlook the similarity between frame contents, leading to redundant latent modeling. In this paper, we propose decoupled VAE (DeCo-VAE) to achieve compact latent representation.…

Computer Vision and Pattern Recognition · Computer Science 2025-11-19 Xiangchen Yin, Jiahui Yuan, Zhangchi Hu, Wenzhang Sun, Jie Chen, Xiaozhen Qiao, Hao Li, Xiaoyan Sun

Generating Diverse High-Fidelity Images with VQ-VAE-2

We explore the use of Vector Quantized Variational AutoEncoder (VQ-VAE) models for large scale image generation. To this end, we scale and enhance the autoregressive priors used in VQ-VAE to generate synthetic samples of much higher…

Machine Learning · Computer Science 2019-06-04 Ali Razavi, Aaron van den Oord, Oriol Vinyals

Hierarchical Residual Learning Based Vector Quantized Variational Autoencoder for Image Reconstruction and Generation

We propose a multi-layer variational autoencoder method, we call HR-VQVAE, that learns hierarchical discrete representations of the data. By utilizing a novel objective function, each layer in HR-VQVAE learns a discrete representation of…

Computer Vision and Pattern Recognition · Computer Science 2022-08-10 Mohammad Adiban, Kalin Stefanov, Sabato Marco Siniscalchi, Giampiero Salvi

DualVAE: Controlling Colours of Generated and Real Images

Colour controlled image generation and manipulation are of interest to artists and graphic designers. Vector Quantised Variational AutoEncoders (VQ-VAEs) with autoregressive (AR) prior are able to produce high quality images, but lack an…

Computer Vision and Pattern Recognition · Computer Science 2023-05-31 Keerth Rathakumar, David Liebowitz, Christian Walder, Kristen Moore, Salil S. Kanhere

Representing 3D Shapes With 64 Latent Vectors for 3D Diffusion Models

Constructing a compressed latent space through a variational autoencoder (VAE) is the key for efficient 3D diffusion models. This paper introduces COD-VAE that encodes 3D shapes into a COmpact set of 1D latent vectors without sacrificing…

Computer Vision and Pattern Recognition · Computer Science 2025-07-29 In Cho, Youngbeom Yoo, Subin Jeon, Seon Joo Kim

Generative Hierarchical Temporal Transformer for Hand Pose and Action Modeling

We present a novel unified framework that concurrently tackles recognition and future prediction for human hand pose and action modeling. Previous works generally provide isolated solutions for either recognition or prediction, which not…

Computer Vision and Pattern Recognition · Computer Science 2024-09-10 Yilin Wen, Hao Pan, Takehiko Ohkawa, Lei Yang, Jia Pan, Yoichi Sato, Taku Komura, Wenping Wang

Disentangling Generative Factors of Physical Fields Using Variational Autoencoders

The ability to extract generative parameters from high-dimensional fields of data in an unsupervised manner is a highly desirable yet unrealized goal in computational physics. This work explores the use of variational autoencoders (VAEs)…

Computational Physics · Physics 2021-11-16 Christian Jacobsen, Karthik Duraisamy

Deep Quantization: Encoding Convolutional Activations with Deep Generative Model

Deep convolutional neural networks (CNNs) have proven highly effective for visual recognition, where learning a universal representation from activations of convolutional layers is a fundamental problem. In this paper, we present Fisher…

Computer Vision and Pattern Recognition · Computer Science 2016-11-30 Zhaofan Qiu, Ting Yao, Tao Mei

Hierarchical Graph-Convolutional Variational AutoEncoding for Generative Modelling of Human Motion

Models of human motion commonly focus either on trajectory prediction or action classification but rarely both. The marked heterogeneity and intricate compositionality of human motion render each task vulnerable to the data degradation and…

Computer Vision and Pattern Recognition · Computer Science 2022-06-08 Anthony Bourached, Robert Gray, Xiaodong Guan, Ryan-Rhys Griffiths, Ashwani Jha, Parashkev Nachev

Variational decomposition autoencoding improves disentanglement of latent representations

Understanding the structure of complex, nonstationary, high-dimensional time-evolving signals is a central challenge in scientific data analysis. In many domains, such as speech and biomedical signal processing, the ability to learn…

Machine Learning · Computer Science 2026-01-13 Ioannis Ziogas, Aamna Al Shehhi, Ahsan H. Khandoker, Leontios J. Hadjileontiadis

DeepCoder: Semi-parametric Variational Autoencoders for Automatic Facial Action Coding

Human face exhibits an inherent hierarchy in its representations (i.e., holistic facial expressions can be encoded via a set of facial action units (AUs) and their intensity). Variational (deep) auto-encoders (VAE) have shown great results…

Computer Vision and Pattern Recognition · Computer Science 2017-08-08 Dieu Linh Tran, Robert Walecki, Ognjen Rudovic, Stefanos Eleftheriadis, Bjørn Schuller, Maja Pantic

Disentangled Hierarchical VAE for 3D Human-Human Interaction Generation

Generating realistic 3D Human-Human Interaction (HHI) requires coherent modeling of the physical plausibility of the agents and their interaction semantics. Existing methods compress all motion information into a single latent…

Computer Vision and Pattern Recognition · Computer Science 2026-03-03 Zichen Geng, Zeeshan Hayder, Bo Miao, Jian Liu, Wei Liu, Ajmal Mian