Related papers: Decomposed Vector-Quantized Variational Autoencode…

200 papers

Generating realistic human grasps is crucial yet challenging for object manipulation in computer graphics and robotics. Current methods often struggle to generate detailed and realistic grasps with full finger-object interaction, as they…

Robotics · Computer Science 2025-01-13 Mengshi Qi, Zhe Zhao, Huadong Ma

For a robot to perform complex manipulation tasks, it is necessary for it to have a good grasping ability. However, vision based robotic grasp detection is hindered by the unavailability of sufficient labelled data. Furthermore, the…

Machine Learning · Computer Science 2020-01-31 Mridul Mahajan, Tryambak Bhattacharjee, Arya Krishnan, Priya Shukla, G C Nandi

Hand image synthesis and pose estimation from RGB images are both highly challenging tasks due to the large discrepancy between factors of variation ranging from image background content to camera viewpoint. To better analyze these factors…

Computer Vision and Pattern Recognition · Computer Science 2019-04-29 Linlin Yang, Angela Yao

We propose an algorithm, guided variational autoencoder (Guided-VAE), that is able to learn a controllable generative model by performing latent representation disentanglement learning. The learning objective is achieved by providing…

Computer Vision and Pattern Recognition · Computer Science 2020-04-06 Zheng Ding, Yifan Xu, Weijian Xu, Gaurav Parmar, Yang Yang, Max Welling, Zhuowen Tu

Existing vector quantization (VQ) based autoregressive models follow a two-stage generation paradigm that first learns a codebook to encode images as discrete codes, and then completes generation based on the learned codebook. However, they…

Computer Vision and Pattern Recognition · Computer Science 2023-05-22 Mengqi Huang, Zhendong Mao, Zhuowei Chen, Yongdong Zhang

Variational AutoEncoders (VAEs) provide a means to generate representational latent embeddings. Previous research has highlighted the benefits of achieving representations that are disentangled, particularly for downstream tasks. However,…

Computer Vision and Pattern Recognition · Computer Science 2019-11-18 Matthew J. Vowels, Necati Cihan Camgoz, Richard Bowden

Variational autoencoders (VAEs) are fundamental for generative modeling and image reconstruction, yet their performance often struggles to maintain high fidelity in reconstructions. This study introduces a hybrid model, quantum variational…

Computer Vision and Pattern Recognition · Computer Science 2025-03-10 Farina Riaz, Fakhar Zaman, Hajime Suzuki, Sharif Abuadbba, David Nguyen

Learning deep discrete latent representations offers a promise of better symbolic and summarized abstractions that are more useful to subsequent downstream tasks. Inspired by the seminal Vector Quantized Variational Auto-Encoder (VQ-VAE),…

Machine Learning · Computer Science 2023-06-21 Tung-Long Vuong, Trung Le, He Zhao, Chuanxia Zheng, Mehrtash Harandi, Jianfei Cai, Dinh Phung

Existing video Variational Autoencoders (VAEs) generally overlook the similarity between frame contents, leading to redundant latent modeling. In this paper, we propose decoupled VAE (DeCo-VAE) to achieve compact latent representation.…

Computer Vision and Pattern Recognition · Computer Science 2025-11-19 Xiangchen Yin, Jiahui Yuan, Zhangchi Hu, Wenzhang Sun, Jie Chen, Xiaozhen Qiao, Hao Li, Xiaoyan Sun

We explore the use of Vector Quantized Variational AutoEncoder (VQ-VAE) models for large scale image generation. To this end, we scale and enhance the autoregressive priors used in VQ-VAE to generate synthetic samples of much higher…

Machine Learning · Computer Science 2019-06-04 Ali Razavi, Aaron van den Oord, Oriol Vinyals

We propose a multi-layer variational autoencoder method, which we call HR-VQVAE, that learns hierarchical discrete representations of the data. By utilizing a novel objective function, each layer in HR-VQVAE learns a discrete representation of…

Computer Vision and Pattern Recognition · Computer Science 2022-08-10 Mohammad Adiban, Kalin Stefanov, Sabato Marco Siniscalchi, Giampiero Salvi

Colour controlled image generation and manipulation are of interest to artists and graphic designers. Vector Quantised Variational AutoEncoders (VQ-VAEs) with autoregressive (AR) prior are able to produce high quality images, but lack an…

Computer Vision and Pattern Recognition · Computer Science 2023-05-31 Keerth Rathakumar, David Liebowitz, Christian Walder, Kristen Moore, Salil S. Kanhere

Constructing a compressed latent space through a variational autoencoder (VAE) is the key for efficient 3D diffusion models. This paper introduces COD-VAE that encodes 3D shapes into a COmpact set of 1D latent vectors without sacrificing…

Computer Vision and Pattern Recognition · Computer Science 2025-07-29 In Cho, Youngbeom Yoo, Subin Jeon, Seon Joo Kim

We present a novel unified framework that concurrently tackles recognition and future prediction for human hand pose and action modeling. Previous works generally provide isolated solutions for either recognition or prediction, which not…

Computer Vision and Pattern Recognition · Computer Science 2024-09-10 Yilin Wen, Hao Pan, Takehiko Ohkawa, Lei Yang, Jia Pan, Yoichi Sato, Taku Komura, Wenping Wang

The ability to extract generative parameters from high-dimensional fields of data in an unsupervised manner is a highly desirable yet unrealized goal in computational physics. This work explores the use of variational autoencoders (VAEs)…

Computational Physics · Physics 2021-11-16 Christian Jacobsen, Karthik Duraisamy

Deep convolutional neural networks (CNNs) have proven highly effective for visual recognition, where learning a universal representation from the activations of convolutional layers is a fundamental problem. In this paper, we present Fisher…

Computer Vision and Pattern Recognition · Computer Science 2016-11-30 Zhaofan Qiu, Ting Yao, Tao Mei

Models of human motion commonly focus either on trajectory prediction or action classification but rarely both. The marked heterogeneity and intricate compositionality of human motion render each task vulnerable to the data degradation and…

Computer Vision and Pattern Recognition · Computer Science 2022-06-08 Anthony Bourached, Robert Gray, Xiaodong Guan, Ryan-Rhys Griffiths, Ashwani Jha, Parashkev Nachev

Understanding the structure of complex, nonstationary, high-dimensional time-evolving signals is a central challenge in scientific data analysis. In many domains, such as speech and biomedical signal processing, the ability to learn…

Machine Learning · Computer Science 2026-01-13 Ioannis Ziogas, Aamna Al Shehhi, Ahsan H. Khandoker, Leontios J. Hadjileontiadis

The human face exhibits an inherent hierarchy in its representations (i.e., holistic facial expressions can be encoded via a set of facial action units (AUs) and their intensity). Variational (deep) auto-encoders (VAE) have shown great results…

Computer Vision and Pattern Recognition · Computer Science 2017-08-08 Dieu Linh Tran, Robert Walecki, Ognjen Rudovic, Stefanos Eleftheriadis, Bjørn Schuller, Maja Pantic

Generating realistic 3D Human-Human Interaction (HHI) requires coherent modeling of the physical plausibility of the agents and their interaction semantics. Existing methods compress all motion information into a single latent…

Computer Vision and Pattern Recognition · Computer Science 2026-03-03 Zichen Geng, Zeeshan Hayder, Bo Miao, Jian Liu, Wei Liu, Ajmal Mian