Related papers: Swapping Autoencoder for Deep Image Manipulation

Modelling nonlinear dependencies in the latent space of inverse scattering

The problem of inverse scattering, proposed by Angles and Mallat in 2018, concerns training a deep neural network to invert the scattering transform applied to an image. After such a network is trained, it can be used as a generative model…

Computer Vision and Pattern Recognition · Computer Science 2022-03-22 Juliusz Ziomek, Katayoun Farrahi

Entroformer: A Transformer-based Entropy Model for Learned Image Compression

One critical component in lossy deep image compression is the entropy model, which predicts the probability distribution of the quantized latent representation in the encoding and decoding modules. Previous works build entropy models upon…

Image and Video Processing · Electrical Eng. & Systems 2023-03-16 Yichen Qian, Ming Lin, Xiuyu Sun, Zhiyu Tan, Rong Jin

Efficient Latent Representations using Multiple Tasks for Autonomous Driving

Driving in the dynamic, multi-agent, and complex urban environment is a difficult task requiring a complex decision policy. The learning of such a policy requires a state representation that can encode the entire environment. Mid-level…

Robotics · Computer Science 2020-03-03 Eshagh Kargar, Ville Kyrki

Free-Form Image Inpainting via Contrastive Attention Network

Most deep learning based image inpainting approaches adopt an autoencoder or its variants to fill missing regions in images. Encoders are usually utilized to learn powerful representational spaces, which are important for dealing with…

Computer Vision and Pattern Recognition · Computer Science 2020-10-30 Xin Ma, Xiaoqiang Zhou, Huaibo Huang, Zhenhua Chai, Xiaolin Wei, Ran He

The Diffusion Encoder

We construct a new kind of encoder, leveraging the expressive power of diffusion models. In a traditional variational autoencoder, the encoder and decoder jointly negotiate a latent representation of the input. This is made possible by the…

Machine Learning · Computer Science 2026-05-14 Akhil Premkumar, Sarah Lucioni

Discriminative Autoencoder for Feature Extraction: Application to Character Recognition

Conventionally, autoencoders are unsupervised representation learning tools. In this work, we propose a novel discriminative autoencoder. Use of supervised discriminative learning ensures that the learned representation is robust to…

Computer Vision and Pattern Recognition · Computer Science 2019-12-30 Anupriya Gogna, Angshul Majumdar

Processing Simple Geometric Attributes with Autoencoders

Image synthesis is a core problem in modern deep learning, and many recent architectures such as autoencoders and generative adversarial networks produce spectacular results on highly complex data, such as images of faces or landscapes.…

Computer Vision and Pattern Recognition · Computer Science 2019-04-16 Alasdair Newson, Andrés Almansa, Yann Gousseau, Saïd Ladjal

Detecting AutoEncoder is Enough to Catch LDM Generated Images

In recent years, diffusion models have become one of the main methods for generating images. However, detecting images generated by these models remains a challenging task. This paper proposes a novel method for detecting images generated…

Computer Vision and Pattern Recognition · Computer Science 2024-11-12 Dmitry Vesnin, Dmitry Levshun, Andrey Chechulin

Grammar Variational Autoencoder

Deep generative models have been wildly successful at learning coherent latent representations for continuous data such as video and audio. However, generative modeling of discrete data such as arithmetic expressions and molecular…

Machine Learning · Statistics 2017-03-07 Matt J. Kusner, Brooks Paige, José Miguel Hernández-Lobato

Image Shape Manipulation from a Single Augmented Training Sample

In this paper, we present DeepSIM, a generative model for conditional image manipulation based on a single image. We find that extensive augmentation is key for enabling single image training, and incorporate the use of thin-plate-spline…

Computer Vision and Pattern Recognition · Computer Science 2021-11-29 Yael Vinker, Eliahu Horwitz, Nir Zabari, Yedid Hoshen

Diverse Semantic Image Editing with Style Codes

Semantic image editing requires inpainting pixels following a semantic map. It is a challenging task since this inpainting requires both harmony with the context and strict compliance with the semantic maps. The majority of the previous…

Computer Vision and Pattern Recognition · Computer Science 2023-09-26 Hakan Sivuk, Aysegul Dundar

An End-to-End Block Autoencoder For Physical Layer Based On Neural Networks

Deep Learning has been widely applied in the area of image processing and natural language processing. In this paper, we propose an end-to-end communication structure based on autoencoder where the transceiver can be optimized jointly. A…

Information Theory · Computer Science 2019-06-18 Tianjie Mu, Xiaohui Chen, Li Chen, Huarui Yin, Weidong Wang

Deep Generation of Face Images from Sketches

Recent deep image-to-image translation techniques allow fast generation of face images from freehand sketches. However, existing solutions tend to overfit to sketches, thus requiring professional sketches or even edge maps as input. To…

Graphics · Computer Science 2020-06-09 Shu-Yu Chen, Wanchao Su, Lin Gao, Shihong Xia, Hongbo Fu

A Variational Graph Autoencoder for Manipulation Action Recognition and Prediction

Despite decades of research, understanding human manipulation activities is, and has always been, one of the most attractive and challenging research topics in computer vision and robotics. Recognition and prediction of observed human…

Computer Vision and Pattern Recognition · Computer Science 2021-10-27 Gamze Akyol, Sanem Sariel, Eren Erdal Aksoy

Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think

The field of advanced text-to-image generation is witnessing the emergence of unified frameworks that integrate powerful text encoders, such as CLIP and T5, with Diffusion Transformer backbones. Although there have been efforts to control…

Computer Vision and Pattern Recognition · Computer Science 2025-02-28 Liang Chen, Shuai Bai, Wenhao Chai, Weichu Xie, Haozhe Zhao, Leon Vinci, Junyang Lin, Baobao Chang

Unsupervised Representation Learning with Laplacian Pyramid Auto-encoders

Scale-space representation has been popular in the computer vision community due to its theoretical foundation. The motivation for generating a scale-space representation of a given data set originates from the basic observation that real-world…

Computer Vision and Pattern Recognition · Computer Science 2018-05-15 Qilu Zhao, Zongmin Li

Decoupling Global and Local Representations via Invertible Generative Flows

In this work, we propose a new generative model that is capable of automatically decoupling global and local representations of images in an entirely unsupervised setting, by embedding a generative flow in the VAE framework to model the…

Computer Vision and Pattern Recognition · Computer Science 2021-03-17 Xuezhe Ma, Xiang Kong, Shanghang Zhang, Eduard Hovy

Auto-encoding Molecules: Graph-Matching Capabilities Matter

Autoencoders are effective deep learning models that can function as generative models and learn latent representations for downstream tasks. The use of graph autoencoders - with both encoder and decoder implemented as message passing…

Machine Learning · Computer Science 2025-03-04 Magnus Cunow, Gerrit Großmann

Real-time Virtual-Try-On from a Single Example Image through Deep Inverse Graphics and Learned Differentiable Renderers

Augmented reality applications have rapidly spread across online platforms, allowing consumers to virtually try on a variety of products, such as makeup, hair dyeing, or shoes. However, parametrizing a renderer to synthesize realistic images…

Computer Vision and Pattern Recognition · Computer Science 2022-05-16 Robin Kips, Ruowei Jiang, Sileye Ba, Brendan Duke, Matthieu Perrot, Pietro Gori, Isabelle Bloch