Related papers: Positional Encoding as Spatial Inductive Bias in G…

Toward Spatially Unbiased Generative Models

Recent image generation models show remarkable generation performance. However, they mirror strong location preference in datasets, which we call spatial bias. Therefore, generators render poor samples at unseen locations and scales. We…

Machine Learning · Computer Science 2021-08-04 Jooyoung Choi, Jungbeom Lee, Yonghyun Jeong, Sungroh Yoon

Designing an Encoder for StyleGAN Image Manipulation

Recently, there has been a surge of diverse methods for performing image editing by employing pre-trained unconditional generators. Applying these methods on real images, however, remains a challenge, as it necessarily requires the…

Computer Vision and Pattern Recognition · Computer Science 2021-02-05 Omer Tov, Yuval Alaluf, Yotam Nitzan, Or Patashnik, Daniel Cohen-Or

High-fidelity GAN Inversion with Padding Space

Inverting a Generative Adversarial Network (GAN) facilitates a wide range of image editing tasks using pre-trained generators. Existing methods typically employ the latent space of GANs as the inversion space yet observe the insufficient…

Computer Vision and Pattern Recognition · Computer Science 2022-07-28 Qingyan Bai, Yinghao Xu, Jiapeng Zhu, Weihao Xia, Yujiu Yang, Yujun Shen

Local Padding in Patch-Based GANs for Seamless Infinite-Sized Texture Synthesis

Texture models based on Generative Adversarial Networks (GANs) use zero-padding to implicitly encode positional information of the image features. However, when extending the spatial input to generate images at large sizes, zero-padding can…

Computer Vision and Pattern Recognition · Computer Science 2024-11-08 Alhasan Abdellatif, Ahmed H. Elsheikh, Hannah P. Menke

SinGAN: Learning a Generative Model from a Single Natural Image

We introduce SinGAN, an unconditional generative model that can be learned from a single natural image. Our model is trained to capture the internal distribution of patches within the image, and is then able to generate high quality,…

Computer Vision and Pattern Recognition · Computer Science 2019-09-06 Tamar Rott Shaham, Tali Dekel, Tomer Michaeli

SC2GAN: Rethinking Entanglement by Self-correcting Correlated GAN Space

Generative Adversarial Networks (GANs) can synthesize realistic images, with the learned latent space shown to encode rich semantic information with various interpretable directions. However, due to the unstructured nature of the learned…

Computer Vision and Pattern Recognition · Computer Science 2023-10-11 Zikun Chen, Han Zhao, Parham Aarabi, Ruowei Jiang

An Empirical Study of Generative Models with Encoders

Generative adversarial networks (GANs) are capable of producing high quality image samples. However, unlike variational autoencoders (VAEs), GANs lack encoders that provide the inverse mapping for the generators, i.e., encode images back to…

Machine Learning · Statistics 2018-12-20 Paul K. Rubenstein, Yunpeng Li, Dominik Roblek

AE-StyleGAN: Improved Training of Style-Based Auto-Encoders

StyleGANs have shown impressive results on data generation and manipulation in recent years, thanks to its disentangled style latent space. A lot of efforts have been made in inverting a pretrained generator, where an encoder is trained ad…

Computer Vision and Pattern Recognition · Computer Science 2021-10-19 Ligong Han, Sri Harsha Musunuri, Martin Renqiang Min, Ruijiang Gao, Yu Tian, Dimitris Metaxas

SIGN: Spatial-information Incorporated Generative Network for Generalized Zero-shot Semantic Segmentation

Unlike conventional zero-shot classification, zero-shot semantic segmentation predicts a class label at the pixel level instead of the image level. When solving zero-shot semantic segmentation problems, the need for pixel-level prediction…

Computer Vision and Pattern Recognition · Computer Science 2021-08-31 Jiaxin Cheng, Soumyaroop Nandi, Prem Natarajan, Wael Abd-Almageed

A 2D Semantic-Aware Position Encoding for Vision Transformers

Vision transformers have demonstrated significant advantages in computer vision tasks due to their ability to capture long-range dependencies and contextual relationships through self-attention. However, existing position encoding…

Computer Vision and Pattern Recognition · Computer Science 2025-05-15 Xi Chen, Shiyang Zhou, Muqi Huang, Jiaxu Feng, Yun Xiong, Kun Zhou, Biao Yang, Yuhui Zhang, Huishuai Bao, Sijia Peng, Chuan Li, Feng Shi

Deconstructing Positional Information: From Attention Logits to Training Biases

Positional encodings enable Transformers to incorporate sequential information, yet their theoretical understanding remains limited to two properties: distance attenuation and translation invariance. Because natural language lacks purely…

Machine Learning · Computer Science 2026-02-11 Zihan Gu, Ruoyu Chen, Han Zhang, Hua Zhang, Yue Hu

Style Intervention: How to Achieve Spatial Disentanglement with Style-based Generators?

Generative Adversarial Networks (GANs) with style-based generators (e.g. StyleGAN) successfully enable semantic control over image synthesis, and recent studies have also revealed that interpretable image translations could be obtained by…

Computer Vision and Pattern Recognition · Computer Science 2020-11-20 Yunfan Liu, Qi Li, Zhenan Sun, Tieniu Tan

Spatial Latent Representations in Generative Adversarial Networks for Image Generation

In the majority of GAN architectures, the latent space is defined as a set of vectors of given dimensionality. Such representations are not easily interpretable and do not capture spatial information of image content directly. In this work,…

Computer Vision and Pattern Recognition · Computer Science 2023-03-28 Maciej Sypetkowski

InvGAN: Invertible GANs

Generation of photo-realistic images, semantic editing and representation learning are a few of many potential applications of high resolution generative models. Recent progress in GANs have established them as an excellent choice for such…

Computer Vision and Pattern Recognition · Computer Science 2021-12-13 Partha Ghosh, Dominik Zietlow, Michael J. Black, Larry S. Davis, Xiaochen Hu

Spatial Frequency Bias in Convolutional Generative Adversarial Networks

As the success of Generative Adversarial Networks (GANs) on natural images quickly propels them into various real-life applications across different domains, it becomes more and more important to clearly understand their limitations.…

Machine Learning · Computer Science 2020-12-21 Mahyar Khayatkhoei, Ahmed Elgammal

What DINO saw: ALiBi positional encoding reduces positional bias in Vision Transformers

Vision transformers (ViTs) - especially feature foundation models like DINOv2 - learn rich representations useful for many downstream tasks. However, architectural choices (such as positional encoding) can lead to these models displaying…

Computer Vision and Pattern Recognition · Computer Science 2026-03-18 Moritz Pawlowsky, Antonis Vamvakeros, Alexander Weiss, Anja Bielefeld, Samuel J. Cooper, Ronan Docherty

Feature-Style Encoder for Style-Based GAN Inversion

We propose a novel architecture for GAN inversion, which we call Feature-Style encoder. The style encoder is key for the manipulation of the obtained latent codes, while the feature encoder is crucial for optimal image reconstruction. Our…

Computer Vision and Pattern Recognition · Computer Science 2022-02-07 Xu Yao, Alasdair Newson, Yann Gousseau, Pierre Hellier

Dual Encoder GAN Inversion for High-Fidelity 3D Head Reconstruction from Single Images

3D GAN inversion aims to project a single image into the latent space of a 3D Generative Adversarial Network (GAN), thereby achieving 3D geometry reconstruction. While there exist encoders that achieve good results in 3D GAN inversion, they…

Computer Vision and Pattern Recognition · Computer Science 2024-10-01 Bahri Batuhan Bilecen, Ahmet Berke Gokmen, Aysegul Dundar

Spatially-Adaptive Hash Encodings For Neural Surface Reconstruction

Positional encodings are a common component of neural scene reconstruction methods, and provide a way to bias the learning of neural fields towards coarser or finer representations. Current neural surface reconstruction methods use a…

Computer Vision and Pattern Recognition · Computer Science 2024-12-09 Thomas Walker, Octave Mariotti, Amir Vaxman, Hakan Bilen

Training-free Spatially Grounded Geometric Shape Encoding (Technical Report)

Positional encoding has become the de facto standard for grounding deep neural networks on discrete point-wise positions, and it has achieved remarkable success in tasks where the input can be represented as a one-dimensional sequence.…

Computer Vision and Pattern Recognition · Computer Science 2026-05-12 Yuhang He