Related papers: Variable Length Embeddings

Variational Autoencoder with Learned Latent Structure

The manifold hypothesis states that high-dimensional data can be modeled as lying on or near a low-dimensional, nonlinear manifold. Variational Autoencoders (VAEs) approximate this manifold by learning mappings from low-dimensional latent…

Machine Learning · Statistics 2021-03-03 Marissa C. Connor, Gregory H. Canal, Christopher J. Rozell

Variational Lossy Autoencoder

Representation learning seeks to expose certain aspects of observed data in a learned representation that's amenable to downstream tasks like classification. For instance, a good representation for 2D images might be one that describes only…

Machine Learning · Computer Science 2017-03-07 Xi Chen, Diederik P. Kingma, Tim Salimans, Yan Duan, Prafulla Dhariwal, John Schulman, Ilya Sutskever, Pieter Abbeel

Improving Textual Network Learning with Variational Homophilic Embeddings

The performance of many network learning applications crucially hinges on the success of network embedding algorithms, which aim to encode rich network information into low-dimensional vertex-based vector representations. This paper…

Machine Learning · Computer Science 2019-10-01 Wenlin Wang, Chenyang Tao, Zhe Gan, Guoyin Wang, Liqun Chen, Xinyuan Zhang, Ruiyi Zhang, Qian Yang, Ricardo Henao, Lawrence Carin

Training VAEs Under Structured Residuals

Variational auto-encoders (VAEs) are a popular and powerful deep generative model. Previous works on VAEs have assumed a factorized likelihood model, whereby the output uncertainty of each pixel is assumed to be independent. This…

Machine Learning · Statistics 2026-05-14 Garoe Dorta, Sara Vicente, Lourdes Agapito, Neill D. F. Campbell, Ivor Simpson

Life-Long Disentangled Representation Learning with Cross-Domain Latent Homologies

Intelligent behaviour in the real world requires the ability to acquire new knowledge from an ongoing sequence of experiences while preserving and reusing past knowledge. We propose a novel algorithm for unsupervised representation learning…

Machine Learning · Computer Science 2018-08-21 Alessandro Achille, Tom Eccles, Loic Matthey, Christopher P. Burgess, Nick Watters, Alexander Lerchner, Irina Higgins

Wavelet-based Variational Autoencoders for High-Resolution Image Generation

Variational Autoencoders (VAEs) are powerful generative models capable of learning compact latent representations. However, conventional VAEs often generate relatively blurry images due to their assumption of an isotropic Gaussian latent…

Computer Vision and Pattern Recognition · Computer Science 2025-04-21 Andrew Kiruluta

Wavelets to the Rescue: Improving Sample Quality of Latent Variable Deep Generative Models

Variational Autoencoders (VAEs) are probabilistic deep generative models underpinned by elegant theory, stable training processes, and meaningful manifold representations. However, they produce blurry images due to a lack of explicit…

Computer Vision and Pattern Recognition · Computer Science 2019-11-15 Prashnna K Gyawali, Rudra Saha, Linwei Wang, VSR Veeravasarapu, Maneesh Singh

Variational Autoencoders for Deforming 3D Mesh Models

3D geometric contents are becoming increasingly popular. In this paper, we study the problem of analyzing deforming 3D meshes using deep neural networks. Deforming 3D meshes are flexible to represent 3D animation sequences as well as…

Graphics · Computer Science 2018-03-30 Qingyang Tan, Lin Gao, Yu-Kun Lai, Shihong Xia

Deep Variational Inference Without Pixel-Wise Reconstruction

Variational autoencoders (VAEs) that are built upon deep neural networks have emerged as popular generative models in computer vision. Most of the work towards improving variational autoencoders has focused mainly on making the…

Machine Learning · Statistics 2016-11-17 Siddharth Agrawal, Ambedkar Dukkipati

DVE: Dynamic Variational Embeddings with Applications in Recommender Systems

Embedding is a useful technique to project a high-dimensional feature into a low-dimensional space, and it has many successful applications including link prediction, node classification and natural language processing. Current approaches…

Information Retrieval · Computer Science 2020-09-21 Meimei Liu, Hongxia Yang

Dense Sample Deep Learning

Deep Learning (DL), a variant of the neural network algorithms originally proposed in the 1980s, has made surprising progress in Artificial Intelligence (AI), ranging from language translation, protein folding, autonomous cars, and more…

Artificial Intelligence · Computer Science 2023-07-24 Stephen Josè Hanson, Vivek Yadav, Catherine Hanson

WAVE: Learning Unified & Versatile Audio-Visual Embeddings with Multimodal LLM

While embeddings from multimodal large language models (LLMs) excel as general-purpose representations, their application to dynamic modalities like audio and video remains underexplored. We introduce WAVE (unified &…

Computer Vision and Pattern Recognition · Computer Science 2026-02-24 Changli Tang, Qinfan Xiao, Ke Mei, Tianyi Wang, Fengyun Rao, Chao Zhang

VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks

Embedding models have been crucial in enabling various downstream tasks such as semantic similarity, information retrieval, and clustering. Recently, there has been a surge of interest in developing universal text embedding models that can…

Computer Vision and Pattern Recognition · Computer Science 2025-01-03 Ziyan Jiang, Rui Meng, Xinyi Yang, Semih Yavuz, Yingbo Zhou, Wenhu Chen

From Visuals to Vocabulary: Establishing Equivalence Between Image and Text Token Through Autoregressive Pre-training in MLLMs

While MLLMs perform well on perceptual tasks, they lack precise multimodal alignment, limiting performance. To address this challenge, we propose Vision Dynamic Embedding-Guided Pretraining (VDEP), a hybrid autoregressive training paradigm…

Computer Vision and Pattern Recognition · Computer Science 2025-02-14 Mingxiao Li, Fang Qu, Zhanpeng Chen, Na Su, Zhizhou Zhong, Ziyang Chen, Nan Du, Xiaolong Li

Multi-modal Auto-regressive Modeling via Visual Words

Large Language Models (LLMs), benefiting from the auto-regressive modelling approach performed on massive unannotated text corpora, demonstrate powerful perceptual and reasoning capabilities. However, as for extending auto-regressive…

Computer Vision and Pattern Recognition · Computer Science 2024-09-24 Tianshuo Peng, Zuchao Li, Lefei Zhang, Hai Zhao, Ping Wang, Bo Du

Learning to Recall with Transformers Beyond Orthogonal Embeddings

Modern large language models (LLMs) excel at tasks that require storing and retrieving knowledge, such as factual recall and question answering. Transformers are central to this capability because they can encode information during training…

Machine Learning · Statistics 2026-03-18 Nuri Mert Vural, Alberto Bietti, Mahdi Soltanolkotabi, Denny Wu

A Novel Convolutional Neural Network Architecture with a Continuous Symmetry

This paper introduces a new Convolutional Neural Network (ConvNet) architecture inspired by a class of partial differential equations (PDEs) called quasi-linear hyperbolic systems. With comparable performance on the image classification…

Computer Vision and Pattern Recognition · Computer Science 2024-05-21 Yao Liu, Hang Shao, Bing Bai

Joint Embedding Variational Bayes

We introduce Variational Joint Embedding (VJE), a reconstruction-free latent-variable framework for non-contrastive self-supervised learning in representation space. VJE maximizes a symmetric conditional evidence lower bound (ELBO) on…

Machine Learning · Computer Science 2026-04-27 Amin Oji, Paul Fieguth

Disentangling Variational Autoencoders

A variational autoencoder (VAE) is a probabilistic machine learning framework for posterior inference that projects an input set of high-dimensional data to a lower-dimensional, latent space. The latent space learned with a VAE offers…

Machine Learning · Computer Science 2022-11-16 Rafael Pastrana

Latent Convolutional Models

We present a new latent model of natural images that can be learned on large-scale datasets. The learning process provides a latent embedding for every image in the training dataset, as well as a deep convolutional network that maps the…

Computer Vision and Pattern Recognition · Computer Science 2018-11-06 ShahRukh Athar, Evgeny Burnaev, Victor Lempitsky