English
Related papers

Related papers: Switched linear encoding with rectified linear aut…

200 papers

We present the discriminative recurrent sparse auto-encoder model, comprising a recurrent encoder of rectified linear units, unrolled for a fixed number of iterations, and connected to two linear decoders that reconstruct the input and…

Machine Learning · Computer Science 2013-03-20 Jason Tyler Rolfe , Yann LeCun

Recently, it has been observed that when representations are learnt in a way that encourages sparsity, improved performance is obtained on classification tasks. These methods involve combinations of activation functions, sampling steps and…

Machine Learning · Computer Science 2014-03-25 Alireza Makhzani , Brendan Frey

Here is presented an analysis of an autoencoder with binary activations $\{0, 1\}$ and binary $\{0, 1\}$ random weights. Such set up puts this model at the intersection of different fields: neuroscience, information theory, sparse coding,…

Machine Learning · Computer Science 2020-05-01 Viacheslav Osaulenko

This paper introduces an efficient and robust method for discovering interpretable circuits in large language models using discrete sparse autoencoders. Our approach addresses key limitations of existing techniques, namely computational…

Computation and Language · Computer Science 2024-05-22 Charles O'Neill , Thang Bui

Auto-Encoders are unsupervised models that aim to learn patterns from observed data by minimizing a reconstruction cost. The useful representations learned are often found to be sparse and distributed. On the other hand, compressed sensing…

Machine Learning · Statistics 2017-07-14 Devansh Arpit , Yingbo Zhou , Hung Q. Ngo , Nils Napp , Venu Govindaraju

A new method for the unsupervised learning of sparse representations using autoencoders is proposed and implemented by ordering the output of the hidden units by their activation value and progressively reconstructing the input in this…

Machine Learning · Computer Science 2016-05-09 Paul Bertens

Sparse autoencoders provide a promising unsupervised approach for extracting interpretable features from a language model by reconstructing activations from a sparse bottleneck layer. Since language models learn many concepts, autoencoders…

Machine Learning · Computer Science 2024-06-07 Leo Gao , Tom Dupré la Tour , Henk Tillman , Gabriel Goh , Rajan Troll , Alec Radford , Ilya Sutskever , Jan Leike , Jeffrey Wu

In this note we present a generative model of natural images consisting of a deep hierarchy of layers of latent random variables, each of which follows a new type of distribution that we call rectified Gaussian. These rectified Gaussian…

Machine Learning · Statistics 2016-03-01 Tim Salimans

Autoencoders are certainly among the most studied and used Deep Learning models: the idea behind them is to train a model in order to reconstruct the same input data. The peculiarity of these models is to compress the information through a…

Machine Learning · Computer Science 2023-09-06 Gabriele Martino , Davide Moroni , Massimo Martinelli

We study how reliably sparse autoencoders (SAEs) support claims about reasoning-related internal features in large language models. We first give a stylized analysis showing that sparsity-regularized decoding can preferentially retain…

Machine Learning · Computer Science 2026-05-19 George Ma , Zhongyuan Liang , Irene Y. Chen , Somayeh Sojoudi

Regularized training of an autoencoder typically results in hidden unit biases that take on large negative values. We show that negative biases are a natural result of using a hidden layer whose responsibility is to both represent the input…

Machine Learning · Statistics 2015-04-09 Kishore Konda , Roland Memisevic , David Krueger

Many approaches to transform classification problems from non-linear to linear by feature transformation have been recently presented in the literature. These notably include sparse coding methods and deep neural networks. However, many of…

Machine Learning · Computer Science 2015-07-08 Alessandro Montalto , Giovanni Tessitore , Roberto Prevete

Recent work has found that sparse autoencoders (SAEs) are an effective technique for unsupervised discovery of interpretable features in language models' (LMs) activations, by finding sparse, linear reconstructions of LM activations. We…

Machine Learning · Computer Science 2024-05-01 Senthooran Rajamanoharan , Arthur Conmy , Lewis Smith , Tom Lieberum , Vikrant Varma , János Kramár , Rohin Shah , Neel Nanda

One of the roadblocks to a better understanding of neural networks' internals is \textit{polysemanticity}, where neurons appear to activate in multiple, semantically distinct contexts. Polysemanticity prevents us from identifying concise,…

Machine Learning · Computer Science 2023-10-05 Hoagy Cunningham , Aidan Ewart , Logan Riggs , Robert Huben , Lee Sharkey

Audio pretrained models are widely employed to solve various tasks in speech processing, sound event detection, or music information retrieval. However, the representations learned by these models are unclear, and their analysis mainly…

In this paper we introduce the deep kernelized autoencoder, a neural network model that allows an explicit approximation of (i) the mapping from an input space to an arbitrary, user-specified kernel space and (ii) the back-projection from…

Machine Learning · Statistics 2017-02-09 Michael Kampffmeyer , Sigurd Løkse , Filippo Maria Bianchi , Robert Jenssen , Lorenzo Livi

Is there really much more to say about sparse autoencoders (SAEs)? Autoencoders in general, and SAEs in particular, represent deep architectures that are capable of modeling low-dimensional latent structure in data. Such structure could…

Machine Learning · Computer Science 2025-06-09 Yin Lu , Xuening Zhu , Tong He , David Wipf

The unification of low-level perception and high-level reasoning is a long-standing problem in artificial intelligence, which has the potential to not only bring the areas of logic and learning closer together but also demonstrate how…

Artificial Intelligence · Computer Science 2019-11-27 Anton Fuxjaeger , Vaishak Belle

Sparse autoencoders (SAEs) have lately been used to uncover interpretable latent features in large language models. By projecting dense embeddings into a much higher-dimensional and sparse space, learned features become disentangled and…

Machine Learning · Computer Science 2025-07-30 Viktoria Schuster

We use spatially-sparse two, three and four dimensional convolutional autoencoder networks to model sparse structures in 2D space, 3D space, and 3+1=4 dimensional space-time. We evaluate the resulting latent spaces by testing their…

Computer Vision and Pattern Recognition · Computer Science 2018-11-27 Benjamin Graham
‹ Prev 1 2 3 10 Next ›