Related papers: Visually Exploring Multi-Purpose Audio Data

DeepVAT: A Self-Supervised Technique for Cluster Assessment in Image Datasets

Estimating the number of clusters and cluster structures in unlabeled, complex, and high-dimensional datasets (like images) is challenging for traditional clustering algorithms. In recent years, a matrix reordering-based algorithm called…

Machine Learning · Computer Science 2025-11-06 Alokendu Mazumder, Tirthajit Baruah, Akash Kumar Singh, Pagadla Krishna Murthy, Vishwajeet Pattanaik, Punit Rathore

Clustering of Acoustic Environments with Variational Autoencoders for Hearing Devices

Traditional acoustic environment classification relies on: i) classical signal processing algorithms, which are unable to extract meaningful representations of high-dimensional data; or on ii) supervised learning, limited by the…

Audio and Speech Processing · Electrical Eng. & Systems 2026-01-22 Luan Vinícius Fiorio, Ivana Nikoloska, Wim van Houtum, Ronald M. Aarts

Labelling unlabelled videos from scratch with multi-modal self-supervision

A large part of the current success of deep learning lies in the effectiveness of data -- more precisely: labelled data. Yet, labelling a dataset with human annotation continues to carry high costs, especially for videos. While in the image…

Computer Vision and Pattern Recognition · Computer Science 2021-03-02 Yuki M. Asano, Mandela Patrick, Christian Rupprecht, Andrea Vedaldi

Online Deep Clustering with Video Track Consistency

Several unsupervised and self-supervised approaches have been developed in recent years to learn visual features from large-scale unlabeled datasets. Their main drawback however is that these methods are hardly able to recognize visual…

Computer Vision and Pattern Recognition · Computer Science 2022-06-08 Alessandra Alfani, Federico Becattini, Lorenzo Seidenari, Alberto Del Bimbo

Benchmark and application of unsupervised classification approaches for univariate data

Unsupervised machine learning, and in particular data clustering, is a powerful approach for the analysis of datasets and identification of characteristic features occurring throughout a dataset. It is gaining popularity across scientific…

Mesoscale and Nanoscale Physics · Physics 2021-03-23 Maria El Abbassi, Jan Overbeck, Oliver Braun, Michel Calame, Herre S. J. van der Zant, Mickael L. Perrin

ConiVAT: Cluster Tendency Assessment and Clustering with Partial Background Knowledge

The VAT method is a visual technique for determining the potential cluster structure and the possible number of clusters in numerical data. Its improved version, iVAT, uses a path-based distance transform to improve the effectiveness of VAT…

Machine Learning · Computer Science 2020-09-29 Punit Rathore, James C. Bezdek, Paolo Santi, Carlo Ratti

Self-supervised Moving Vehicle Tracking with Stereo Sound

Humans are able to localize objects in the environment using both visual and auditory cues, integrating information from multiple modalities into a common reference frame. We introduce a system that can leverage unlabeled audio-visual data…

Computer Vision and Pattern Recognition · Computer Science 2019-10-28 Chuang Gan, Hang Zhao, Peihao Chen, David Cox, Antonio Torralba

Label-efficient audio classification through multitask learning and self-supervision

While deep learning has been incredibly successful in modeling tasks with large, carefully curated labeled datasets, its application to problems with limited labeled data remains a challenge. The aim of the present work is to improve the…

Audio and Speech Processing · Electrical Eng. & Systems 2019-10-29 Tyler Lee, Ting Gong, Suchismita Padhy, Andrew Rouditchenko, Anthony Ndirango

Learning from Noisy Web Data with Category-level Supervision

As tons of photos are being uploaded to public websites (e.g., Flickr, Bing, and Google) every day, learning from web data has become an increasingly popular research direction because of freely available web resources, which is also…

Computer Vision and Pattern Recognition · Computer Science 2018-05-25 Li Niu, Qingtao Tang, Ashok Veeraraghavan, Ashu Sabharwal

Unsupervised Variational Acoustic Clustering

We propose an unsupervised variational acoustic clustering model for clustering audio data in the time-frequency domain. The model leverages variational inference, extended to an autoencoder framework, with a Gaussian mixture model as a…

Audio and Speech Processing · Electrical Eng. & Systems 2026-01-22 Luan Vinícius Fiorio, Bruno Defraene, Johan David, Frans Widdershoven, Wim van Houtum, Ronald M. Aarts

Semi-Supervised Audio Classification with Partially Labeled Data

Audio classification has seen great progress with the increasing availability of large-scale datasets. These large datasets, however, are often only partially labeled as collecting full annotations is a tedious and expensive process. This…

Sound · Computer Science 2021-11-29 Siddharth Gururani, Alexander Lerch

Coincidence, Categorization, and Consolidation: Learning to Recognize Sounds with Minimal Supervision

Humans do not acquire perceptual abilities in the way we train machines. While machine learning algorithms typically operate on large collections of randomly-chosen, explicitly-labeled examples, human acquisition relies more heavily on…

Sound · Computer Science 2019-11-15 Aren Jansen, Daniel P. W. Ellis, Shawn Hershey, R. Channing Moore, Manoj Plakal, Ashok C. Popat, Rif A. Saurous

Supervised Visualization for Data Exploration

Dimensionality reduction is often used as an initial step in data exploration, either as preprocessing for classification or regression or for visualization. Most dimensionality reduction techniques to date are unsupervised; they do not…

Machine Learning · Statistics 2020-06-17 Jake S. Rhodes, Adele Cutler, Guy Wolf, Kevin R. Moon

Learning Visual Styles from Audio-Visual Associations

From the patter of rain to the crunch of snow, the sounds we hear often convey the visual textures that appear within a scene. In this paper, we present a method for learning visual styles from unlabeled audio-visual data. Our model learns…

Computer Vision and Pattern Recognition · Computer Science 2022-05-11 Tingle Li, Yichen Liu, Andrew Owens, Hang Zhao

CULT: Continual Unsupervised Learning with Typicality-Based Environment Detection

We introduce CULT (Continual Unsupervised Representation Learning with Typicality-Based Environment Detection), a new algorithm for continual unsupervised learning with variational auto-encoders. CULT uses a simple typicality metric in the…

Machine Learning · Computer Science 2022-07-19 Oliver Daniels-Koch

A survey on Semi-, Self- and Unsupervised Learning for Image Classification

While deep learning strategies achieve outstanding results in computer vision tasks, one issue remains: The current strategies rely heavily on a huge amount of labeled data. In many real-world problems, it is not feasible to create such an…

Computer Vision and Pattern Recognition · Computer Science 2021-10-14 Lars Schmarje, Monty Santarossa, Simon-Martin Schröder, Reinhard Koch

Variational Self-Supervised Contrastive Learning Using Beta Divergence

Learning a discriminative semantic space using unlabelled and noisy data remains unaddressed in a multi-label setting. We present a contrastive self-supervised learning method which is robust to data noise, grounded in the domain of…

Computer Vision and Pattern Recognition · Computer Science 2024-05-09 Mehmet Can Yavuz, Berrin Yanikoglu

What Are We Really Measuring? Rethinking Dataset Bias in Web-Scale Natural Image Collections via Unsupervised Semantic Clustering

In computer vision, a prevailing method for quantifying dataset bias is to train a model to distinguish between datasets. High classification accuracy is then interpreted as evidence of meaningful semantic differences. This approach assumes…

Computer Vision and Pattern Recognition · Computer Science 2026-04-16 Amir Hossein Saleknia, Mohammad Sabokrou

Fast-VAT: Accelerating Cluster Tendency Visualization using Cython and Numba

Visual Assessment of Cluster Tendency (VAT) is a widely used unsupervised technique to assess the presence of cluster structure in unlabeled datasets. However, its standard implementation suffers from significant performance limitations due…

Machine Learning · Computer Science 2025-07-23 MSR Avinash, Ismael Lachheb

Learning to Localize Sound Sources in Visual Scenes: Analysis and Applications

Visual events are usually accompanied by sounds in our daily lives. However, can the machines learn to correlate the visual scene and sound, as well as localize the sound source only by observing them like humans? To investigate its…

Computer Vision and Pattern Recognition · Computer Science 2019-11-22 Arda Senocak, Tae-Hyun Oh, Junsik Kim, Ming-Hsuan Yang, In So Kweon