Related papers: Visually Exploring Multi-Purpose Audio Data

200 papers

Estimating the number of clusters and cluster structures in unlabeled, complex, and high-dimensional datasets (like images) is challenging for traditional clustering algorithms. In recent years, a matrix reordering-based algorithm called…

Traditional acoustic environment classification relies on: i) classical signal processing algorithms, which are unable to extract meaningful representations of high-dimensional data; or on ii) supervised learning, limited by the…

Audio and Speech Processing · Electrical Eng. & Systems 2026-01-22 Luan Vinícius Fiorio, Ivana Nikoloska, Wim van Houtum, Ronald M. Aarts

A large part of the current success of deep learning lies in the effectiveness of data -- more precisely: labelled data. Yet, labelling a dataset with human annotation continues to carry high costs, especially for videos. While in the image…

Computer Vision and Pattern Recognition · Computer Science 2021-03-02 Yuki M. Asano, Mandela Patrick, Christian Rupprecht, Andrea Vedaldi

Several unsupervised and self-supervised approaches have been developed in recent years to learn visual features from large-scale unlabeled datasets. Their main drawback however is that these methods are hardly able to recognize visual…

Computer Vision and Pattern Recognition · Computer Science 2022-06-08 Alessandra Alfani, Federico Becattini, Lorenzo Seidenari, Alberto Del Bimbo

Unsupervised machine learning, and in particular data clustering, is a powerful approach for the analysis of datasets and identification of characteristic features occurring throughout a dataset. It is gaining popularity across scientific…

Mesoscale and Nanoscale Physics · Physics 2021-03-23 Maria El Abbassi, Jan Overbeck, Oliver Braun, Michel Calame, Herre S. J. van der Zant, Mickael L. Perrin

The VAT method is a visual technique for determining the potential cluster structure and the possible number of clusters in numerical data. Its improved version, iVAT, uses a path-based distance transform to improve the effectiveness of VAT…

Machine Learning · Computer Science 2020-09-29 Punit Rathore, James C. Bezdek, Paolo Santi, Carlo Ratti

Humans are able to localize objects in the environment using both visual and auditory cues, integrating information from multiple modalities into a common reference frame. We introduce a system that can leverage unlabeled audio-visual data…

Computer Vision and Pattern Recognition · Computer Science 2019-10-28 Chuang Gan, Hang Zhao, Peihao Chen, David Cox, Antonio Torralba

While deep learning has been incredibly successful in modeling tasks with large, carefully curated labeled datasets, its application to problems with limited labeled data remains a challenge. The aim of the present work is to improve the…

Audio and Speech Processing · Electrical Eng. & Systems 2019-10-29 Tyler Lee, Ting Gong, Suchismita Padhy, Andrew Rouditchenko, Anthony Ndirango

As tons of photos are being uploaded to public websites (e.g., Flickr, Bing, and Google) every day, learning from web data has become an increasingly popular research direction because of freely available web resources, which is also…

Computer Vision and Pattern Recognition · Computer Science 2018-05-25 Li Niu, Qingtao Tang, Ashok Veeraraghavan, Ashu Sabharwal

We propose an unsupervised variational acoustic clustering model for clustering audio data in the time-frequency domain. The model leverages variational inference, extended to an autoencoder framework, with a Gaussian mixture model as a…

Audio and Speech Processing · Electrical Eng. & Systems 2026-01-22 Luan Vinícius Fiorio, Bruno Defraene, Johan David, Frans Widdershoven, Wim van Houtum, Ronald M. Aarts

Audio classification has seen great progress with the increasing availability of large-scale datasets. These large datasets, however, are often only partially labeled as collecting full annotations is a tedious and expensive process. This…

Sound · Computer Science 2021-11-29 Siddharth Gururani, Alexander Lerch

Humans do not acquire perceptual abilities in the way we train machines. While machine learning algorithms typically operate on large collections of randomly-chosen, explicitly-labeled examples, human acquisition relies more heavily on…

Dimensionality reduction is often used as an initial step in data exploration, either as preprocessing for classification or regression or for visualization. Most dimensionality reduction techniques to date are unsupervised; they do not…

Machine Learning · Statistics 2020-06-17 Jake S. Rhodes, Adele Cutler, Guy Wolf, Kevin R. Moon

From the patter of rain to the crunch of snow, the sounds we hear often convey the visual textures that appear within a scene. In this paper, we present a method for learning visual styles from unlabeled audio-visual data. Our model learns…

Computer Vision and Pattern Recognition · Computer Science 2022-05-11 Tingle Li, Yichen Liu, Andrew Owens, Hang Zhao

We introduce CULT (Continual Unsupervised Representation Learning with Typicality-Based Environment Detection), a new algorithm for continual unsupervised learning with variational auto-encoders. CULT uses a simple typicality metric in the…

Machine Learning · Computer Science 2022-07-19 Oliver Daniels-Koch

While deep learning strategies achieve outstanding results in computer vision tasks, one issue remains: The current strategies rely heavily on a huge amount of labeled data. In many real-world problems, it is not feasible to create such an…

Computer Vision and Pattern Recognition · Computer Science 2021-10-14 Lars Schmarje, Monty Santarossa, Simon-Martin Schröder, Reinhard Koch

Learning a discriminative semantic space using unlabelled and noisy data remains unaddressed in a multi-label setting. We present a contrastive self-supervised learning method which is robust to data noise, grounded in the domain of…

Computer Vision and Pattern Recognition · Computer Science 2024-05-09 Mehmet Can Yavuz, Berrin Yanikoglu

In computer vision, a prevailing method for quantifying dataset bias is to train a model to distinguish between datasets. High classification accuracy is then interpreted as evidence of meaningful semantic differences. This approach assumes…

Computer Vision and Pattern Recognition · Computer Science 2026-04-16 Amir Hossein Saleknia, Mohammad Sabokrou

Visual Assessment of Cluster Tendency (VAT) is a widely used unsupervised technique to assess the presence of cluster structure in unlabeled datasets. However, its standard implementation suffers from significant performance limitations due…

Machine Learning · Computer Science 2025-07-23 MSR Avinash, Ismael Lachheb

Visual events are usually accompanied by sounds in our daily lives. However, can the machines learn to correlate the visual scene and sound, as well as localize the sound source only by observing them like humans? To investigate its…

Computer Vision and Pattern Recognition · Computer Science 2019-11-22 Arda Senocak, Tae-Hyun Oh, Junsik Kim, Ming-Hsuan Yang, In So Kweon