Related papers: A quick search method for audio signals based on a…

Rapid solution for searching similar audio items

A naive approach for finding similar audio items would be to compare each entry from the feature vector of the test example with each feature vector of the candidates in a k-nearest neighbors fashion. There are already two problems with…

Sound · Computer Science 2022-01-28 Kastriot Kadriu

ECLIPSE: Efficient Long-range Video Retrieval using Sight and Sound

We introduce an audiovisual method for long-range text-to-video retrieval. Unlike previous approaches designed for short video retrieval (e.g., 5-15 seconds in duration), our approach aims to retrieve minute-long videos that capture complex…

Computer Vision and Pattern Recognition · Computer Science 2022-08-03 Yan-Bo Lin, Jie Lei, Mohit Bansal, Gedas Bertasius

Audio Summarization with Audio Features and Probability Distribution Divergence

The automatic summarization of multimedia sources is an important task that facilitates the understanding of an individual by condensing the source while maintaining relevant information. In this paper we focus on audio summarization based…

Computation and Language · Computer Science 2020-04-03 Carlos-Emiliano González-Gallardo, Romain Deveaud, Eric SanJuan, Juan-Manuel Torres-Moreno

Audio Classification of Low Feature Spectrograms Utilizing Convolutional Neural Networks

Modern day audio signal classification techniques lack the ability to classify low feature audio signals in the form of spectrographic temporal frequency data representations. Additionally, currently utilized techniques rely on full diverse…

Sound · Computer Science 2024-10-30 Noel Elias

Fast query-by-example speech search using separable model

Traditional Query-by-Example (QbE) speech search approaches usually use methods based on frame-level features, while state-of-the-art approaches tend to use models based on acoustic word embeddings (AWEs) to transform variable length audio…

Audio and Speech Processing · Electrical Eng. & Systems 2021-09-21 Yuguang Yang, Yu Pan, Xin Dong, Minqiang Xu

A high speed unsupervised speaker retrieval using vector quantization and second-order statistics

This paper describes an effective unsupervised method for query-by-example speaker retrieval. We suppose that only one speaker is in each audio file or in audio segment. The audio data are modeled using a common universal codebook. The…

Information Retrieval · Computer Science 2010-09-13 Konstantin Biatov

An efficient supervised dictionary learning method for audio signal recognition

Machine hearing or listening represents an emerging area. Conventional approaches rely on the design of handcrafted features that are specialized to a specific audio task and can hardly be generalized to other audio fields. For example,…

Computer Vision and Pattern Recognition · Computer Science 2018-12-13 Imad Rida, Romain Hérault, Gilles Gasso

Music Genre Classification Using Spectral Analysis and Sparse Representation of the Signals

In this paper, we proposed a robust music genre classification method based on a sparse FFT based feature extraction method which extracted with discriminating power of spectral analysis of non-stationary audio signals, and the capability…

Sound · Computer Science 2018-03-14 Mehdi Banitalebi-Dehkordi, Amin Banitalebi-Dehkordi

Detecting the Trend in Musical Taste over the Decade -- A Novel Feature Extraction Algorithm to Classify Musical Content with Simple Features

This work proposes a novel feature selection algorithm to classify songs into different groups. Classification of musical content is often a non-trivial job and is still a relatively unexplored area. The main idea conveyed in this article is…

Information Retrieval · Computer Science 2019-01-09 Anish Acharya

Compressive Sampling for the Packet Loss Recovery in Audio Multimedia Streaming

The aim of this paper is to introduce a new schema, based on a Compressive Sampling technique, for the recovery of lost data in multimedia streaming. The audio streaming data are encapsulated in different packets by using an interleaving…

Multimedia · Computer Science 2013-08-21 Angelo Ciaramella, Giulio Giunta

Objective Assessment of Spatial Audio Quality using Directional Loudness Maps

This work introduces a feature extracted from stereophonic/binaural audio signals aiming to represent a measure of perceived quality degradation in processed spatial auditory scenes. The feature extraction technique is based on a simplified…

Audio and Speech Processing · Electrical Eng. & Systems 2022-12-06 Pablo M. Delgado, Jürgen Herre

Statistics-aware Audio-visual Deepfake Detector

In this paper, we propose an enhanced audio-visual deep detection method. Recent methods in audio-visual deepfake detection mostly assess the synchronization between audio and visual features. Although they have shown promising results,…

Computer Vision and Pattern Recognition · Computer Science 2024-07-18 Marcella Astrid, Enjie Ghorbel, Djamila Aouada

Neural Audio Fingerprint for High-specific Audio Retrieval based on Contrastive Learning

Most of existing audio fingerprinting systems have limitations to be used for high-specific audio retrieval at scale. In this work, we generate a low-dimensional representation from a short unit segment of audio, and couple this fingerprint…

Sound · Computer Science 2021-02-11 Sungkyun Chang, Donmoon Lee, Jeongsoo Park, Hyungui Lim, Kyogu Lee, Karam Ko, Yoonchang Han

Automated Detection of Sport Highlights from Audio and Video Sources

This study presents a novel Deep Learning-based and lightweight approach for the automated detection of sports highlights (HLs) from audio and video sources. HL detection is a key task in sports video analysis, traditionally requiring…

Computer Vision and Pattern Recognition · Computer Science 2025-02-03 Francesco Della Santa, Morgana Lalli

Exploiting Temporal Dependencies for Cross-Modal Music Piece Identification

This paper addresses the problem of cross-modal musical piece identification and retrieval: finding the appropriate recording(s) from a database given a sheet music query, and vice versa, working directly with audio and scanned sheet music…

Audio and Speech Processing · Electrical Eng. & Systems 2021-05-27 Luis Carvalho, Gerhard Widmer

An Efficient Temporary Deepfake Location Approach Based Embeddings for Partially Spoofed Audio Detection

Partially spoofed audio detection is a challenging task, lying in the need to accurately locate the authenticity of audio at the frame level. To address this issue, we propose a fine-grained partially spoofed audio detection method, namely…

Sound · Computer Science 2023-11-22 Yuankun Xie, Haonan Cheng, Yutian Wang, Long Ye

Unsupervised Feature Learning for Audio Analysis

Identifying acoustic events from a continuously streaming audio source is of interest for many applications including environmental monitoring for basic research. In this scenario neither different event classes are known nor what…

Computer Vision and Pattern Recognition · Computer Science 2017-12-12 Matthias Meyer, Jan Beutel, Lothar Thiele

Audio-based Near-Duplicate Video Retrieval with Audio Similarity Learning

In this work, we address the problem of audio-based near-duplicate video retrieval. We propose the Audio Similarity Learning (AuSiL) approach that effectively captures temporal patterns of audio similarity between video pairs. For the…

Multimedia · Computer Science 2021-01-12 Pavlos Avgoustinakis, Giorgos Kordopatis-Zilos, Symeon Papadopoulos, Andreas L. Symeonidis, Ioannis Kompatsiaris

A new heuristic algorithm for fast k-segmentation

The $k$-segmentation of a video stream is used to partition it into $k$ piecewise-linear segments, so that each linear segment has a meaningful interpretation. Such segmentation may be used to summarize large videos using a small set of…

Computer Vision and Pattern Recognition · Computer Science 2020-09-14 Sabarish Vadarevu, Vijay Karamcheti

Multimodal Lengthy Videos Retrieval Framework and Evaluation Metric

Precise video retrieval requires multi-modal correlations to handle unseen vocabulary and scenes, becoming more complex for lengthy videos where models must perform effectively without prior training on a specific dataset. We introduce a…

Computer Vision and Pattern Recognition · Computer Science 2025-04-08 Mohamed Eltahir, Osamah Sarraj, Mohammed Bremoo, Mohammed Khurd, Abdulrahman Alfrihidi, Taha Alshatiri, Mohammad Almatrafi, Tanveer Hussain