Related papers: Point Cloud Audio Processing

Simulating the DFT Algorithm for Audio Processing

Since the evolution of digital computers, the storage of data has always been in terms of discrete bits that can store values of either 1 or 0. Hence, all computer programs (such as MATLAB) convert any continuous input signal into a…

Signal Processing · Electrical Eng. & Systems 2021-05-07 Omkar Deshpande, Kharanshu Solanki, Sree Pujitha Suribhatla, Sanya Zaveri, Luv Ghodasara

Why some audio signal short-time Fourier transform coefficients have nonuniform phase distributions

The short-time Fourier transform (STFT) represents a window of audio samples as a set of complex coefficients. These are advantageously viewed as magnitudes and phases and the overall distribution of phases is very often assumed to be…

Audio and Speech Processing · Electrical Eng. & Systems 2024-09-16 Stephen D. Voran

Comparison of Time-Frequency Representations for Environmental Sound Classification using Convolutional Neural Networks

Recent successful applications of convolutional neural networks (CNNs) to audio classification and speech recognition have motivated the search for better input representations for more efficient training. Visual displays of an audio…

Computer Vision and Pattern Recognition · Computer Science 2017-06-23 M. Huzaifah

PCT: Point cloud transformer

The irregular domain and lack of ordering make it challenging to design deep neural networks for point cloud processing. This paper presents a novel framework named Point Cloud Transformer (PCT) for point cloud learning. PCT is based on…

Computer Vision and Pattern Recognition · Computer Science 2021-06-08 Meng-Hao Guo, Jun-Xiong Cai, Zheng-Ning Liu, Tai-Jiang Mu, Ralph R. Martin, Shi-Min Hu

Employing Discrete Fourier Transform in Representational Learning

Image representation learning via input reconstruction is a common technique in machine learning for generating representations that can be effectively utilized by arbitrary downstream tasks. A well-established approach is using…

Neural and Evolutionary Computing · Computer Science 2025-06-10 Raoof HojatJalali, Edmondo Trentin

Frequency-Undersampled Short-Time Fourier Transform

The short-time Fourier transform (STFT) usually computes the same number of frequency components as the frame length while overlapping adjacent time frames by more than half. As a result, the number of components of a spectrogram matrix…

Signal Processing · Electrical Eng. & Systems 2020-10-29 Daichi Kitahara

STFTCodec: High-Fidelity Audio Compression through Time-Frequency Domain Representation

We present STFTCodec, a novel spectral-based neural audio codec that efficiently compresses audio using Short-Time Fourier Transform (STFT). Unlike waveform-based approaches that require large model capacity and substantial memory…

Sound · Computer Science 2025-03-24 Tao Feng, Zhiyuan Zhao, Yifan Xie, Yuqi Ye, Xiangyang Luo, Xun Guan, Yu Li

Manifold Fractional Harmonic Transform for 3D Point Clouds

Point clouds can be regarded as discrete samples of smooth manifolds and are typically analyzed via the eigenfunctions of the Laplace-Beltrami operator. This paper extends manifold spectral analysis to the fractional domain, enabling…

General Mathematics · Mathematics 2026-05-04 Jiamian Li, Bing-Zhao Li

Improving Machine Hearing on Limited Data Sets

Convolutional neural network (CNN) architectures have originated and revolutionized machine learning for images. In order to take advantage of CNNs in predictive modeling with audio data, standard FFT-based signal processing methods are…

Sound · Computer Science 2025-02-20 Pavol Harar, Roswitha Bammer, Anna Breger, Monika Dörfler, Zdenek Smekal

Expectation-Maximization for Speech Source Separation Using Convolutive Transfer Function

This paper addresses the problem of underdetermined speech source separation from multichannel microphone signals, i.e., the convolutive mixtures of multiple sources. The time-domain signals are first transformed to the short-time Fourier…

Sound · Computer Science 2019-04-11 Xiaofei Li, Laurent Girin, Radu Horaud

Fractional harmonic transform on point cloud manifolds

Three-dimensional point clouds can be viewed as discrete samples of smooth manifolds, allowing spectral analysis using the Laplace-Beltrami operator (LBO). However, the traditional point cloud manifold harmonic transform (PMHT) is limited…

General Mathematics · Mathematics 2025-10-27 Jiamian Li, Bing-Zhao Li

Bridging Biological Hearing and Neuromorphic Computing: End-to-End Time-Domain Audio Signal Processing with Reservoir Computing

Despite the advancements in cutting-edge technologies, audio signal processing continues to pose challenges and lacks the precision of a human speech processing system. To address these challenges, we propose a novel approach to simplify…

Sound · Computer Science 2026-03-26 Rinku Sebastian, Simon O'Keefe, Martin Trefzer

Deepfake Audio Detection Using Spectrogram-based Feature and Ensemble of Deep Learning Models

In this paper, we propose a deep learning based system for the task of deepfake audio detection. In particular, the raw input audio is first transformed into various spectrograms using three transformation methods of Short-time Fourier…

Sound · Computer Science 2024-07-03 Lam Pham, Phat Lam, Truong Nguyen, Huyen Nguyen, Alexander Schindler

Learnable Adaptive Time-Frequency Representation via Differentiable Short-Time Fourier Transform

The short-time Fourier transform (STFT) is widely used for analyzing non-stationary signals. However, its performance is highly sensitive to its parameters, and manual or heuristic tuning often yields suboptimal results. To overcome this…

Sound · Computer Science 2025-06-27 Maxime Leiber, Yosra Marnissi, Axel Barrau, Sylvain Meignen, Laurent Massoulié

Model-based STFT phase recovery for audio source separation

For audio source separation applications, it is common to estimate the magnitude of the short-time Fourier transform (STFT) of each source. In order to then synthesize time-domain signals, it is necessary to recover the phase of the…

Sound · Computer Science 2018-02-28 Paul Magron, Roland Badeau, Bertrand David

Pointfilter: Point Cloud Filtering via Encoder-Decoder Modeling

Point cloud filtering is a fundamental problem in geometry modeling and processing. Despite significant advancement in recent years, the existing methods still suffer from two issues: 1) they are either designed without preserving sharp…

Graphics · Computer Science 2020-09-29 Dongbo Zhang, Xuequan Lu, Hong Qin, Ying He

Audio Source Separation with Discriminative Scattering Networks

In this report we describe an ongoing line of research for solving single-channel source separation problems. Many monaural signal decomposition techniques proposed in the literature operate on a feature space consisting of a time-frequency…

Sound · Computer Science 2015-04-29 Pablo Sprechmann, Joan Bruna, Yann LeCun

PU-Transformer: Point Cloud Upsampling Transformer

Given the rapid development of 3D scanners, point clouds are becoming popular in AI-driven machines. However, point cloud data is inherently sparse and irregular, causing significant difficulties for machine perception. In this work, we…

Computer Vision and Pattern Recognition · Computer Science 2022-10-04 Shi Qiu, Saeed Anwar, Nick Barnes

Deep Learning for Audio Signal Processing

Given the recent surge in developments of deep learning, this article provides a review of the state-of-the-art deep learning techniques for audio signal processing. Speech, music, and environmental sound processing are considered…

Sound · Computer Science 2019-05-28 Hendrik Purwins, Bo Li, Tuomas Virtanen, Jan Schlüter, Shuo-yiin Chang, Tara Sainath

Computer Audition: From Task-Specific Machine Learning to Foundation Models

Foundation models (FMs) are increasingly spearheading recent advances on a variety of tasks that fall under the purview of computer audition -- the use of machines to understand sounds. They feature several advantages over traditional…

Sound · Computer Science 2025-07-29 Andreas Triantafyllopoulos, Iosif Tsangko, Alexander Gebhard, Annamaria Mesaros, Tuomas Virtanen, Björn Schuller