Related papers: Glottal Source Estimation using an Automatic Chirp…
Homomorphic analysis is a well-known method for the separation of non-linearly combined signals. More particularly, the use of the complex cepstrum for source-tract deconvolution has been discussed in various articles. However, there exists no…
The complex cepstrum is known in the literature for linearly separating causal and anticausal components. Relying on advances achieved by the Zeros of the Z-Transform (ZZT) technique, we investigate here the possibility of using complex…
This paper addresses the problem of estimating the voice source directly from speech waveforms. A novel principle based on Anticausality Dominated Regions (ACDR) is used to estimate the glottal open phase. This technique is compared to two…
Some glottal analysis approaches based on linear prediction or the complex cepstrum have proved effective for estimating the glottal source from real speech utterances. We propose a new approach employing both an all-pole…
Source-tract decomposition (or glottal flow estimation) is one of the basic problems of speech processing. For this, several techniques have been proposed in the literature. However, studies comparing different approaches are almost…
It was recently shown that the complex cepstrum can be effectively used for glottal flow estimation by separating the causal and anticausal components of speech. In order to guarantee a correct estimation, some constraints on the window have…
Automatic detection of voice pathology enables objective assessment and earlier intervention for the diagnosis. This study provides a systematic analysis of glottal source features and investigates their effectiveness in voice pathology…
The pseudo-periodicity of voiced speech can be exploited in several speech processing applications. This requires, however, that the precise locations of the Glottal Closure Instants (GCIs) be available. The focus of this paper is the…
Glottal Closure Instant (GCI) detection consists of automatically detecting the temporal locations of the most significant excitation of the vocal tract from the speech signal. It is used in many speech analysis and processing applications, and…
In this paper, we propose classification-based glottal closure instant (GCI) detection from pathological acoustic speech signals, which finds many applications in vocal disorder analysis. To date, GCI for pathological disorder is…
Glottal Closure Instants (GCIs) correspond to the temporal locations of significant excitation to the vocal tract occurring during the production of voiced speech. GCI detection from speech signals is a well-studied problem given its…
The Z-transform is a mathematical operation in signal processing that gives a tractable way to solve linear, constant-coefficient difference equations. Based on the classical Z-transform and inspired by the idea of the sliding DFT, a new…
This paper studies the problem of Generalized Zero-shot Learning (G-ZSL), whose goal is to classify instances belonging to both seen and unseen classes at the test time. We propose a novel space decomposition method to solve G-ZSL. Some…
This paper addresses the problem of automatic detection of voice pathologies directly from the speech signal. For this, we investigate the use of glottal source estimation as a means to detect voice disorders. Three sets of features are…
The Glottal Source is an important component of voice as it can be considered as the excitation signal to the voice apparatus. Nowadays, new techniques of speech processing such as speech recognition and speech synthesis use the glottal…
A novel feature, based on the chirp z-transform, that offers an improved representation of the underlying true spectrum is proposed. This feature, the chirp MFCC, is derived by computing the Mel frequency cepstral coefficients from the…
In recent years there has been a growing interest in the fractional Fourier transform driven by its large number of applications. The literature in this field follows two main routes. On the one hand, the areas where the ordinary Fourier…
In some problems, complicated functions of the Z-transform variable, $z$, appear that either cannot be inverted analytically or require quite tedious calculations. In such cases, numerical methods should be used to find the…
Zero-shot composed image retrieval (ZS-CIR) retrieves a target image from a reference image and a text modification without human-annotated CIR triplets. Projection-based ZS-CIR methods are attractive because they do not rely on LLMs at…
Articulatory features can provide interpretable and flexible controls for the synthesis of human vocalizations by allowing the user to directly modify parameters like vocal strain or lip position. To make this manipulation through…