Related papers: A sticky HDP-HMM with application to speaker diari…

The Hierarchical Dirichlet Process Hidden Semi-Markov Model

There is much interest in the Hierarchical Dirichlet Process Hidden Markov Model (HDP-HMM) as a natural Bayesian nonparametric extension of the traditional HMM. However, in many settings the HDP-HMM's strict Markovian constraints are…

Machine Learning · Computer Science 2012-03-19 Matthew J. Johnson, Alan Willsky

Speaker Diarization Based on Multi-channel Microphone Array in Small-scale Meeting

In the task of speaker diarization, the number of small-scale meetings accounts for a large proportion. When microphone arrays are employed as a recording device, its spatial information is usually ignored by most researchers. In this…

Sound · Computer Science 2022-10-27 Yuxuan Du, Ruohua Zhou

Bayesian Nonparametric Hidden Semi-Markov Models

There is much interest in the Hierarchical Dirichlet Process Hidden Markov Model (HDP-HMM) as a natural Bayesian nonparametric extension of the ubiquitous Hidden Markov Model for learning from sequential and time-series data. However, in…

Methodology · Statistics 2012-09-11 Matthew J. Johnson, Alan S. Willsky

Disentangled Sticky Hierarchical Dirichlet Process Hidden Markov Model

The Hierarchical Dirichlet Process Hidden Markov Model (HDP-HMM) has been used widely as a natural Bayesian nonparametric extension of the classical Hidden Markov Model for learning from sequential and time-series data. A sticky extension…

Machine Learning · Statistics 2020-06-23 Ding Zhou, Yuanjun Gao, Liam Paninski

The Recurrent Sticky Hierarchical Dirichlet Process Hidden Markov Model

The Hierarchical Dirichlet Process Hidden Markov Model (HDP-HMM) is a natural Bayesian nonparametric extension of the classical Hidden Markov Model for learning from (spatio-)temporal data. A sticky HDP-HMM has been proposed to strengthen…

Machine Learning · Computer Science 2024-11-08 Mikołaj Słupiński, Piotr Lipiński

Multi-class Spectral Clustering with Overlaps for Speaker Diarization

This paper describes a method for overlap-aware speaker diarization. Given an overlap detector and a speaker embedding extractor, our method performs spectral clustering of segments informed by the output of the overlap detector. This is…

Audio and Speech Processing · Electrical Eng. & Systems 2020-11-06 Desh Raj, Zili Huang, Sanjeev Khudanpur

Integrating Audio, Visual, and Semantic Information for Enhanced Multimodal Speaker Diarization

Speaker diarization, the process of segmenting an audio stream or transcribed speech content into homogeneous partitions based on speaker identity, plays a crucial role in the interpretation and analysis of human speech. Most existing…

Machine Learning · Computer Science 2024-08-23 Luyao Cheng, Hui Wang, Siqi Zheng, Yafeng Chen, Rongjie Huang, Qinglin Zhang, Qian Chen, Xihao Li

Exploring Speaker-Related Information in Spoken Language Understanding for Better Speaker Diarization

Speaker diarization (SD) is a classic task in speech processing and is crucial in multi-party scenarios such as meetings and conversations. Current mainstream speaker diarization approaches consider acoustic information only, which result in…

Computation and Language · Computer Science 2023-05-23 Luyao Cheng, Siqi Zheng, Zhang Qinglin, Hui Wang, Yafeng Chen, Qian Chen

Online speaker diarization of meetings guided by speech separation

Overlapped speech is notoriously problematic for speaker diarization systems. Consequently, the use of speech separation has recently been proposed to improve their performance. Although promising, speech separation models struggle with…

Audio and Speech Processing · Electrical Eng. & Systems 2024-02-02 Elio Gruttadauria, Mathieu Fontaine, Slim Essid

A Real-time Speaker Diarization System Based on Spatial Spectrum

In this paper we describe a speaker diarization system that enables localization and identification of all speakers present in a conversation or meeting. We propose a novel systematic approach to tackle several long-standing challenges in…

Sound · Computer Science 2021-07-21 Siqi Zheng, Weilong Huang, Xianliang Wang, Hongbin Suo, Jinwei Feng, Zhijie Yan

Bayesian Nonparametric Modeling of Driver Behavior using HDP Split-Merge Sampling Algorithm

Modern vehicles are equipped with increasingly complex sensors. These sensors generate large volumes of data that provide opportunities for modeling and analysis. Here, we are interested in exploiting this data to learn aspects of behaviors…

Machine Learning · Statistics 2018-01-30 Vadim Smolyakov, Julian Straub, Sue Zheng, John W. Fisher

Adapting Speaker Embeddings for Speaker Diarisation

The goal of this paper is to adapt speaker embeddings for solving the problem of speaker diarisation. The quality of speaker embeddings is paramount to the performance of speaker diarisation systems. Despite this, prior works in the field…

Audio and Speech Processing · Electrical Eng. & Systems 2021-04-08 Youngki Kwon, Jee-weon Jung, Hee-Soo Heo, You Jin Kim, Bong-Jin Lee, Joon Son Chung

Unsupervised Speaker Diarization in Distributed IoT Networks Using Federated Learning

This paper presents a computationally efficient and distributed speaker diarization framework for networked IoT-style audio devices. The work proposes a Federated Learning model which can identify the participants in a conversation without…

Sound · Computer Science 2024-12-02 Amit Kumar Bhuyan, Hrishikesh Dutta, Subir Biswas

Novel Architectures for Unsupervised Information Bottleneck based Speaker Diarization of Meetings

Speaker diarization is an important problem that is topical, and is especially useful as a preprocessor for conversational speech related applications. The objective of this paper is two-fold: (i) segment initialization by uniformly…

Audio and Speech Processing · Electrical Eng. & Systems 2021-03-05 Nauman Dawalatabad, Srikanth Madikeri, C. Chandra Sekhar, Hema A. Murthy

Bi-LSTM Scoring Based Similarity Measurement with Agglomerative Hierarchical Clustering (AHC) for Speaker Diarization

Majority of speech signals across different scenarios are never available with well-defined audio segments containing only a single speaker. A typical conversation between two speakers consists of segments where their voices overlap,…

Audio and Speech Processing · Electrical Eng. & Systems 2022-05-20 Siddharth S. Nijhawan, Homayoon Beigi

Triplet Network with Attention for Speaker Diarization

In automatic speech processing systems, speaker diarization is a crucial front-end component to separate segments from different speakers. Inspired by the recent success of deep neural networks (DNNs) in semantic inferencing, triplet…

Audio and Speech Processing · Electrical Eng. & Systems 2018-08-07 Huan Song, Megan Willi, Jayaraman J. Thiagarajan, Visar Berisha, Andreas Spanias

Simultaneous Diarization and Separation of Meetings through the Integration of Statistical Mixture Models

We propose an approach for simultaneous diarization and separation of meeting data. It consists of a complex Angular Central Gaussian Mixture Model (cACGMM) for speech source separation, and a von-Mises-Fisher Mixture Model (VMFMM) for…

Audio and Speech Processing · Electrical Eng. & Systems 2025-02-25 Tobias Cord-Landwehr, Christoph Boeddeker, Reinhold Haeb-Umbach

Language Modelling for Speaker Diarization in Telephonic Interviews

The aim of this paper is to investigate the benefit of combining both language and acoustic modelling for speaker diarization. Although conventional systems only use acoustic features, in some scenarios linguistic data contain high…

Audio and Speech Processing · Electrical Eng. & Systems 2025-01-31 Miquel India, Javier Hernando, José A. R. Fonollosa

Compositional embedding models for speaker identification and diarization with simultaneous speech from 2+ speakers

We propose a new method for speaker diarization that can handle overlapping speech with 2+ people. Our method is based on compositional embeddings [1]: Like standard speaker embedding methods such as x-vector [2], compositional embedding…

Sound · Computer Science 2021-02-11 Zeqian Li, Jacob Whitehill

Enhancements for Audio-only Diarization Systems

In this paper two different approaches to enhance the performance of the most challenging component of a Speaker Diarization system are presented, i.e. the speaker clustering part. A processing step is proposed enhancing the input features…

Audio and Speech Processing · Electrical Eng. & Systems 2019-09-04 Dimitrios Dimitriadis