Multimodal Speaker Diarization Using a Pre-Trained Audio-Visual Synchronization Model

About this item

Full title

Multimodal Speaker Diarization Using a Pre-Trained Audio-Visual Synchronization Model

Publisher

Switzerland: MDPI AG

Journal title

Sensors (Basel, Switzerland), 2019-11, Vol.19 (23), p.5163

Language

English

More information

Scope and Contents

Contents

Speaker diarization systems aim to determine 'who spoke when?' in multi-speaker recordings. Typical datasets consist of meetings, TV/talk shows, telephone calls, and multi-party interaction recordings. In this paper, we propose a novel multimodal speaker diarization technique that finds the active speaker through an audio-visual synchronization model for di...

Identifiers

Primary Identifiers

Record Identifier

TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6929047

Permalink

https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6929047

Other Identifiers

ISSN

1424-8220

E-ISSN

1424-8220

DOI

10.3390/s19235163
