Multimodal Speaker Diarization Using a Pre-Trained Audio-Visual Synchronization Model
About this item
Publisher
Switzerland: MDPI AG
Language
English
More information
Scope and Contents
Contents
Speaker diarization systems aim to find 'who spoke when?' in multi-speaker recordings. The dataset usually consists of meetings, TV/talk shows, telephone and multi-party interaction recordings. In this paper, we propose a novel multimodal speaker diarization technique, which finds the active speaker through an audio-visual synchronization model for di...
Alternative Titles
Full title
Multimodal Speaker Diarization Using a Pre-Trained Audio-Visual Synchronization Model
Identifiers
Primary Identifiers
Record Identifier
TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6929047
Permalink
https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6929047
Other Identifiers
ISSN
1424-8220
E-ISSN
1424-8220
DOI
10.3390/s19235163