The DKU-DukeECE Diarization System for the VoxCeleb Speaker Recognition Challenge 2022

doi:10.48550/arXiv.2210.01677

This paper describes the DKU-DukeECE submission to the 4th track of the VoxCeleb Speaker Recognition Challenge 2022 (VoxSRC-22). Our system contains a fused voice activity detection model, a clustering-based diarization model, and a target-speaker voice activity detection-based overlap detection model. Overall, the submitted system is similar to our VoxSRC-21 system from the previous year. The difference is that we use a much better speaker embedding and a fused voice activity detection model, which together significantly improve performance. Finally, we fuse 4 different systems using DOVER-Lap and achieve a diarization error rate of 4.75%, which ranks first in track 4.

Publication: arXiv e-prints

Pub Date: October 2022

DOI: 10.48550/arXiv.2210.01677

arXiv: arXiv:2210.01677

Bibcode: 2022arXiv221001677W

Keywords: Electrical Engineering and Systems Science - Audio and Speech Processing

E-Print: arXiv admin note: substantial text overlap with arXiv:2109.02002
