The Volcspeech System For The ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge
2022 · Chen Shen, Yi Liu, Wenzhi Fan, et al.
Abstract
This paper describes our submission to the ICASSP 2022 Multi-channel Multi-party Meeting Transcription (M2MeT) Challenge. For Track 1, we propose several approaches that enable the clustering-based speaker diarization system to handle overlapped speech. Front-end dereverberation and direction-of-arrival (DOA) estimation are used to improve the accuracy of speaker diarization. Multi-channel combination and overlap detection are applied to reduce the missed speaker error. A modified DOVER-Lap is also proposed to fuse the results of different systems. We achieve a final diarization error rate (DER) of 5.79% on the Eval set and 7.23% on the Test set. For Track 2, we develop our system using the Conformer model in a joint CTC-attention architecture. Serialized output training is adopted for multi-speaker overlapped speech recognition. We propose a neural front-end module to model multi-channel audio and train the model end-to-end. Various data augmentation methods are utilized to mitigate over-fitting in the multi-
Related papers
- The USTC-Ximalaya System For The ICASSP 2022 Multi-channel Multi-party Meeting Transcription (M2MeT) Challenge (2022)
- Royalflush Speaker Diarization System For ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (2022)
- The CUHK-TENCENT Speaker Diarization System For The ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (2022)
- Summary On The ICASSP 2022 Multi-channel Multi-party Meeting Transcription Grand Challenge (2022)
- The Xmuspeech System For Multi-channel Multi-party Meeting Transcription Challenge (2022)
- Cross-channel Attention-based Target Speaker Voice Activity Detection: Experimental Results For M2MeT Challenge (2022)
- The Second Multi-channel Multi-party Meeting Transcription Challenge (M2MeT 2.0): A Benchmark For Speaker-attributed ASR (2023)
- Microsoft Speaker Diarization System For The VoxCeleb Speaker Recognition Challenge 2020 (2020)