Leveraging Synthetic Audio Data for End-to-End Low-Resource Speech Translation
Abstract
This paper describes our system submission to the International Conference on Spoken Language Translation (IWSLT 2024) for Irish-to-English speech translation. We built end-to-end systems based on Whisper and applied several data augmentation techniques, including speech back-translation and noise augmentation. We investigate the effect of using synthetic audio data and discuss several methods for enriching signal diversity.
- Publication: arXiv e-prints
- Pub Date: June 2024
- DOI: 10.48550/arXiv.2406.17363
- arXiv: arXiv:2406.17363
- Bibcode: 2024arXiv240617363M
- Keywords: Computer Science - Computation and Language; Computer Science - Sound; Electrical Engineering and Systems Science - Audio and Speech Processing
- E-Print: IWSLT 2024