Speech datasets obtained from the Living Audio Datasets project:
-
Irish
Source: audio, transcripts
Original data: audio (273MB), transcripts
Coqui STT (95MB)
Festvox (95MB)
The convert.sh script converts the original data into the Festvox and Coqui STT formats (it uses wai.annotations for the latter).
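
For orientation, the two target layouts can be sketched in plain Python. This is not what convert.sh does (it relies on wai.annotations for the Coqui STT output); it is a minimal illustration that assumes you already have a list of (wav path, transcript) pairs. The function names `write_coqui_csv` and `write_festvox` and the output file names are hypothetical. The Coqui STT manifest is assumed to be a CSV with the columns `wav_filename`, `wav_filesize`, `transcript`, and the Festvox prompt file is assumed to follow the `txt.done.data` convention of one `( utt_id "text" )` entry per line.

```python
import csv
import os
from pathlib import Path


def write_coqui_csv(pairs, csv_path):
    """Write (wav_path, transcript) pairs as a Coqui STT style CSV manifest.

    Hypothetical helper: assumes the columns wav_filename, wav_filesize, transcript.
    """
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["wav_filename", "wav_filesize", "transcript"])
        for wav_path, transcript in pairs:
            # File size in bytes is read from disk, so the WAV file must exist.
            writer.writerow([str(wav_path), os.path.getsize(wav_path), transcript])


def write_festvox(pairs, out_path):
    """Write the same pairs in Festvox txt.done.data style: ( utt_id "text" ).

    Hypothetical helper: the utterance ID is taken from the WAV file name.
    """
    with open(out_path, "w", encoding="utf-8") as f:
        for wav_path, transcript in pairs:
            utt_id = Path(wav_path).stem
            # Escape double quotes so the quoted prompt stays well-formed.
            escaped = transcript.replace('"', '\\"')
            f.write(f'( {utt_id} "{escaped}" )\n')
```

A call such as `write_coqui_csv(pairs, "train.csv")` followed by `write_festvox(pairs, "txt.done.data")` would produce both files from the same list of pairs; the file names here are placeholders, not the names used in the published archives.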
License