Libri-Light

Speech Recognition, English

Created by Kahn et al. in 2019, Libri-Light is an English speech recognition dataset containing 60,000 hours of audio in FLAC and JSON formats.

About Libri-Light

The dataset contains 60K hours of unlabelled speech from English audiobooks, plus small labelled subsets (10 h, 1 h, and 10 min).

Details

Task: Speech Recognition
Language: English
Format: FLAC, JSON
Size: 60,000 hours
Creator: Kahn et al.
Year: 2019