Benchmarks & Evaluation | Non-Commercial

Turkmen Speech Dataset

by rozumov

Bronze 44
110.9K downloads
0 likes
Size: 100K<n<1M

Description

Turkmen Speech Dataset (ASR)

This dataset contains 251 hours of Turkmen speech audio with transcriptions, intended for training and evaluating Automatic Speech Recognition (ASR) models. It is one of the largest publicly available Turkmen speech datasets.

Dataset Overview

Property          Value
Total clips       119,847
Total duration    251.86 hours
Sampling rate     16,000 Hz
Language          Turkmen (tk)
Split             train

Each item includes:

audio: waveform + sampling rate
text:…

See the full description on the dataset page: https://huggingface.co/datasets/rozumov/TurkmenSpeech.
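The figures in the description can be sanity-checked with a little arithmetic, and the items can be previewed without downloading the whole corpus. The sketch below is illustrative only: the helper names (`mean_clip_seconds`, `samples_to_seconds`, `preview_transcripts`) are hypothetical, the use of the Hugging Face `datasets` library with streaming is an assumption, and the `text` field name is taken from the truncated item description, so it should be confirmed on the dataset page.

```python
"""Back-of-envelope statistics and a hedged loading sketch for TurkmenSpeech."""

# Figures copied from the dataset description.
TOTAL_CLIPS = 119_847
TOTAL_HOURS = 251.86
SAMPLING_RATE_HZ = 16_000


def mean_clip_seconds(total_hours: float = TOTAL_HOURS,
                      total_clips: int = TOTAL_CLIPS) -> float:
    """Average clip length in seconds implied by the stated totals."""
    return total_hours * 3600 / total_clips


def samples_to_seconds(num_samples: int,
                       sampling_rate: int = SAMPLING_RATE_HZ) -> float:
    """Duration of a waveform with `num_samples` samples at `sampling_rate` Hz."""
    return num_samples / sampling_rate


def preview_transcripts(n: int = 3) -> list[str]:
    """Return the first `n` transcriptions by streaming the train split.

    Assumptions: the third-party `datasets` package is installed, network
    access is available, and each item exposes a `text` field. Iterating may
    also require an audio decoding backend, depending on the library version.
    """
    # Imported lazily so the arithmetic helpers work without the library.
    from datasets import load_dataset

    stream = load_dataset("rozumov/TurkmenSpeech", split="train", streaming=True)
    return [example["text"] for example in stream.take(n)]


if __name__ == "__main__":
    print(f"Mean clip length: {mean_clip_seconds():.2f} s")
```

With the stated totals, the mean clip length works out to about 7.6 seconds. Streaming is used here as a design choice so that a quick preview does not pull all 251.86 hours of audio to disk.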


Tags

task_categories:automatic-speech-recognition
task_categories:text-to-speech
language:tk
license:cc-by-nc-4.0
size_categories:100K<n<1M
region:us