Free Cross-Lingual Acoustic Feature Database For Tabular ML And Emotion Recognition

So I have a free-to-use, 7-language macro-prosody sample pack for the community to play with, and I'd love feedback. There is no audio: it is voice telemetry on 7 languages, normalized and graded. It should be good for building emotive TTS, benchmarking less common languages, cross-linguistic comparison, etc.

90+ languages available for possible licensing.

https://huggingface.co/datasets/vadette/macro_prosody_sample_set

This pack was selected to span typologically distinct language families and speech types:

Korean is a language isolate with phrase-final focus marking and complex mora timing — a useful contrast to the stress-timed Indo-Aryan languages.

Hindi is the largest corpus here and provides strong statistical power for Indo-Aryan prosody baselines.

Hebrew is a Semitic language with root-and-pattern morphology (VSO in its classical form, predominantly SVO in modern speech); the high metadata coverage makes it useful for demographic-stratified analyses.

Manx is a Celtic revival language with a tiny native speaker community. The 98% PRISTINE rate reflects the controlled recording conditions of motivated community contributors.

Tzeltal is a Mayan language with ergative-absolutive alignment and a distinctive tonal register system. It is rarely represented in acoustic datasets.

Maguindanao (SPS2) is spontaneous speech from a Philippine Austronesian language. The T2-heavy distribution reflects the naturalistic recording conditions of the SPS2 corpus.

Lasi (SPS2) is a Sindhi variety spoken in Balochistan. Shorter median clip duration (3.4s vs 5–6s for CV24 languages) reflects the spontaneous speech format.
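
If you want to check per-language numbers like the PRISTINE rate or median clip duration yourself, here is a minimal sketch of that kind of summary. It is only an illustration: the field names (`language`, `grade`, `duration_s`) and the toy rows are made up for the example and are not the pack's actual schema or values, so swap in the real column names after loading the files.

```python
from collections import Counter, defaultdict
from statistics import median

# Toy rows standing in for the real tables. Field names and values are
# hypothetical placeholders, not taken from the dataset.
rows = [
    {"language": "manx", "grade": "PRISTINE", "duration_s": 5.0},
    {"language": "manx", "grade": "PRISTINE", "duration_s": 6.0},
    {"language": "manx", "grade": "PRISTINE", "duration_s": 5.5},
    {"language": "manx", "grade": "T2", "duration_s": 4.5},
    {"language": "lasi", "grade": "T2", "duration_s": 3.0},
    {"language": "lasi", "grade": "T2", "duration_s": 4.0},
]


def summarize(rows):
    """Return clip count, median duration and grade shares per language."""
    by_lang = defaultdict(list)
    for row in rows:
        by_lang[row["language"]].append(row)

    summary = {}
    for lang, items in by_lang.items():
        grades = Counter(item["grade"] for item in items)
        summary[lang] = {
            "n": len(items),
            "median_duration_s": median(item["duration_s"] for item in items),
            # Fraction of this language's clips in each grade.
            "grade_share": {g: c / len(items) for g, c in grades.items()},
        }
    return summary


summary = summarize(rows)
for lang, stats in sorted(summary.items()):
    print(lang, stats)
```

The same grouping works with pandas `groupby` if you prefer; plain Python is used here just to keep the sketch dependency-free.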

submitted by /u/Wooden_Leek_7258