Hi fellow redditors,
I’m working on a mini-project where I want to build an ASMR text-to-speech model. Since I couldn't find any ASMR datasets, I built a small audio dataset from YouTube videos (about 85 videos, audio extracted as WAV) and got the text either from the videos' transcripts or by running speech-to-text with Whisper. During training, though, a lot of errors popped up because the files vary in size, bitrate, sample rate, etc.
I’d be grateful if you could point me to any existing ASMR dataset with medium- to high-quality short audio clips (under 1 minute) along with text transcripts.
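In case it helps narrow down the format problem, here is a minimal sketch of how the WAV files could be checked for inconsistencies before training. It only uses Python's standard `wave` module, so it assumes plain PCM WAV files. The target values (`TARGET_RATE`, `TARGET_CHANNELS`, `MAX_SECONDS`) and the function names are placeholders I made up for illustration, not what my training setup actually requires.

```python
import wave
from pathlib import Path

# Hypothetical target format; these values are assumptions for illustration.
TARGET_RATE = 22050      # samples per second
TARGET_CHANNELS = 1      # mono
MAX_SECONDS = 60.0       # clips should be under one minute


def describe_wav(path):
    """Read the basic format fields from a PCM WAV header."""
    # wave.open raises wave.Error for non-PCM (e.g. compressed) WAV files.
    with wave.open(str(path), "rb") as wf:
        rate = wf.getframerate()
        frames = wf.getnframes()
        return {
            "path": str(path),
            "sample_rate": rate,
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "duration_s": frames / rate if rate else 0.0,
        }


def find_mismatches(folder):
    """Return (path, reasons) for every .wav in folder that misses the target."""
    problems = []
    for path in sorted(Path(folder).glob("*.wav")):
        info = describe_wav(path)
        reasons = []
        if info["sample_rate"] != TARGET_RATE:
            reasons.append(f"sample rate {info['sample_rate']} != {TARGET_RATE}")
        if info["channels"] != TARGET_CHANNELS:
            reasons.append(f"channels {info['channels']} != {TARGET_CHANNELS}")
        if info["duration_s"] > MAX_SECONDS:
            reasons.append(f"duration {info['duration_s']:.1f}s > {MAX_SECONDS}s")
        if reasons:
            problems.append((info["path"], reasons))
    return problems
```

Calling `find_mismatches("my_wav_folder")` (folder name is hypothetical) lists the files that would need resampling, downmixing, or splitting. It only reports problems; the actual conversion would still have to be done with a separate tool.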
submitted by /u/Available-Deer1723