Help Creating A Deepfake Audio Dataset?

Hey everyone,

I’m building a deepfake audio dataset and wanted to get some advice on best practices. I want to make sure the dataset is diverse and representative enough to train an effective detection model.

Some questions I have:

How many speakers should I aim for to get a balanced dataset?

Should I maintain an equal gender ratio, or does it not make a difference?

How much audio is enough from each source (minutes, hours)?

Any recommended sources or strategies for collecting high-quality real audio?

What sample rate should I use (e.g., 16 kHz, 44.1 kHz, 48 kHz), or what mix of rates?

Are certain codecs (e.g., MP3, AAC, Opus, WAV) more challenging for detection models?

Would love to hear from anyone who has experience with this.

submitted by /u/Fuzzy_Cream_5073