I’ve spent a few months running OpenAI Whisper on the available episodes of The Alex Jones Show, and was pointed to this subreddit by u/UglyChihuahua. I used the medium English model, as that’s all I had GPU memory for, and fell back to Whisper.cpp with the large model when the medium model got confused.
The result is about 1.2GB of text with timestamps.
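For anyone curious what producing timestamped text like this can look like, here is a minimal sketch using the `openai-whisper` Python package. It is an illustration only, not necessarily the script used for these transcripts: the output line format, the helper names `format_timestamp`, `segments_to_lines`, and `transcribe_episode`, and the file name in the usage note are all made up for the example.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS.mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def segments_to_lines(segments):
    """Turn Whisper-style segments (dicts with start, end, text) into text lines."""
    return [
        f"[{format_timestamp(s['start'])} --> {format_timestamp(s['end'])}] "
        f"{s['text'].strip()}"
        for s in segments
    ]


def transcribe_episode(path: str, model_name: str = "medium.en"):
    """Transcribe one audio file and return timestamped lines.

    Assumes the openai-whisper package and ffmpeg are installed.
    The import is kept inside the function so the helpers above
    work without Whisper being available.
    """
    import whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(path)
    return segments_to_lines(result["segments"])
```

Calling `transcribe_episode("episode.mp3")` (a placeholder file name) would return one line per segment, each starting with its start and end time.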
I’ve added all the transcripts to a GitHub repository, and also created a simple website with search, basic stats, and links to the relevant audio clips.
submitted by /u/fudgie