{"id":41398,"date":"2026-06-18T13:27:23","date_gmt":"2026-06-18T11:27:23","guid":{"rendered":"https:\/\/www.graviton.at\/letterswaplibrary\/wildvid-lip-a-lip-reading-dataset\/"},"modified":"2026-06-18T13:27:23","modified_gmt":"2026-06-18T11:27:23","slug":"wildvid-lip-a-lip-reading-dataset","status":"publish","type":"post","link":"https:\/\/www.graviton.at\/letterswaplibrary\/wildvid-lip-a-lip-reading-dataset\/","title":{"rendered":"WildVid-Lip &#8212; A Lip Reading Dataset"},"content":{"rendered":"<p><!-- SC_OFF --><\/p>\n<div class=\"md\">\n<p><strong>Helloo<\/strong><\/p>\n<p>I have been working in the field of lip reading for a while now. Currently there are about 100,000 videos, each with a YouTube ID and the start and end time of the clip. Because we cannot share the actual video clips from YouTube, I am working to reduce the friction of using the dataset by adding download scripts and the actual transcripts in the near future.<\/p>\n<p>I have transcripts ready for about 80,000 videos. The rest are yet to be made, but since the dataset is constantly expanding (around 150,000 by the end of the day), transcripts will lag behind until I am done with the actual videos.<\/p>\n<p>I am also trying to figure out how to <strong>not<\/strong> get rate-limited when downloading the videos from YouTube using yt-dlp. If anyone knows, please enlighten me a bit \ud83d\ude42.<\/p>\n<p>My core aim is to make this a standard like LRS2, LRW, LRS3, etc.<\/p>\n<p>I will soon add a commercial subset to the dataset, made from YouTube videos that specifically allow commercial use, so if someone wants to build a hardware product on it and bring it to market, they can wholeheartedly do so :D.<\/p>\n<p>That&#8217;s mostly it.<\/p>\n<p>Have a look at the dataset if you would like to \ud83d\ude00<\/p>\n<p><a href=\"http:\/\/huggingface.co\/datasets\/Rizul2159\/WildVid-LIP\">huggingface.co\/datasets\/Rizul2159\/WildVid-LIP<\/a><\/p>\n<p>There isn&#8217;t much on it right now. 
Just a CSV file with 115k videos with their IDs and timestamps, but soon there will be a lot more than that.<\/p>\n<\/div>\n<p><!-- SC_ON -->   submitted by   <a href=\"https:\/\/www.reddit.com\/user\/Historical_Pin1429\"> \/u\/Historical_Pin1429 <\/a> <br \/> <span><a href=\"https:\/\/www.reddit.com\/r\/datasets\/comments\/1u93i95\/wildvidlip_a_lip_reading_dataset\/\">[link]<\/a><\/span>   <span><a href=\"https:\/\/www.reddit.com\/r\/datasets\/comments\/1u93i95\/wildvidlip_a_lip_reading_dataset\/\">[comments]<\/a><\/span><\/p>","protected":false},"excerpt":{"rendered":"<p>Helloo I have been working in the field of lip reading for a while now. 
Currently there&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[85],"tags":[],"class_list":["post-41398","post","type-post","status-publish","format-standard","hentry","category-datatards","wpcat-85-id"],"_links":{"self":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts\/41398","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/comments?post=41398"}],"version-history":[{"count":0,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts\/41398\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/media?parent=41398"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/categories?post=41398"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/tags?post=41398"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}