{"id":23123,"date":"2023-10-17T14:27:53","date_gmt":"2023-10-17T12:27:53","guid":{"rendered":"https:\/\/www.graviton.at\/letterswaplibrary\/looking-for-datasets-with-tons-of-doc-and-docx-files\/"},"modified":"2023-10-17T14:27:53","modified_gmt":"2023-10-17T12:27:53","slug":"looking-for-datasets-with-tons-of-doc-and-docx-files","status":"publish","type":"post","link":"https:\/\/www.graviton.at\/letterswaplibrary\/looking-for-datasets-with-tons-of-doc-and-docx-files\/","title":{"rendered":"Looking For Datasets With Tons Of Doc And Docx Files"},"content":{"rendered":"<p><!-- SC_OFF --><\/p>\n<div class=\"md\">\n<p>Hi, I&#8217;m planning to build an application to help people with dyslexia get better access to documents using ML. I am looking for datasets of .doc and .docx files whose text may be hard for dyslexic readers, to use as training data for my ML model. Can someone help me find such datasets?<\/p>\n<\/div>\n<p><!-- SC_ON -->   submitted by   <a href=\"https:\/\/www.reddit.com\/user\/Bitter-Name-6594\"> \/u\/Bitter-Name-6594 <\/a> <br \/> <span><a href=\"https:\/\/www.reddit.com\/r\/datasets\/comments\/179wy92\/looking_for_datasets_with_tons_of_doc_and_docx\/\">[link]<\/a><\/span>   <span><a href=\"https:\/\/www.reddit.com\/r\/datasets\/comments\/179wy92\/looking_for_datasets_with_tons_of_doc_and_docx\/\">[comments]<\/a><\/span><\/p>","protected":false},"excerpt":{"rendered":"<p>Hi, I&#8217;m planning to build an application to help 
people with dyslexia get better access to documents&#8230;<\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[85],"tags":[],"class_list":["post-23123","post","type-post","status-publish","format-standard","hentry","category-datatards","wpcat-85-id"],"_links":{"self":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts\/23123","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/comments?post=23123"}],"version-history":[{"count":0,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts\/23123\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/media?parent=23123"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/categories?post=23123"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/tags?post=23123"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}