{"id":33427,"date":"2025-04-09T08:27:17","date_gmt":"2025-04-09T06:27:17","guid":{"rendered":"https:\/\/www.graviton.at\/letterswaplibrary\/jfk-tell-hf-dataset-for-jfk-assassination-records\/"},"modified":"2025-04-09T08:27:17","modified_gmt":"2025-04-09T06:27:17","slug":"jfk-tell-hf-dataset-for-jfk-assassination-records","status":"publish","type":"post","link":"https:\/\/www.graviton.at\/letterswaplibrary\/jfk-tell-hf-dataset-for-jfk-assassination-records\/","title":{"rendered":"JFK-TELL: HF Dataset For JFK Assassination Records"},"content":{"rendered":"<p><!-- SC_OFF --><\/p>\n<div class=\"md\">\n<p>The JFK assassination has remained an impenetrable mystery even after decades of investigations by premier agencies, the media, and ordinary people. A large-scale analysis of the assassination records may offer new clues and help substantiate or refute some of the theories. About six million files related to the event are to be made public through <a href=\"https:\/\/www.archives.gov\/research\/jfk\">archives.gov<\/a> over time.<\/p>\n<p>I am releasing <a href=\"https:\/\/huggingface.co\/datasets\/farhanhubble\/jfk-tell\">JFK-TELL<\/a>, a dataset I generated by extracting text from the scanned PDFs of the assassination records released through April 2025. The extraction used the Google Gemini LLM API to generate Markdown text from a very simple prompt. For the detailed methodology, check out the GitHub <a href=\"https:\/\/github.com\/farhanhubble\/jfk-tell\">repo<\/a>.<\/p>\n<p>I plan to index this data with a RAG system and analyze it later. 
In the meantime, writers, journalists, computational linguists, and data scientists can try their hand at exploring the breadth and variety of this data.<\/p>\n<\/div>\n<p><!-- SC_ON -->   submitted by   <a href=\"https:\/\/www.reddit.com\/user\/farhanhubble\"> \/u\/farhanhubble <\/a> <br \/> <span><a href=\"https:\/\/www.reddit.com\/r\/datasets\/comments\/1juz2y8\/jfktell_hf_dataset_for_jfk_assassination_records\/\">[link]<\/a><\/span>   <span><a href=\"https:\/\/www.reddit.com\/r\/datasets\/comments\/1juz2y8\/jfktell_hf_dataset_for_jfk_assassination_records\/\">[comments]<\/a><\/span><\/p>","protected":false},"excerpt":{"rendered":"<p>The JFK assassination has remained an impenetrable mystery even after decades of investigations by premier agencies, 
the&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[85],"tags":[],"class_list":["post-33427","post","type-post","status-publish","format-standard","hentry","category-datatards","wpcat-85-id"],"_links":{"self":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts\/33427","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/comments?post=33427"}],"version-history":[{"count":0,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts\/33427\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/media?parent=33427"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/categories?post=33427"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/tags?post=33427"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}