{"id":40782,"date":"2026-05-03T10:27:19","date_gmt":"2026-05-03T08:27:19","guid":{"rendered":"https:\/\/www.graviton.at\/letterswaplibrary\/i-got-tired-of-checking-kaggle-huggingface-data-gov-and-other-sites-every-time-i-needed-a-dataset-so-i-built-a-tool-that-searches-all-of-them-at-once\/"},"modified":"2026-05-03T10:27:19","modified_gmt":"2026-05-03T08:27:19","slug":"i-got-tired-of-checking-kaggle-huggingface-data-gov-and-other-sites-every-time-i-needed-a-dataset-so-i-built-a-tool-that-searches-all-of-them-at-once","status":"publish","type":"post","link":"https:\/\/www.graviton.at\/letterswaplibrary\/i-got-tired-of-checking-kaggle-huggingface-data-gov-and-other-sites-every-time-i-needed-a-dataset-so-i-built-a-tool-that-searches-all-of-them-at-once\/","title":{"rendered":"I Got Tired Of Checking Kaggle, HuggingFace, Data.gov, And Other Sites Every Time I Needed A Dataset, So I Built A Tool That Searches All Of Them At Once"},"content":{"rendered":"<p><!-- SC_OFF --><\/p>\n<div class=\"md\">\n<p>Disclosure: I&#8217;m one of the creators of this tool.<\/p>\n<p>Hi all,<\/p>\n<p>I do ML research at Berkeley and the most tedious part of every project is dataset discovery. I&#8217;d spend hours opening tabs across Kaggle, HuggingFace, <a href=\"http:\/\/data.gov\/\">data.gov<\/a>, Census, WHO, Semantic Scholar, and a dozen other platforms just to find the right data. Then I&#8217;d have to manually check licenses, preview columns, and figure out citations.<\/p>\n<p>So my friend and I built Mobus, an open-source MCP server that lets you do all of that from inside Claude or Cursor. 
You describe what you need in natural language and it searches across 20 platforms, lets you preview the actual data, checks licenses, and generates citations.<\/p>\n<p>It&#8217;s free and open source: <a href=\"https:\/\/github.com\/mobus-ai\/Mobus\">https:\/\/github.com\/mobus-ai\/Mobus<\/a><\/p>\n<p>Quick demo on the site if you want to see it in action: <a href=\"https:\/\/mobus.ai\/\">https:\/\/mobus.ai<\/a><\/p>\n<p>Would love feedback from anyone who deals with this pain point. What data sources are missing that you&#8217;d want to see added?<\/p>\n<\/div>\n<p><!-- SC_ON -->   submitted by   <a href=\"https:\/\/www.reddit.com\/user\/Swimming_Outside_988\"> \/u\/Swimming_Outside_988 <\/a> <br \/> <span><a href=\"https:\/\/www.reddit.com\/r\/datasets\/comments\/1t2fohg\/i_got_tired_of_checking_kaggle_huggingface\/\">[link]<\/a><\/span>   <span><a href=\"https:\/\/www.reddit.com\/r\/datasets\/comments\/1t2fohg\/i_got_tired_of_checking_kaggle_huggingface\/\">[comments]<\/a><\/span><\/p>","protected":false},"excerpt":{"rendered":"<p>Disclosure: I&#8217;m one of the creators of this tool. 
Hi all, I do ML research at Berkeley&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[85],"tags":[],"class_list":["post-40782","post","type-post","status-publish","format-standard","hentry","category-datatards","wpcat-85-id"],"_links":{"self":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts\/40782","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/comments?post=40782"}],"version-history":[{"count":0,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts\/40782\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/media?parent=40782"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/categories?post=40782"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/tags?post=40782"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}