{"id":35328,"date":"2025-09-07T03:27:27","date_gmt":"2025-09-07T01:27:27","guid":{"rendered":"https:\/\/www.graviton.at\/letterswaplibrary\/%f0%9f%93%8a-new-dataset-2-6m-ai-enriched-company-profiles-across-100-industries-jsonl-parquet-csv\/"},"modified":"2025-09-07T03:27:27","modified_gmt":"2025-09-07T01:27:27","slug":"%f0%9f%93%8a-new-dataset-2-6m-ai-enriched-company-profiles-across-100-industries-jsonl-parquet-csv","status":"publish","type":"post","link":"https:\/\/www.graviton.at\/letterswaplibrary\/%f0%9f%93%8a-new-dataset-2-6m-ai-enriched-company-profiles-across-100-industries-jsonl-parquet-csv\/","title":{"rendered":"\ud83d\udcca New Dataset: 2.6M+ AI-enriched Company Profiles Across 100+ Industries (JSONL \/ Parquet \/ CSV)"},"content":{"rendered":"<p><!-- SC_OFF --><\/p>\n<div class=\"md\">\n<p>Hi all,<\/p>\n<p>I\u2019ve been working on a side project where I crawled and AI-enriched over <strong>2.6 million company websites<\/strong> across <strong>111 industries<\/strong> worldwide.<\/p>\n<p><strong>What\u2019s inside:<\/strong><\/p>\n<ul>\n<li>Company name, website, industry<\/li>\n<li>Long + short descriptions (AI-generated)<\/li>\n<li>Enriched metadata (socials, emails, locations where available)<\/li>\n<li>Website screenshots<\/li>\n<li>Delivered in <strong>JSONL, Parquet, and CSV<\/strong> formats<\/li>\n<\/ul>\n<p><strong>Access:<\/strong><\/p>\n<ul>\n<li>A <strong>free sample explorer<\/strong> with 150 companies is live here: <a href=\"https:\/\/ctxdb.ai\/sample-dataset\">https:\/\/ctxdb.ai\/sample-dataset<\/a><\/li>\n<li>Full dataset available for purchase (Q3 2025 edition + Q4 coming soon).<\/li>\n<li>A yearly \u201cMomentum Plan\u201d also refreshes the dataset quarterly with new companies + updated profiles.<\/li>\n<\/ul>\n<p><strong>Why I built this:<\/strong><\/p>\n<p>I wanted an up-to-date, structured dataset useful for:<\/p>\n<ul>\n<li>Lead generation \/ prospecting<\/li>\n<li>Market research &amp; competitive tracking<\/li>\n<li>AI\/ML 
model training<\/li>\n<li>Academic or investment research<\/li>\n<\/ul>\n<p>Happy to hear your thoughts and feedback, including whether API access would be useful. I\u2019m also curious how you\u2019d use a dataset like this.<\/p>\n<\/div>\n<p><!-- SC_ON -->   submitted by   <a href=\"https:\/\/www.reddit.com\/user\/karngyan\"> \/u\/karngyan <\/a> <br \/> <span><a href=\"https:\/\/www.reddit.com\/r\/datasets\/comments\/1nag5zh\/new_dataset_26m_aienriched_company_profiles\/\">[link]<\/a><\/span>   <span><a href=\"https:\/\/www.reddit.com\/r\/datasets\/comments\/1nag5zh\/new_dataset_26m_aienriched_company_profiles\/\">[comments]<\/a><\/span><\/p>","protected":false},"excerpt":{"rendered":"<p>Hi all, I\u2019ve been working on a side project where I crawled and AI-enriched over 2.6 
million&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[85],"tags":[],"class_list":["post-35328","post","type-post","status-publish","format-standard","hentry","category-datatards","wpcat-85-id"],"_links":{"self":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts\/35328","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/comments?post=35328"}],"version-history":[{"count":0,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts\/35328\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/media?parent=35328"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/categories?post=35328"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/tags?post=35328"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}