{"id":40959,"date":"2026-05-13T23:27:06","date_gmt":"2026-05-13T21:27:06","guid":{"rendered":"https:\/\/www.graviton.at\/letterswaplibrary\/20k-reddit-crypto-sentiment-dataset-with-bitcoin-market-labels\/"},"modified":"2026-05-13T23:27:06","modified_gmt":"2026-05-13T21:27:06","slug":"20k-reddit-crypto-sentiment-dataset-with-bitcoin-market-labels","status":"publish","type":"post","link":"https:\/\/www.graviton.at\/letterswaplibrary\/20k-reddit-crypto-sentiment-dataset-with-bitcoin-market-labels\/","title":{"rendered":"20k Reddit Crypto Sentiment Dataset With Bitcoin Market Labels"},"content":{"rendered":"<p><!-- SC_OFF --><\/p>\n<div class=\"md\">\n<p>I recently created my first public dataset focused on cryptocurrency sentiment analysis and Bitcoin market forecasting. The dataset contains around 20,000 Reddit posts collected from major crypto communities between 2017 and 2025 using PRAW (the Python Reddit API Wrapper).<\/p>\n<p>It includes:<\/p>\n<ul>\n<li>Reddit post metadata<\/li>\n<li>Cleaned text features<\/li>\n<li>Crypto-enhanced VADER sentiment<\/li>\n<li>Custom FinBERT sentiment scores<\/li>\n<li>Bitcoin prices and returns<\/li>\n<li>Binary BTC movement labels for 1h, 6h, 12h, and 24h horizons<\/li>\n<\/ul>\n<p>The dataset was built for financial NLP, sentiment analysis, and forecasting research. 
I am still learning dataset engineering and would appreciate feedback, suggestions, or ideas for improvement.<\/p>\n<\/div>\n<p><!-- SC_ON -->   submitted by   <a href=\"https:\/\/www.reddit.com\/user\/Cyclo_Studios\"> \/u\/Cyclo_Studios <\/a> <br \/> <span><a href=\"https:\/\/www.kaggle.com\/datasets\/shisha01\/reddit-crypto-sentiment-and-market-trend-dataset\">[link]<\/a><\/span>   <span><a href=\"https:\/\/www.reddit.com\/r\/datasets\/comments\/1tccp9f\/20k_reddit_crypto_sentiment_dataset_with_bitcoin\/\">[comments]<\/a><\/span><\/p>","protected":false},"excerpt":{"rendered":"<p>I recently created my first public dataset focused on cryptocurrency sentiment analysis and Bitcoin market forecasting. 
The&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[85],"tags":[],"class_list":["post-40959","post","type-post","status-publish","format-standard","hentry","category-datatards","wpcat-85-id"],"_links":{"self":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts\/40959","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/comments?post=40959"}],"version-history":[{"count":0,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts\/40959\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/media?parent=40959"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/categories?post=40959"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/tags?post=40959"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}