{"id":39860,"date":"2026-03-23T04:28:17","date_gmt":"2026-03-23T03:28:17","guid":{"rendered":"https:\/\/www.graviton.at\/letterswaplibrary\/self-promotion-paid-i-built-a-1437-column-alternative-financial-dataset-that-fuses-gdelt-news-intelligence-ai-sentiment-and-multi-source-price-at-15-minute-resolution-free-sample-inside\/"},"modified":"2026-03-23T04:28:17","modified_gmt":"2026-03-23T03:28:17","slug":"self-promotion-paid-i-built-a-1437-column-alternative-financial-dataset-that-fuses-gdelt-news-intelligence-ai-sentiment-and-multi-source-price-at-15-minute-resolution-free-sample-inside","status":"publish","type":"post","link":"https:\/\/www.graviton.at\/letterswaplibrary\/self-promotion-paid-i-built-a-1437-column-alternative-financial-dataset-that-fuses-gdelt-news-intelligence-ai-sentiment-and-multi-source-price-at-15-minute-resolution-free-sample-inside\/","title":{"rendered":"[Self-Promotion] [Paid] I Built A 1,437-column Alternative Financial Dataset That Fuses GDELT News Intelligence, AI Sentiment, And Multi-source Price At 15-minute Resolution. Free Sample Inside."},"content":{"rendered":"<p><!-- SC_OFF --><\/p>\n<div class=\"md\">\n<p><a href=\"https:\/\/imgur.com\/IL9hy7s\">Chart overview \u2014 5 panels of real NVDA data<\/a><\/p>\n<p><strong>What it is<\/strong><\/p>\n<p>ULTRA is a flat CSV dataset that aligns three data layers on the same 15-minute timestamp:<\/p>\n<ul>\n<li><strong>GDELT<\/strong> (~1,256 cols): The full GCAM emotional spectrum \u2014 WordNet Affect, SentiWordNet, Harvard IV, AFINN, Loughran-McDonald financial sentiment, Moral Foundations, plus geopolitical events (GoldsteinScale, QuadClass, CAMEO codes), media mentions, entity extraction, and macro themes.<\/li>\n<li><strong>AI Analysis<\/strong> (18 cols): Contextual sentiment from Gemini \u2014 not word-counting, but actual comprehension of <em>why<\/em> sentiment is negative (export controls vs earnings miss vs CEO departure). 
Includes impact, novelty, actionability, narrative codes, and binary flags.<\/li>\n<li><strong>Price<\/strong> (16 cols): Multi-source OHLCV from Polygon.io + Twelve Data, VWAP, trade count, cross-source mean and spread, 15-min return.<\/li>\n<\/ul>\n<p>96 timestamps per day. Currently covering the Magnificent Seven (AAPL, AMZN, GOOG, META, MSFT, NVDA, TSLA).<\/p>\n<p><strong>Free sample + data dictionary<\/strong><\/p>\n<p>Full day of NVDA data (Jan 2, 2026) \u2014 all 1,437 columns, 96 rows. No paywall, no signup.<\/p>\n<p>\u2192 <strong>Sample CSV:<\/strong> <a href=\"https:\/\/marketsignal.solutions\/data\/samples\/ULTRA_sample_NVDA.csv\">marketsignal.solutions\/data\/samples\/ULTRA_sample_NVDA.csv<\/a> \u2192 <strong>Data Dictionary:<\/strong> <a href=\"https:\/\/marketsignal.solutions\/data\/samples\/ULTRA_DataDictionary.txt\">marketsignal.solutions\/data\/samples\/ULTRA_DataDictionary.txt<\/a><\/p>\n<p><strong>Quick load:<\/strong><\/p>\n<pre><code>import pandas as pd\n\ndf = pd.read_csv(\"ULTRA_sample_NVDA.csv\")\nprint(f\"{df.shape[1]} columns, {df.shape[0]} timestamps\")\n\n# AI sentiment + price at market open\ncols = [\"meta_timestamp\", \"ai_sentiment_score\", \"ai_impact_score\", \"ai_narrative_primary_code\", \"poly_close\", \"price_return_15m\"]\nprint(df[df[\"poly_close\"].notna()][cols].head(10).to_string(index=False))\n<\/code><\/pre>\n<p><strong>Why I built it<\/strong><\/p>\n<p>GDELT is incredible \u2014 it&#8217;s the world&#8217;s largest open news database. But it&#8217;s raw, unfiltered, and has no ticker mapping. If you want to use it for quant research, you need months of pipeline engineering just to get it into a usable format.<\/p>\n<p>I built the pipeline that: 1. Ingests 3 GDELT streams every 15 minutes (GKG, Events, Mentions) 2. Matches articles to S&amp;P 100 tickers via org-name resolution 3. Parses all 1,256 GCAM dimensions per ticker 4. Runs Gemini AI on every batch for contextual analysis 5. 
Fuses with multi-source verified price data<\/p>\n<p>The result is a single CSV you can <code>pd.read_csv()<\/code> and start researching.<\/p>\n<p><strong>What I&#8217;m NOT claiming<\/strong><\/p>\n<ul>\n<li>This is not &#8220;beat the market&#8221; data. It&#8217;s research-grade alternative data.<\/li>\n<li>GDELT is open\/public \u2014 I didn&#8217;t create it. I created the pipeline, the AI layer, and the fusion.<\/li>\n<li>Coverage is currently 7 tickers (Mag 7). S&amp;P 100 expansion is in progress.<\/li>\n<li>The AI layer depends on Gemini \u2014 it&#8217;s contextual NLP, not proprietary.<\/li>\n<\/ul>\n<p><strong>Pricing<\/strong><\/p>\n<p>$99\/month for the Mag 7 live feed. Details at <a href=\"https:\/\/marketsignal.solutions\/\">marketsignal.solutions<\/a>.<\/p>\n<p>Happy to answer any questions about the data, the pipeline, or the methodology.<\/p>\n<hr \/>\n<p><em>This dataset is for research purposes. Past patterns do not guarantee future performance.<\/em><\/p>\n<\/div>\n<p><!-- SC_ON -->   submitted by   <a href=\"https:\/\/www.reddit.com\/user\/SuggestionDry6614\"> \/u\/SuggestionDry6614 <\/a> <br \/> <span><a href=\"https:\/\/www.reddit.com\/r\/datasets\/comments\/1s156dd\/selfpromotion_paid_i_built_a_1437column\/\">[link]<\/a><\/span>   <span><a href=\"https:\/\/www.reddit.com\/r\/datasets\/comments\/1s156dd\/selfpromotion_paid_i_built_a_1437column\/\">[comments]<\/a><\/span><\/p>","protected":false},"excerpt":{"rendered":"<p>Chart 
overview \u2014 5 panels of real NVDA data What it is ULTRA is a flat CSV&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[85],"tags":[],"class_list":["post-39860","post","type-post","status-publish","format-standard","hentry","category-datatards","wpcat-85-id"],"_links":{"self":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts\/39860","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/comments?post=39860"}],"version-history":[{"count":0,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts\/39860\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/media?parent=39860"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/categories?post=39860"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/tags?post=39860"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}