[DATASET] Polymarket Prediction Market: 5.5 Billion Tick-level Orderbook Records, 21 Days, L2 Depth Snapshots, Trade Executions, Resolution Labels (CC-BY-NC-4.0)

I published a large-scale, tick-level dataset from Polymarket, the largest prediction market. It should be useful for microstructure research, market-efficiency studies, and ML on event-driven markets.

Scale:

Metric               Count
Orderbook ticks      5,555,777,555
L2 depth snapshots   51,674,425
Trade executions     4,126,076
Markets tracked      123,895
Resolved markets     23,146
ML feature bars      5,587,547
Coverage             21 continuous days
Null values          0

Format: Daily Parquet files (ZSTD compressed), around 40 GB total. Includes pre-built 1-minute bar features with L2 depth imbalance ready for ML training on Kaggle’s free tier.
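
For anyone unfamiliar with the L2 depth imbalance feature mentioned above, here is a minimal sketch of one common definition: (bid depth - ask depth) / (bid depth + ask depth) over the top few price levels. The exact formula and column layout used in the dataset are not stated in this post, so the function name, the `(price, size)` tuple layout, and the `levels` parameter are all assumptions for illustration.

```python
from typing import List, Tuple

# Hypothetical layout: each book side is a list of (price, size) tuples,
# ordered from best price outward. The real dataset schema may differ.
Level = Tuple[float, float]


def depth_imbalance(bids: List[Level], asks: List[Level], levels: int = 5) -> float:
    """Return (bid_depth - ask_depth) / (bid_depth + ask_depth) over the top `levels`.

    The result lies in [-1, 1]: positive means more resting size on the bid side.
    Returns 0.0 when both sides are empty.
    """
    bid_depth = sum(size for _, size in bids[:levels])
    ask_depth = sum(size for _, size in asks[:levels])
    total = bid_depth + ask_depth
    if total == 0:
        return 0.0
    return (bid_depth - ask_depth) / total


# Synthetic snapshot for a binary market priced between 0 and 1.
example_bids = [(0.52, 300.0), (0.51, 100.0)]
example_asks = [(0.53, 100.0), (0.54, 100.0)]
example_imbalance = depth_imbalance(example_bids, example_asks)
```

With the synthetic snapshot above, bid depth is 400 and ask depth is 200, so the imbalance is 200 / 600, or roughly 0.33.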

License: CC-BY-NC-4.0 (non-commercial/academic)

Link: https://www.kaggle.com/datasets/marvingozo/polymarket-tick-level-orderbook-dataset

Use cases: HFT signal detection, market maker strategy research, prediction efficiency studies, order flow toxicity (VPIN), cross-market correlation, event study analysis.
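
As a rough illustration of the order flow toxicity use case, below is a simplified VPIN sketch. It groups trades into equal-volume buckets and averages the absolute buy/sell imbalance over a rolling window of buckets. Two assumptions to flag: it takes an explicit `"buy"`/`"sell"` side label per trade (whether the dataset provides one is not stated here), whereas the original VPIN formulation classifies volume in bulk from price changes; and the `(side, size)` tuple layout is hypothetical.

```python
from typing import Iterable, List, Tuple

Trade = Tuple[str, float]  # hypothetical layout: (side, size), side is "buy" or "sell"

_EPS = 1e-12  # tolerance for floating-point leftovers when filling buckets


def volume_buckets(trades: Iterable[Trade], bucket_volume: float) -> List[Tuple[float, float]]:
    """Split trades into equal-volume buckets; return (buy_vol, sell_vol) per full bucket.

    A trade that overflows a bucket is split across consecutive buckets.
    A trailing, partially filled bucket is discarded.
    """
    if bucket_volume <= 0:
        raise ValueError("bucket_volume must be positive")
    buckets: List[Tuple[float, float]] = []
    buy = sell = filled = 0.0
    for side, size in trades:
        remaining = size
        while remaining > _EPS:
            take = min(remaining, bucket_volume - filled)
            if side == "buy":
                buy += take
            else:
                sell += take
            filled += take
            remaining -= take
            if bucket_volume - filled <= _EPS:  # bucket is full, start a new one
                buckets.append((buy, sell))
                buy = sell = filled = 0.0
    return buckets


def vpin(trades: Iterable[Trade], bucket_volume: float, window: int) -> List[float]:
    """Rolling VPIN: sum of |buy - sell| over `window` buckets / (window * bucket_volume)."""
    imbalances = [abs(b - s) for b, s in volume_buckets(trades, bucket_volume)]
    return [
        sum(imbalances[i - window:i]) / (window * bucket_volume)
        for i in range(window, len(imbalances) + 1)
    ]


# Synthetic trades: two one-sided buckets followed by one balanced bucket.
example_trades = [("buy", 10.0), ("sell", 10.0), ("buy", 5.0), ("sell", 5.0)]
example_vpin = vpin(example_trades, bucket_volume=10.0, window=2)
```

On the synthetic trades, the three buckets have imbalances 10, 10, and 0, so the two rolling values are 1.0 and 0.5. Values near 1 indicate heavily one-sided flow; values near 0 indicate balanced flow.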

submitted by /u/Upset-Fly-454