I’ve been messing around with web scraping for a while (mostly extracting data on what software websites are running under the hood).
I decided to clean up some of the data and open-source a sample dataset of 500 companies mapped to the tech they use (Stripe, React, Shopify, AWS, etc.). It’s in CSV/JSON.
It’s not a massive dataset by any means, but I figured it might be handy if you need some real-world data for a side project, for practicing pandas/data analysis, or for testing your own scripts without having to build a scraper from scratch.
Repo is here: https://github.com/leadita/tech-stack-datasets
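If you want a quick starting point for the analysis side, here's a rough sketch that counts how many companies use each technology. The column names (`company`, `technologies`) and the sample rows are made up for illustration, and the comma-separated tech list is an assumption too, so check the actual files in the repo and adjust. It uses only the standard library; the same idea carries over to pandas.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows: the real files may use different
# column names and a different way of listing technologies.
SAMPLE_CSV = """company,technologies
Acme Inc,"Stripe, React, AWS"
Globex,"Shopify, React"
Initech,"AWS, Stripe"
"""


def count_technologies(csv_text):
    """Return a Counter mapping each technology to the number of companies using it."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Split the assumed comma-separated list and drop duplicates per company.
        techs = {t.strip() for t in row["technologies"].split(",") if t.strip()}
        counts.update(techs)
    return counts


counts = count_technologies(SAMPLE_CSV)
for tech, n in counts.most_common():
    print(tech, n)
```

To run it against a real file, replace `SAMPLE_CSV` with the file's contents (for example, read with `open(path, newline="").read()`) and rename the column to match.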
submitted by /u/haynajjar