I couldn’t find a good dataset that mapped the “Skills Gap” between university and industry, so I built a local scraper to create one.
The Data:
- Volume: ~52,000 threads.
- Fields: Title, Body, Top Comments, Sentiment.
- Focus: Keywords relating to “Exams” vs “Workplace Tools”.
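To make the record shape and the keyword focus concrete, here is a rough sketch of how one thread could be tagged. This is not ORION's actual code: the `Thread` class, the `focus` helper, the sentiment type, and both keyword lists are made up for illustration.

```python
import re
from dataclasses import dataclass, field

# Hypothetical keyword lists; the real lists are not given in this post.
EXAM_TERMS = ("exam", "midterm", "gpa")
TOOL_TERMS = ("excel", "sql", "python", "jira")


@dataclass
class Thread:
    """One scraped thread, mirroring the four fields listed above."""
    title: str
    body: str
    top_comments: list = field(default_factory=list)
    sentiment: float = 0.0  # assumed to be a numeric score


def count_terms(text, terms):
    """Count exact whole-word matches (case-insensitive) of each term."""
    words = re.findall(r"[a-z0-9+#]+", text.lower())
    return sum(words.count(term) for term in terms)


def focus(thread):
    """Label a thread by whichever keyword group appears more often."""
    text = " ".join([thread.title, thread.body, *thread.top_comments])
    exams = count_terms(text, EXAM_TERMS)
    tools = count_terms(text, TOOL_TERMS)
    if exams > tools:
        return "exams"
    if tools > exams:
        return "workplace_tools"
    return "neutral"
```

Exact word matching misses plurals like "exams", so a real pass would probably want stemming or a longer list.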
I built the extractor (ORION) to run locally so I wouldn't get IP-banned. It uses the requests library and smart rate limiting.
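For anyone wondering what rate limiting around requests can look like, here is a minimal sketch. It is an assumption about the general technique, not ORION's implementation: `MinIntervalLimiter`, `polite_get`, the 2-second interval, and the backoff schedule are all hypothetical.

```python
import random
import time


class MinIntervalLimiter:
    """Enforce a minimum delay (plus optional random jitter) between requests."""

    def __init__(self, min_interval, jitter=0.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.jitter = jitter
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until the minimum interval since the last call has passed."""
        now = self._clock()
        if self._last is not None:
            delay = self.min_interval + random.uniform(0, self.jitter)
            remaining = self._last + delay - now
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now

    def pause(self, seconds):
        """Sleep for an extra backoff period."""
        self._sleep(seconds)


def polite_get(session, url, limiter, max_retries=3):
    """GET a URL via `session`, pacing calls and backing off on HTTP 429.

    `session` is anything with a requests-style .get(), e.g. requests.Session().
    """
    for attempt in range(max_retries + 1):
        limiter.wait()
        response = session.get(url, timeout=10)
        if response.status_code != 429 or attempt == max_retries:
            return response
        # Exponential backoff before retrying: 2, 4, 8 ... seconds.
        limiter.pause(2 ** (attempt + 1))
```

Usage would be roughly `polite_get(requests.Session(), url, MinIntervalLimiter(2.0, jitter=1.0))`. The jitter keeps the request timing from looking perfectly regular.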
You can grab the tool and the extraction logic here: https://mrweeb0.github.io/ORION-tool-showcase/
Feel free to fork it if you want to scrape other career subreddits (like Nursing or CS).
submitted by /u/No-Associate-6068