[Tool] Built An API To Instantly Extract Any Public HTML Table Or Wikipedia Page Into A Clean JSON Data Matrix

Hey r/datasets,

I got tired of manually copying data tables or dealing with messy HTML structures when trying to feed data into my personal scripts and models.

To solve this, I built and hosted a lightweight cloud API that automatically scrapes public web pages, isolates the tables/data grids, and packages everything into an organized, nested JSON matrix.
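For anyone curious what "isolating the tables and packaging them as a JSON matrix" can look like, here is a minimal sketch using only the Python standard library. It is not the API's actual implementation, which isn't shown in this post; the names `TableExtractor` and `html_tables_to_json` are made up for illustration, and the output shape (a list of tables, each a list of rows, each a list of cell strings) is an assumption about what a "nested JSON matrix" might mean. It also assumes well-formed markup with explicit closing tags and no nested tables.

```python
import json
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Hypothetical helper: collects each <table> as rows of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []    # finished tables: list of rows, each a list of str
        self._table = None  # rows of the table currently being read
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._table = []
        elif tag == "tr" and self._table is not None:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None and self._table is not None:
            self._table.append(self._row)
            self._row = None
            self._cell = None
        elif tag == "table" and self._table is not None:
            self.tables.append(self._table)
            self._table = None
            self._row = None
            self._cell = None


def html_tables_to_json(html):
    """Return every table found in `html` as a JSON string (assumed shape)."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return json.dumps(parser.tables)


# Placeholder input, not real data.
SAMPLE = (
    "<table>"
    "<tr><th>name</th><th>value</th></tr>"
    "<tr><td>alpha</td><td>1</td></tr>"
    "</table>"
)

print(html_tables_to_json(SAMPLE))
```

Real pages are messier than this (merged cells via `rowspan`/`colspan`, omitted closing tags, tables used for layout), which is the kind of thing a hosted parser would presumably need to handle.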

I wanted to share it here for anyone looking to automate their data-gathering pipelines. I set up a free testing tier on RapidAPI that gives you 50 free requests a month to try it out:

https://rapidapi.com/patcicci4/api/housing-and-wikipedia-data-scraper

Let me know if you test it out or have any feedback on extra features I should add to the parser!

submitted by /u/Cyclonefan444
