I need to convert a batch of academic PDFs (with tables) into clean JSON files for data extraction. I’m looking for a Python OCR tool that can (1) recognize both text and tables in scholarly papers and (2) output well-structured JSON with the extracted content. If you’ve got recommendations, please let me know! Open-source is preferred, but I’m open to anything that does the job well.
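To make the goal concrete, here is a rough sketch of the kind of JSON I have in mind. It uses only the standard library; the function name `paper_to_json` and the field names (`pages`, `text`, `tables`, `header`, `rows`) are made up for illustration and don't come from any real tool. The input is a stand-in for whatever an OCR/extraction library would actually return.

```python
import json


def paper_to_json(title, pages):
    """Serialize extracted text and tables into one JSON string.

    `pages` is a list of (text, tables) pairs, one per page. Each table is
    a list of rows, and the first row is assumed to hold the column headers.
    """
    doc = {"title": title, "pages": []}
    for number, (text, tables) in enumerate(pages, start=1):
        doc["pages"].append({
            "page": number,
            "text": text,
            "tables": [
                {
                    "header": rows[0],
                    # Map each data row onto the header names.
                    "rows": [dict(zip(rows[0], row)) for row in rows[1:]],
                }
                for rows in tables
                if rows  # skip empty tables
            ],
        })
    return json.dumps(doc, indent=2, ensure_ascii=False)


# Hypothetical extraction result for a one-page paper with one table.
example_pages = [
    (
        "Results are shown in Table 1.",
        [[["Model", "Accuracy"], ["A", "0.91"], ["B", "0.88"]]],
    )
]
print(paper_to_json("Hypothetical paper", example_pages))
```

Keeping each table row as a header-to-cell mapping is just one option; a plain list of lists would work too if the headers turn out to be unreliable after OCR.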
Thanks a lot for your help!
submitted by /u/Apprehensive_View366