Dataset of SEC filing word counts from 1993 through 2000 (inclusive). 1.7 GB total, split across 40 ORC files. Disclaimer: I made this. MIT License.
GitHub Link: https://github.com/john-friedman/sec-filing-wordcounts-1993-2000/tree/main
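
A minimal sketch for taking a first look at the files once they are downloaded, assuming `pyarrow` is installed and the ORC files sit in a local folder (called `data` here, which is a hypothetical name). The column layout isn't described in this post, so the sketch only reports each file's row count and column names rather than assuming a schema. `list_orc_files` and `describe_orc_file` are illustrative helper names, not part of the repo.

```python
from pathlib import Path


def list_orc_files(directory):
    """Return the .orc files directly under `directory`, sorted by name."""
    return sorted(Path(directory).glob("*.orc"))


def describe_orc_file(path):
    """Return (row count, column names) for one ORC file without reading its rows."""
    # Imported lazily so the file listing above works even without pyarrow.
    from pyarrow import orc

    orc_file = orc.ORCFile(str(path))
    return orc_file.nrows, orc_file.schema.names


if __name__ == "__main__":
    # "data" is a hypothetical local folder holding the downloaded ORC files.
    for path in list_orc_files("data"):
        nrows, columns = describe_orc_file(path)
        print(f"{path.name}: {nrows} rows, columns={columns}")
```

Reading only the metadata keeps memory use low, which may matter with 1.7 GB of data; `orc_file.read()` would load a whole file as a table if that is what you need.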
submitted by /u/status-code-200