[dataset] 2.3M U.S. Employer Profiles Joined Across 16 Federal Enforcement Agencies (OSHA, EPA, EEOC, WHD, MSHA, And More) — Free, CC BY 4.0

Full disclosure [self-promotion]: I’m the solo builder. Happy to answer questions about the data, methodology, or entity resolution approach.

I built FastDOL, a platform that links federal workplace enforcement records across agencies into a single employer profile. The government publishes this data, but each agency has its own database, its own identifiers, and its own terrible search UI.
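To make the linking problem concrete, here is a minimal sketch of what joining records without a shared identifier can look like. It is not FastDOL's actual entity resolution method, which this post does not describe. It is a naive name-plus-state key, and the suffix list, function names, and sample records are all made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical legal-suffix list; a real pipeline would need a much longer one.
SUFFIXES = {"inc", "incorporated", "llc", "corp", "corporation", "co", "company", "ltd"}

def normalize_name(name: str) -> str:
    """Lowercase, strip punctuation, and drop trailing legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    while tokens and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)

def link_records(records):
    """Group (agency, employer_name, state) tuples under a shared (name, state) key."""
    profiles = defaultdict(list)
    for agency, name, state in records:
        profiles[(normalize_name(name), state.upper())].append(agency)
    return dict(profiles)

# Made-up records, for illustration only.
records = [
    ("OSHA", "Acme Roofing, Inc.", "TX"),
    ("WHD", "ACME ROOFING INC", "tx"),
    ("EPA", "Acme Roofing Co.", "TX"),
    ("MSHA", "Beta Mining LLC", "WV"),
]
profiles = link_records(records)
```

Exact-key matching like this misses misspellings, DBAs, and parent/subsidiary relationships, which is why real entity resolution usually adds address matching and fuzzy comparison on top.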

The cross-agency dataset links enforcement records from OSHA, WHD, MSHA, EPA, EEOC, OFCCP, OFLC, and others at the employer level with parent-company rollup. The interesting finding: employers cited by 3+ agencies have a 3.4x higher worker fatality rate than employers cited by 1-2 agencies.

Four open datasets available so far, all CC BY 4.0:

  • Cross-Agency Federal Violations by Employer (~2.3M rows)
  • OSHA Construction Enforcement by Employer (377K rows)
  • OSHA Citations Q1 2026 (28,827 rows, citation-level)
  • WHD Wage Theft Enforcement Actions by Employer

All hosted on Hugging Face, Kaggle, and Zenodo with DOIs. Full schema, methodology, and BibTeX on the canonical pages: https://www.fastdol.com/datasets

submitted by /u/chill-botulism