Federal Contractor Violations Dataset [dataset][self-promotion]

I built a dataset that joins USAspending federal contract awards to eight enforcement sources at the contractor level: OSHA, WHD, MSHA, EPA ECHO, NLRB, SEC, the UVA Corporate Prosecution Registry, and the SAM.gov debarment list. It covers 5,557 contractors with documented violations, $3.19T in lifetime federal contracts, and 758 OSHA-investigated fatalities.

The novel slice is the multi-agency overlap. Roughly 2,000 contractors appear in 2+ federal enforcement databases, 500 in 3+, and 70 in 4+. Topping the 4+ cohort by lifetime contract value: Raytheon ($68B; OSHA + WHD + NLRB + SEC + UVA), GE ($47B; same five), Merck, Microsoft, Austal USA, and Marinette Marine.
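
For anyone who wants to reproduce the 2+/3+/4+ tallies, here is a minimal sketch of the counting logic. It assumes the data can be reduced to (contractor, source) pairs; the rows below are toy placeholders, not values from the dataset, and the function names are hypothetical. The real schema may differ, so check the methodology page before adapting it.

```python
# Toy rows, NOT taken from the dataset: (contractor, enforcement source) pairs.
# The real file layout and column names may differ.
records = [
    ("Contractor A", "OSHA"),
    ("Contractor A", "WHD"),
    ("Contractor A", "NLRB"),
    ("Contractor A", "SEC"),
    ("Contractor B", "OSHA"),
    ("Contractor B", "OSHA"),      # repeat hits in one source count once
    ("Contractor B", "EPA ECHO"),
    ("Contractor C", "MSHA"),
]

def sources_per_contractor(rows):
    """Map each contractor to the set of distinct sources it appears in."""
    out = {}
    for contractor, source in rows:
        out.setdefault(contractor, set()).add(source)
    return out

def overlap_counts(rows, thresholds=(2, 3, 4)):
    """For each k, count contractors appearing in at least k distinct sources."""
    per = sources_per_contractor(rows)
    return {k: sum(1 for s in per.values() if len(s) >= k) for k in thresholds}

counts = overlap_counts(records)
print(counts)  # {2: 2, 3: 1, 4: 1}
```

The cohorts are cumulative: a contractor in four sources is also counted under 2+ and 3+, which matches how the figures above nest.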

Hugging Face: https://huggingface.co/datasets/FastDOLz/Federal-Contractor-Violations-Dataset

Kaggle: https://www.kaggle.com/datasets/benturneroffice365/federal-contractor-violations-dataset

Zenodo DOI (all versions): https://doi.org/10.5281/zenodo.20777627

Methodology + limitations: https://www.fastdol.com/methodology

CC-BY-4.0.

Disclosure: I run FastDOL (https://www.fastdol.com), a federal workplace-enforcement search by employer, where this corpus comes from. Individual lookups are free; the dataset is one of several full extracts.

submitted by /u/chill-botulism