Dataset: SEC Cyber Incident Disclosures Labeled By Threat Type And Impact

Disclosure: I created and host this dataset.

I compiled a dataset of 80 cybersecurity incident disclosures from SEC filings (primarily 8-K reports) and labeled them using a structured taxonomy.

The goal was to create a more usable dataset for analyzing real-world cyber incidents based on public disclosures.

Dataset includes:

Threat type classification (ransomware, data theft, insider, supply chain, etc.)
Indicators of business impact (operational disruption, recovery status)
Sector categorization (e.g., financial services)
Whether cyber insurance was mentioned
Source filing references (SEC EDGAR)
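
For anyone wondering what a single row might look like, here is a rough sketch of one record as a Python dataclass. The field names and the placeholder values are my assumptions for illustration, not the actual column names in the dataset.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IncidentRecord:
    # All field names here are hypothetical; check the dataset for the real ones.
    filing_reference: str         # pointer to the source filing on SEC EDGAR
    threat_types: List[str]       # e.g. ["ransomware", "data theft"]
    operational_disruption: bool  # business impact indicator
    recovery_status: str          # e.g. "complete", "incomplete", "unknown"
    sector: str                   # e.g. "financial services"
    insurance_mentioned: bool     # whether cyber insurance was mentioned


# Placeholder row, invented for illustration only (not a real filing).
example = IncidentRecord(
    filing_reference="EDGAR-REFERENCE-PLACEHOLDER",
    threat_types=["ransomware", "data theft"],
    operational_disruption=True,
    recovery_status="incomplete",
    sector="financial services",
    insurance_mentioned=False,
)
```

Keeping threat types as a list lets one incident carry more than one label, which seems to fit cases where ransomware and data theft show up together.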

Some high-level observations from the dataset:

~72% of cases indicate incomplete recovery or significant disruption
50% involve data theft or exposure
Financial services is the most represented sector
~18% mention cyber insurance
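
Percentages like these can be reproduced with a simple tally over the labeled rows. The sketch below assumes each row is a dict with hypothetical keys `threat_types` and `insurance_mentioned`; the toy rows are invented and are not taken from the dataset.

```python
def share(records, predicate):
    """Return the percentage of records for which predicate is true."""
    if not records:
        return 0.0
    hits = sum(1 for r in records if predicate(r))
    return 100.0 * hits / len(records)


# Toy rows, invented for illustration only; not real disclosures.
toy = [
    {"threat_types": ["ransomware"], "insurance_mentioned": True},
    {"threat_types": ["data theft"], "insurance_mentioned": False},
    {"threat_types": ["insider", "data theft"], "insurance_mentioned": False},
    {"threat_types": ["supply chain"], "insurance_mentioned": False},
]

# Share of toy rows labeled with data theft, and share mentioning insurance.
data_theft_pct = share(toy, lambda r: "data theft" in r["threat_types"])
insurance_pct = share(toy, lambda r: r["insurance_mentioned"])
```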

Methodology:

Source: SEC EDGAR (8-K incident disclosures)
Manual review of each case
Consistent tagging using a predefined taxonomy
AI-assisted checks for classification consistency (labeling was not fully automated)

Limitations:

Disclosure quality varies significantly
Many filings are intentionally vague
Sample size is still relatively small (n=80)
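
To give a feel for what n=80 means for the percentages above, here is a minimal sketch of a Wilson score interval for a proportion. This is a standard textbook formula that I am swapping in for illustration, not something taken from the dataset's methodology, and it treats the 80 filings as a simple random sample, which they are not.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# 50% of 80 cases is 40 cases (the data theft or exposure figure).
low, high = wilson_interval(40, 80)
```

For 40 out of 80 the interval comes out to roughly 39% to 61%, so the headline numbers are better read as rough indications than as precise rates.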

submitted by /u/LordKittyPanther