Dataset: SEC Cyber Incident Disclosures Labeled by Threat Type and Impact

Disclosure: I created and host this dataset.

I compiled a dataset of 80 cybersecurity incident disclosures from SEC filings (primarily 8-K reports) and labeled them using a structured taxonomy.

The goal was to create a more usable dataset for analyzing real-world cyber incidents based on public disclosures.

Dataset includes:

  • Threat type classification (ransomware, data theft, insider, supply chain, etc.)
  • Indicators of business impact (operational disruption, recovery status)
  • Sector categorization (e.g., financial services)
  • Whether cyber insurance was mentioned
  • Source filing references (SEC EDGAR)
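
A minimal sketch of how one row with these fields might be represented. The post does not give the dataset's actual column names or allowed values, so every field name below is a hypothetical stand-in, and the example values are placeholders rather than real entries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IncidentRecord:
    """One labeled disclosure. All field names are assumptions, not the dataset's schema."""
    threat_type: str              # e.g. "ransomware", "data theft", "insider", "supply chain"
    operational_disruption: bool  # business-impact indicator
    recovery_status: str          # e.g. "complete", "incomplete", "not stated" (assumed values)
    sector: str                   # e.g. "financial services"
    insurance_mentioned: bool     # whether the filing mentions cyber insurance
    filing_reference: str         # pointer to the source filing on SEC EDGAR


# Placeholder values for illustration only; this is not a row from the dataset.
example = IncidentRecord(
    threat_type="ransomware",
    operational_disruption=True,
    recovery_status="incomplete",
    sector="financial services",
    insurance_mentioned=False,
    filing_reference="<EDGAR accession number or URL>",
)
```

Freezing the dataclass keeps labeled rows immutable once loaded, which helps if the labels are meant to be reviewed rather than edited in place.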

Some high-level observations from the dataset:

  • ~72% of cases indicate incomplete recovery or significant disruption
  • 50% involve data theft or exposure
  • Financial services is the most represented sector
  • ~18% mention cyber insurance
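
Shares like these can be recomputed from the labeled rows with a few lines of code. The sketch below uses made-up toy rows and hypothetical column names (`threat_type`, `sector`, `insurance_mentioned`), so its numbers do not reproduce the figures above. It also assumes one threat type per incident, which may not hold in the real data.

```python
from collections import Counter

# Made-up toy rows for illustration; NOT taken from the dataset.
rows = [
    {"threat_type": "ransomware", "sector": "financial services", "insurance_mentioned": True},
    {"threat_type": "data theft", "sector": "healthcare", "insurance_mentioned": False},
    {"threat_type": "data theft", "sector": "financial services", "insurance_mentioned": False},
    {"threat_type": "insider", "sector": "retail", "insurance_mentioned": False},
]


def share(rows, predicate):
    """Return the fraction of rows for which predicate(row) is true."""
    if not rows:
        raise ValueError("cannot compute a share of zero rows")
    return sum(1 for row in rows if predicate(row)) / len(rows)


data_theft_share = share(rows, lambda r: r["threat_type"] == "data theft")
insurance_share = share(rows, lambda r: r["insurance_mentioned"])

# Most represented sector and its row count.
top_sector, top_count = Counter(r["sector"] for r in rows).most_common(1)[0]

print(f"data theft: {data_theft_share:.0%}, insurance mentioned: {insurance_share:.0%}")
print(f"most represented sector: {top_sector} ({top_count} rows)")
```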

Methodology:

  • Source: SEC EDGAR (8-K incident disclosures)
  • Manual review of each case
  • Consistent tagging using a predefined taxonomy
  • AI used to assist classification consistency (not fully automated)

Limitations:

  • Disclosure quality varies significantly
  • Many filings are intentionally vague
  • Sample size is still relatively small (n=80)
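
To make the sample-size caveat concrete, here is a sketch of a 95% Wilson score interval for a proportion observed in 80 cases. The count of 40 is an assumption derived from reading "50% of 80" as exact, and the Wilson interval is one common choice for small samples, not a method the post says it used.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives roughly 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Assumed count: 50% of n=80 read as exactly 40 cases.
low, high = wilson_interval(40, 80)
print(f"95% interval: {low:.1%} to {high:.1%}")
```

With 40 of 80 cases the interval runs from roughly 39% to 61%, so headline percentages from a sample this size are best read as rough ranges rather than precise rates.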

submitted by /u/LordKittyPanther