Sharing a dataset I’ve been maintaining since the MV Hondius hantavirus cluster started in early April.
Aggregated from primary public health sources: WHO Disease Outbreak News, CDC HAN advisories, ECDC bulletins, PAHO weekly reports, ProMED-mail, and national health ministries. A cron job pulls every 30 minutes, normalizes case definitions per the WHO DON600 framework, geocodes to city or province level where the source data permits, and dedupes against the archive.
Format: JSON
License: CC-BY-SA 4.0
Endpoint: https://hantaosint.com/api/v1/public.json
Dashboard: https://hantaosint.com
Methodology: https://hantaosint.com/methodology
Fields: case_id, date, country, region, virus_strain, confidence_level (confirmed/suspected/probable/monitoring), source, source_url, lat, lng
Confidence levels are kept separate rather than conflated, which most outbreak trackers don’t bother with. Historical outbreaks included for retrospective analysis: 1993 Four Corners, 2012 Yosemite, 2018-19 Epuyen.
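For anyone who wants to keep the confidence levels apart on their side too, here is a minimal Python sketch. It assumes the endpoint returns a top-level JSON array of objects using the field names listed above, and that confidence_level holds the lowercase strings shown. Neither is confirmed here, so check the methodology page first. The sample records are made-up placeholders, not real cases, and `group_by_confidence` is just a hypothetical helper name.

```python
import json

# Levels as listed in the post; the exact string values in the feed are an assumption.
CONFIDENCE_LEVELS = ("confirmed", "probable", "suspected", "monitoring")


def group_by_confidence(records):
    """Bucket records by confidence_level without merging any levels."""
    buckets = {level: [] for level in CONFIDENCE_LEVELS}
    for rec in records:
        level = rec.get("confidence_level")
        # Unknown or missing levels get their own bucket instead of being folded in.
        buckets.setdefault(level, []).append(rec)
    return buckets


# Placeholder payload for illustration only (not real data).
sample = json.loads("""
[
  {"case_id": "EX-1", "date": "2000-01-01", "country": "Exampleland",
   "confidence_level": "confirmed"},
  {"case_id": "EX-2", "date": "2000-01-02", "country": "Exampleland",
   "confidence_level": "suspected"},
  {"case_id": "EX-3", "date": "2000-01-02", "country": "Samplestan",
   "confidence_level": "confirmed"}
]
""")

buckets = group_by_confidence(sample)
for level, recs in buckets.items():
    print(level, len(recs))
```

To run it against the live feed, swap the placeholder string for the body of a GET on the endpoint above, assuming the array-of-objects shape holds.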
Use cases I built it for: time-series modeling of cluster spread, retrospective comparison of hantavirus outbreaks, surveillance signal for travel medicine research.
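As a starting point for the time-series use case, a rough sketch that turns records into daily counts for one confidence level. It assumes the date field is an ISO 8601 string (YYYY-MM-DD), which is not confirmed here. `daily_counts` is a hypothetical helper and the records are placeholders, not real cases.

```python
from collections import Counter
from datetime import date


def daily_counts(records, level="confirmed"):
    """Return a sorted list of (day, count) pairs for one confidence level."""
    counts = Counter()
    for rec in records:
        if rec.get("confidence_level") != level:
            continue
        # Assumes ISO 8601 dates such as "2000-01-01".
        day = date.fromisoformat(rec["date"])
        counts[day] += 1
    return sorted(counts.items())


# Placeholder records for illustration only (not real data).
sample_records = [
    {"case_id": "EX-1", "date": "2000-01-01", "confidence_level": "confirmed"},
    {"case_id": "EX-2", "date": "2000-01-01", "confidence_level": "confirmed"},
    {"case_id": "EX-3", "date": "2000-01-02", "confidence_level": "suspected"},
    {"case_id": "EX-4", "date": "2000-01-03", "confidence_level": "confirmed"},
]

series = daily_counts(sample_records)
for day, n in series:
    print(day.isoformat(), n)
```

Days with no cases are absent from the result rather than reported as zero, so reindex onto a full date range before fitting anything that expects evenly spaced observations.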
Happy to add fields if researchers need additional structure. Open to feedback on the schema and source coverage.
submitted by /u/Professional_Art2346