[self-promotion] Free Sample: EU Public Procurement Notices (Aug 2025, CSV, Enriched With CPV Codes)

I’ve released a new dataset built from the EU’s Tenders Electronic Daily (TED) portal, which publishes official public procurement notices from across Europe.

  • Source: Official TED monthly XML package for August 2025
  • Processing: Parsed into a clean tabular CSV, normalized fields, and enriched with CPV 2008 labels (Common Procurement Vocabulary).
  • Contents (sample):
    • notice_id — unique identifier
    • publication_date — ISO 8601 format
    • buyer_id — anonymized buyer reference
    • cpv_code + cpv_label — procurement category (CPV 2008)
    • lot_id, lot_name, lot_description
    • award_value, currency
    • source_file — original TED XML reference
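
For anyone who wants to try the sample, here is a minimal sketch of loading a CSV with the columns listed above using only the Python standard library. It rests on several assumptions not confirmed by the post: that the header names match the list exactly, that publication_date is a plain YYYY-MM-DD date, and that award_value may be empty. The two rows in SAMPLE_CSV are made up for illustration and are not taken from the dataset; load_notices is a hypothetical helper name.

```python
import csv
import io
from datetime import date
from decimal import Decimal

# Column names as listed in the post; their order in the real file is an assumption.
EXPECTED_COLUMNS = [
    "notice_id", "publication_date", "buyer_id", "cpv_code", "cpv_label",
    "lot_id", "lot_name", "lot_description", "award_value", "currency",
    "source_file",
]

# Two made-up rows for illustration only, NOT taken from the dataset.
SAMPLE_CSV = """notice_id,publication_date,buyer_id,cpv_code,cpv_label,lot_id,lot_name,lot_description,award_value,currency,source_file
N-0001,2025-08-04,B-17,45000000,Construction work,LOT-1,Example lot,Example description,125000.50,EUR,example_1.xml
N-0002,2025-08-05,B-42,48000000,Software package and information systems,LOT-1,Another lot,Another description,,EUR,example_2.xml
"""


def load_notices(handle):
    """Read rows from an open CSV handle, checking columns and parsing types."""
    reader = csv.DictReader(handle)
    missing = set(EXPECTED_COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")

    rows = []
    for raw in reader:
        row = dict(raw)
        # Assumes a date-only ISO 8601 value such as 2025-08-04.
        row["publication_date"] = date.fromisoformat(raw["publication_date"])
        # Decimal avoids float rounding on money; empty cells become None.
        row["award_value"] = Decimal(raw["award_value"]) if raw["award_value"] else None
        rows.append(row)
    return rows


notices = load_notices(io.StringIO(SAMPLE_CSV))
```

To read the real file, swap the in-memory string for something like open(path, newline="", encoding="utf-8"), where the path and encoding are again assumptions.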

This free sample contains 100 rows representative of the full dataset (~200k rows).
Sample dataset on Hugging Face

If you’re interested in the full month (about 200k rows), it’s available here:
Full dataset on Gumroad

Suggested uses: training NLP/ML models (NER, classification, forecasting), procurement market analysis, transparency research.
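
As a rough illustration of the market-analysis use case, the sketch below sums award values per CPV division (the first two digits of a CPV code) and currency. The function name total_by_division and the example rows are hypothetical, and the sketch assumes cpv_code is stored as a digit string and that rows without an award_value should simply be skipped. Values in different currencies are kept apart rather than converted.

```python
from collections import defaultdict
from decimal import Decimal


def total_by_division(rows):
    """Sum award_value per (CPV division, currency) pair."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for row in rows:
        value = row.get("award_value")
        if value in (None, ""):
            continue  # no award value recorded for this row
        division = str(row["cpv_code"])[:2]  # first two digits = CPV division
        totals[(division, row["currency"])] += Decimal(str(value))
    return dict(totals)


# Made-up rows for illustration only, NOT taken from the dataset.
example_rows = [
    {"cpv_code": "45000000", "award_value": "100.00", "currency": "EUR"},
    {"cpv_code": "45210000", "award_value": "50.50", "currency": "EUR"},
    {"cpv_code": "48000000", "award_value": "", "currency": "EUR"},
    {"cpv_code": "48000000", "award_value": "75", "currency": "SEK"},
]

totals = total_by_division(example_rows)
```

Since the rows appear to be lot-level (there is a lot_id column), it would be worth checking whether award_value is recorded per lot or repeated per notice before trusting any totals.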

Feedback welcome — I’d love to hear how others might use this or what extra enrichments would be most useful.

submitted by /u/OpenMLDatasets