Will Pay For Datasets That Contain Unredacted PDFs Of Purchase Orders, Invoices, And Supplier Contracts/Agreements (for Goods Not Services)

Hi r/datasets,

I’m looking for datasets, either paid or unpaid, to create a benchmark for a specialised extraction pipeline.

Criteria:

  • Recent (last ten years ideally)
  • PDFs (don’t need to be tidy)
  • Not redacted (as much as possible)

Document types:

  • Supplier contracts (for goods not services)
  • Invoices (for goods not services)
  • Purchase Orders (for goods not services)

I’ve already seen Atticus and the UCSF Industry Document Library (the origin of Adam Harley’s dataset). I’ve also seen a few posts below, but they aren’t what I’m looking for. I’m happy to pay for both the information and the datasets; DM me if you want to strike a deal.

submitted by /u/phililisaveslives