96 Million iNaturalist Research-Grade Plant Records Dataset (Free and Open Source)

I’ve built a large-scale plant dataset from iNaturalist research-grade observations.
It has 96.1 million rows, each containing:

  • species / genus / family names
  • GBIF taxonomy IDs
  • lat / lon
  • event dates
  • image URLs (iNat open data)
  • license information
  • dataset keys / source info

It’s meant for anyone doing:

  • image classification (plants, ecology, biodiversity)
  • large-scale ViT/ConvNeXt pretraining
  • location-aware species modelling
  • weakly supervised learning from image URLs
  • training LoRA adapters for regional plant ID
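
For the location-aware and regional use cases, one simple starting point is a bounding-box filter over rows. This is only a rough sketch: the column names `decimalLatitude` and `decimalLongitude` are my assumption (GBIF-style naming), so check the actual schema and pass the real names if they differ. `in_bbox` and `regional_subset` are hypothetical helpers, not part of the dataset or any library.

```python
from typing import Any, Dict, Iterable, Iterator, Tuple

# Assumed column names (GBIF-style); verify against the real schema.
LAT_COL = "decimalLatitude"
LON_COL = "decimalLongitude"

# Bounding box as (south, west, north, east) in decimal degrees.
BBox = Tuple[float, float, float, float]


def in_bbox(row: Dict[str, Any], bbox: BBox,
            lat_col: str = LAT_COL, lon_col: str = LON_COL) -> bool:
    """Return True if the row has coordinates that fall inside bbox."""
    south, west, north, east = bbox
    lat, lon = row.get(lat_col), row.get(lon_col)
    if lat is None or lon is None:
        # Rows without coordinates cannot be placed in a region.
        return False
    return south <= lat <= north and west <= lon <= east


def regional_subset(rows: Iterable[Dict[str, Any]], bbox: BBox,
                    lat_col: str = LAT_COL,
                    lon_col: str = LON_COL) -> Iterator[Dict[str, Any]]:
    """Lazily yield only the rows inside bbox (works on streamed rows too)."""
    return (row for row in rows if in_bbox(row, bbox, lat_col, lon_col))
```

Because `regional_subset` is a generator, it can wrap a streamed dataset without loading all rows into memory. Note that this simple comparison does not handle boxes that cross the antimeridian.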

Dataset (Parquet, streamable via HF Datasets):
https://huggingface.co/datasets/juppy44/gbif-plants-raw
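
Since the dataset is streamable, a minimal sketch for peeking at a few rows without downloading everything could look like the following. It assumes the Hugging Face `datasets` package is installed and that the split is named `train`, which I have not confirmed for this repo. `stream_rows` and `peek` are hypothetical helper names.

```python
from itertools import islice
from typing import Any, Iterable, List


def stream_rows(repo_id: str = "juppy44/gbif-plants-raw",
                split: str = "train"):
    """Open the dataset in streaming mode (rows are fetched lazily).

    The split name "train" is an assumption; adjust if the repo differs.
    """
    # Imported here so this module loads even without `datasets` installed.
    from datasets import load_dataset
    return load_dataset(repo_id, split=split, streaming=True)


def peek(rows: Iterable[Any], n: int = 5) -> List[Any]:
    """Take the first n items from any iterable without consuming the rest eagerly."""
    return list(islice(rows, n))
```

Something like `peek(stream_rows(), 3)` should then return the first three rows as dictionaries, which is a quick way to check the real column names before building a pipeline.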

Let me know what you build with it!

submitted by /u/Lonely-Marzipan-9473