I am building a metadata-only index for AI image discovery packs and would like feedback from people who actually work with datasets.
Current shape:
- one JSONL record per image
- prompt fragments when available
- source URL and creator/source attribution fields
- safety labels
- category/style tags
- pack manifests for small curated image sets
- no upstream image files included in the first pass
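To make the shape above concrete, here is a rough Python sketch of what a record and a basic completeness check could look like. Every field name (`id`, `prompt_fragments`, `source_url`, `creator`, `safety_labels`, `tags`), the example.com URLs, and the choice of which fields count as required are assumptions for illustration only; the actual schema is whatever the linked sample and protocol notes define.

```python
import json

# Hypothetical field names for illustration; not the published schema.
REQUIRED_FIELDS = {"id", "source_url", "creator", "safety_labels", "tags"}


def parse_records(jsonl_text):
    """Parse JSONL text (one JSON object per line), skipping blank lines."""
    return [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]


def missing_fields(record):
    """Return the sorted names of required fields absent from a record."""
    return sorted(REQUIRED_FIELDS - set(record))


# Two made-up records: the first is complete, the second lacks attribution
# and safety labels. "prompt_fragments" is treated as optional here because
# the post says prompts are included only when available.
sample = "\n".join([
    json.dumps({
        "id": "img-0001",
        "prompt_fragments": ["foggy harbor", "35mm film"],
        "source_url": "https://example.com/images/1",
        "creator": "example-creator",
        "safety_labels": ["safe"],
        "tags": ["landscape", "photoreal"],
    }),
    json.dumps({
        "id": "img-0002",
        "source_url": "https://example.com/images/2",
        "tags": ["abstract"],
    }),
])

records = parse_records(sample)

# Map record id -> missing required fields, only for incomplete records.
problems = {}
for record in records:
    missing = missing_fields(record)
    if missing:
        problems[record.get("id", "?")] = missing

print(problems)
```

A check like this could run before a pack is published, so records that have lost their attribution fields get flagged rather than shipped.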
Example manifest and records are here:
- https://generatedgallery.com/index/manifest.json
- https://generatedgallery.com/index/generated-gallery.sample.json
Protocol notes: https://generatedgallery.com/protocol
The intended use cases are prompt research, moodboards, model eval sets, and image discovery, all without provenance getting stripped away.
What fields would make this more useful before I publish a larger metadata-only dataset repo?
submitted by /u/Plane-Marionberry380