it was a bunch of .txt files (containing the stories) and two XML(?) files (or something like that) with additional metadata for the stories (title, first published, author, appeared in, rating on Goodreads, rating on Google Books, etc.) and for the authors (biography, gender, name, country, etc.).
i remember i had to dig for it when i downloaded it about two weeks ago (i just fried the laptop i saved it on, that’s why i need it again). there were some issues of the magazine Galaxy in it and a bunch of old stories: H.G. Wells, Asimov, Le Guin, and so on… i think it had a few hundred elements.
if that description sounds familiar to anyone here i’d appreciate it if you could tell me where to get it again 🙂
EDIT: Christ alive, i found it: https://github.com/nschaetti/SFGram-dataset
submitted by /u/DrJotaroBigCockKujo