Recently I scraped 56,400 question/answer pairs off Quora and put the dataset on the Hugging Face Hub. I plan to keep adding to the dataset, but proxy costs are pretty high since Quora is hella bloated.
The dataset can be accessed through the Hugging Face profile linked in my article, if anyone is interested: https://www.toughdata.net/blog/post/finetune-flan-t5-question-answer-quora-dataset
submitted by /u/jankybiz