Hello!
I'm building an NLP classifier to detect plagiarism of startup ideas, but I can't find a suitable dataset to test my model on. I'm looking for a labeled text dataset that pairs each original text with its plagiarised version, or any other dataset suited to this task.
Thanks in advance.
submitted by /u/AlphaTea_Lover