How To Split A Dataset Into 2 To Check For Generalization Over Memorization?

I want to make sure that a neural network generalizes rather than memorizes.

I have a single dataset, a collection of social media chats. Would it be sufficient to split it chronologically to create two datasets?

Or does something more need to be done, like splitting it so that the two parts mention different usernames and channel names?

Basically, I only have one dataset, but I want to make two out of it: one for supervised training of the model and the other to check how well the model performs on data it has not seen.
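To make the two options concrete, here is a minimal sketch of what I mean. It assumes each chat message is a record with hypothetical fields `timestamp`, `username`, and `channel` (my real data may be shaped differently), and the function names and cutoff date are made up for illustration. The first function is the purely chronological split; the second additionally drops held-out messages whose username or channel already appears in the training part.

```python
from datetime import datetime


def chronological_split(messages, cutoff):
    """Train on messages before the cutoff, hold out the rest."""
    train = [m for m in messages if m["timestamp"] < cutoff]
    test = [m for m in messages if m["timestamp"] >= cutoff]
    return train, test


def drop_seen_groups(train, test, keys=("username", "channel")):
    """Keep only held-out messages whose username AND channel never occur in train."""
    seen = {k: {m[k] for m in train} for k in keys}
    return [m for m in test if all(m[k] not in seen[k] for k in keys)]


# Toy records; the field names are assumptions, not a real schema.
messages = [
    {"timestamp": datetime(2023, 1, 5), "username": "alice", "channel": "general", "text": "hi"},
    {"timestamp": datetime(2023, 2, 1), "username": "bob", "channel": "general", "text": "hello"},
    {"timestamp": datetime(2023, 6, 10), "username": "alice", "channel": "random", "text": "back"},
    {"timestamp": datetime(2023, 7, 2), "username": "carol", "channel": "random", "text": "new here"},
]

train, test = chronological_split(messages, cutoff=datetime(2023, 4, 1))
strict_test = drop_seen_groups(train, test)

print(len(train), len(test), [m["username"] for m in strict_test])
```

The chronological split alone still leaves `alice` on both sides of the cutoff, which is the overlap I am worried about. The stricter filter removes that overlap but also shrinks the held-out set, so I am not sure which trade-off is appropriate.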

submitted by /u/Calm_Maybe_4639