Datasets Or Pre-trained Models For Banner Ad / Marketing Text Classification?

I am trying to find good datasets for classifying web images as ads, so that I can train an image classification model to filter out ads and download only useful image content from websites. I would also be interested in datasets for classifying marketing/ad text, to help filter out ad captions as well. I suspect that copyright issues may be preventing people from releasing ad datasets publicly, but I’m hoping that something is out there.
I found this dataset on PapersWithCode, as well as several datasets built from old banner ads from the 90s/early 2000s, but I am wondering whether there are any other publicly available web ad datasets with more recent data.
Does anyone have suggestions for good-quality public datasets or preexisting classification models for ad detection?

submitted by /u/jferments