Dataset Containing Informal/Formal Text?

Does anyone know of a publicly available dataset, in any language, that pairs formal discursive text with a “parallel”, less formal version? Alternatively, does anyone know of a source from which one could build such a dataset (for example, English Wikipedia articles and their corresponding Simple English Wikipedia articles)? The GYAFC dataset (Rao and Tetreault, 2018) is similar to what I’m looking for.

submitted by /u/geartrains
