Hi guys, I’ve been working on a fine-tuned Llama 3 model for quite some time now and want to expand its dataset. Are there any good automated solutions for generating fine-tuning datasets from PDF or HTML sources, and can the resulting data be augmented automatically?
Thanks so much in advance!
submitted by /u/OkVegetable2512