I am building a project and I want to fine-tune an LLM to incorporate it as a ChatBot.
The ChatBot will deliver feedback to students who submit programming solutions for exercises they are solving. I want to train the ChatBot on a specific way to give feedback like not giving the correct answer explicitly and not answering questions unrelated to the domain, and also being able to give hints when a student asks for it.
I couldn’t find a dataset close to what I need. Obviously I will need to clean any dataset that I find to match my needs perfectly.
If you know of any dataset that might help me with this, or any way that I can automate the generation of a mock dataset, because ChatGPT has limitions and I wasn’t able to make it generate the number of examples I need.
submitted by /u/iTsObserv
[link] [comments]