Q: Fine-tuning Coding LLMs on Git and GitHub Histories Rather Than Just Final Code?

I run a small software company creating traditional C++ desktop apps for font & graphic design work. We have 10+ years of Git histories of our apps.

Which open coding LLMs were trained not just on final code but also on Git histories (commits and pull requests) and GitHub metadata (PR discussions, issues, etc.)?

Which dataset formats are advisable for this kind of data?

I’d like to fine-tune a coding LLM to privately assist in our software development, ideally not just on the current state of the code but on its evolution.
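
To make the dataset question concrete, here is a minimal sketch of one possible shape: a JSONL file with one record per commit, where the commit message is the prompt and the diff is the completion. Everything in it is an assumption for illustration, not an established format: the field names (`sha`, `prompt`, `completion`), the helper names, and the `max_diff_chars` cutoff are all hypothetical, and it only covers local Git history, not GitHub PR discussions or issues.

```python
import json
import subprocess
from typing import Iterable, Iterator


def git(repo: str, *args: str) -> str:
    """Run a git command inside `repo` and return its stdout as text."""
    result = subprocess.run(
        ["git", "-C", repo, *args],
        check=True, capture_output=True, text=True, errors="replace",
    )
    return result.stdout


def commit_to_record(sha: str, message: str, diff: str) -> dict:
    """Build one record. The field names are an illustrative choice only."""
    return {"sha": sha, "prompt": message.strip(), "completion": diff}


def iter_records(repo: str, max_diff_chars: int = 20_000) -> Iterator[dict]:
    """Yield one record per commit reachable from HEAD, oldest first."""
    for sha in git(repo, "rev-list", "--reverse", "HEAD").split():
        message = git(repo, "show", "-s", "--format=%B", sha)
        diff = git(repo, "show", "--format=", sha)
        # Skip commits with no textual diff and very large diffs
        # (the size limit is an arbitrary, hypothetical cutoff).
        if diff.strip() and len(diff) <= max_diff_chars:
            yield commit_to_record(sha, message, diff)


def write_jsonl(records: Iterable[dict], path: str) -> None:
    """Write records as JSON Lines: one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
```

Calling `write_jsonl(iter_records("."), "commits.jsonl")` from inside a repository would produce the file. Whether message-to-diff pairs are actually the right framing for fine-tuning is part of what I am asking.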

I have a “feeling” that this would be much better. 🙂

submitted by /u/Minimum_Art_2263