About
The idea behind CustomCodeAssist is to make it easy to fine-tune a foundation LLM on a codebase, so that it can provide tailored assistance. In its simplest form (rough sketches of each step follow below):

- You provide a link to a public GitLab/GitHub repository, which is downloaded automatically and parsed into a format the training pipeline can ingest.
- A foundation LLM such as code-llama-2 is fine-tuned on the code.
- You can then run inference on the fine-tuned model and get outputs that are specific to your codebase.

My intuition is that simply fine-tuning the model on the raw code won't be sufficient to produce high-quality tailored outputs; more likely it will just fall back on other code it has been trained on. In that case, investigating instruction-based fine-tuning might be a good option. There are a number of other additions and improvements that could be made, such as using RAG over your codebase to provide context to the prompt.
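To make the first step concrete, here is a minimal sketch of the download-and-parse stage, assuming a plain shallow `git clone` and a flat JSONL output format. The extension filter and record layout are placeholders, not a settled design:

```python
import json
import subprocess
from pathlib import Path

# Placeholder filter -- the real pipeline would likely be language-aware.
SOURCE_EXTENSIONS = {".py", ".js", ".ts", ".java", ".go", ".rs"}

def clone_repo(url: str, dest: str = "repo") -> Path:
    """Shallow-clone a public GitLab/GitHub repository."""
    subprocess.run(["git", "clone", "--depth", "1", url, dest], check=True)
    return Path(dest)

def repo_to_jsonl(repo_dir: Path, out_path: str = "train.jsonl") -> None:
    """Write one {"path", "text"} record per source file for the training pipeline."""
    with open(out_path, "w", encoding="utf-8") as out:
        for path in repo_dir.rglob("*"):
            if path.is_file() and path.suffix in SOURCE_EXTENSIONS:
                record = {
                    "path": str(path.relative_to(repo_dir)),
                    "text": path.read_text(encoding="utf-8", errors="ignore"),
                }
                out.write(json.dumps(record) + "\n")

if __name__ == "__main__":
    repo = clone_repo("https://github.com/example/example-repo")  # hypothetical URL
    repo_to_jsonl(repo)
```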
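For the fine-tuning step itself, one plausible approach is parameter-efficient fine-tuning with LoRA via Hugging Face `transformers` and `peft`. The sketch below assumes the `codellama/CodeLlama-7b-hf` checkpoint and the `train.jsonl` file produced above; the hyperparameters are illustrative, not tuned:

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

BASE_MODEL = "codellama/CodeLlama-7b-hf"  # assumed base model; any causal LM works

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# LoRA trains only small adapter matrices, keeping per-codebase fine-tunes cheap.
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"))

dataset = load_dataset("json", data_files="train.jsonl", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=1024)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned", num_train_epochs=1,
                           per_device_train_batch_size=1,
                           gradient_accumulation_steps=8),
    train_dataset=tokenized,
    # mlm=False gives standard next-token (causal LM) training.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```

LoRA is attractive here because a full fine-tune of a 7B model per codebase would be expensive, while adapters are small enough to store and swap per repository.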
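Inference then runs against the fine-tuned model as usual; continuing from the training sketch above:

```python
# Complete a codebase-specific prompt with the fine-tuned model.
prompt = "def load_config("  # hypothetical function from the target repository
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```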
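If plain next-token training on raw code does fall back to generic completions, instruction-based fine-tuning would swap the raw-text records for (instruction, response) pairs derived from the codebase. A single hypothetical record might look like:

```python
# Hypothetical instruction-tuning record -- field names and content are illustrative.
example = {
    "instruction": "Write a function that loads the service configuration "
                   "following this project's conventions.",
    "response": "def load_config(path: str) -> dict:\n    ...",
}
```

How to generate such pairs automatically from an arbitrary repository is an open question for the project.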
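Finally, RAG could be layered on independently of fine-tuning: embed the parsed code chunks once, retrieve the most similar ones at query time, and prepend them to the prompt. A minimal sketch, assuming an off-the-shelf `sentence-transformers` embedder:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed general-purpose embedder

def build_index(chunks: list[str]) -> np.ndarray:
    """Embed code chunks once; returns a matrix of unit-normalised vectors."""
    return embedder.encode(chunks, normalize_embeddings=True)

def retrieve(query: str, chunks: list[str], index: np.ndarray, k: int = 3) -> list[str]:
    """Return the k chunks most similar to the query (cosine similarity)."""
    scores = index @ embedder.encode(query, normalize_embeddings=True)
    return [chunks[i] for i in np.argsort(scores)[::-1][:k]]

# Hypothetical chunks from the parsing step; in practice these come from train.jsonl.
chunks = ["def load_config(path): ...", "class ApiClient: ..."]
question = "How is configuration loaded?"
context = "\n\n".join(retrieve(question, chunks, build_index(chunks)))
prompt = f"Context from the codebase:\n{context}\n\nQuestion: {question}"
```

This would help even with an unmodified base model, since the relevant code travels with the prompt instead of having to be memorised during fine-tuning.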