Hi Suryanarayanan,
Thanks. Your question is not that trivial. Basically, when we say fine-tune, we take a good foundation model like LLaMA 2 or Mistral with 7 or 13 billion parameters and, using quantisation and LoRA, squeeze it onto our modest training hardware. These models already have a good understanding of language and some basic reasoning skills.
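To make that concrete, here is a rough sketch of the usual quantised-LoRA (QLoRA-style) setup, assuming the Hugging Face transformers, peft and bitsandbytes libraries; the model name and LoRA settings are just illustrative, not a recommendation:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Quantise the base weights to 4-bit so a 7B model fits on modest hardware
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_name = "mistralai/Mistral-7B-v0.1"   # or a LLaMA 2 checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=bnb_config
)

# LoRA: train small low-rank adapter matrices instead of the full weights
lora_config = LoraConfig(
    r=16,                                  # rank of the adapter matrices
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()         # typically well under 1% trainable
```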
There are many ways to fine-tune such a model. The standard way is to do instruction tuning with generated QA pairs (either human-written or generated by another LLM such as GPT-3.5 or GPT-4).
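The data preparation is the important part there. A minimal sketch of turning QA pairs into instruction-tuning examples; the Alpaca-style prompt template and the example pairs are just assumptions for illustration:

```python
# QA pairs, human-written or generated by another LLM (GPT-3.5/4)
qa_pairs = [
    {"question": "What does clause 4.2 cover?",
     "answer": "It covers termination notice periods."},
    # ... more pairs
]

# Alpaca-style template; any consistent template works
template = (
    "### Instruction:\n{question}\n\n"
    "### Response:\n{answer}"
)

train_texts = [template.format(**pair) for pair in qa_pairs]
# Each string is then tokenised and trained on with the usual causal-LM
# loss, often masking the loss over the instruction part so the model
# only learns to produce the response.
```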
The other way is to first do causal (next-token) training on your particular domain data, i.e. unsupervised training on raw domain text. Example here: https://colab.research.google.com/drive/1So6PdRWe8LJZ-uLTlxADATKvLbvrKAh3?usp=sharing#scrollTo=A33jO55r5qqQ
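For reference, a minimal sketch of that causal objective with the Hugging Face Trainer (the file name and hyperparameters are assumptions; in practice you would wrap this in the quantised LoRA setup from the first sketch):

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA-style tokenisers lack one

# Raw domain text, one document per line; "domain_corpus.txt" is a placeholder
dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=1024),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=AutoModelForCausalLM.from_pretrained(model_name),
    args=TrainingArguments(output_dir="domain-ft",
                           per_device_train_batch_size=1,
                           gradient_accumulation_steps=8,
                           num_train_epochs=1),
    train_dataset=dataset["train"],
    # mlm=False gives the plain next-token (causal) objective
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()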
Whether you get good output depends on many factors, including how much data you have, its length, how well it overlaps with the knowledge the foundation model already has, etc.
The hardest thing when you do domain fine-tuning is that you expect precise answers, but the LLM mixes your training data with its existing pre-trained knowledge, so reducing hallucinations is a challenge.
Instruction tuning is a much safer bet.