Two types of large language models (LLMs)
- Base LLMs
  - Predict the next word based on text training data
- Instruction-tuned LLMs
  - Try to follow instructions
  - Training: start with a base LLM
  - Fine-tune it on instructions and good attempts at following those instructions
  - Often refined further using a technique called RLHF
  - RLHF: Reinforcement Learning from Human Feedback
  - Trained to be helpful, honest, and harmless
When an LLM doesn't work, sometimes it's because the instructions weren't clear enough.
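
To illustrate the point about unclear instructions, here is a minimal sketch that contrasts a vague prompt with a more explicit one. It only builds prompt strings and does not call any model. The helper name `build_prompt`, the triple-quote delimiter, and the example constraints are assumptions made for illustration, not part of the original notes.

```python
def build_prompt(instruction: str, text: str, constraints: list[str]) -> str:
    """Assemble a prompt that states the task, its constraints, and the input explicitly.

    Hypothetical helper for illustration; adapt the wording to your own task.
    """
    lines = [instruction.strip()]
    if constraints:
        lines.append("Requirements:")
        lines.extend(f"- {c}" for c in constraints)
    # Delimit the input so the instructions are clearly separated from the text to process.
    lines.append("Text (between triple quotes):")
    lines.append(f'"""{text.strip()}"""')
    return "\n".join(lines)


# A vague prompt: no audience, no length, no indication of what "this" is.
vague_prompt = "Summarize this."

# A clearer prompt: explicit task, explicit constraints, delimited input.
clear_prompt = build_prompt(
    "Summarize the text below for a non-technical reader.",
    "Instruction-tuned LLMs are base LLMs fine-tuned to follow instructions.",
    ["Use one sentence.", "Do not exceed 25 words."],
)

print(clear_prompt)
```

The clearer version spells out the audience, the length, and which text to act on, which are the details an instruction-tuned LLM would otherwise have to guess.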