In Depth
LoRA (Low-Rank Adaptation) freezes the original model weights and injects a pair of small trainable low-rank matrices into each adapted layer, so the effective weight becomes W + BA. Because only these low-rank matrices are trained, fine-tuning cost drops by roughly 10-100x, and many fine-tuned adapters can share a single frozen base model. QLoRA combines LoRA with 4-bit quantization of the base weights for even greater memory efficiency.
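The update above can be sketched in a few lines. This is a minimal numpy illustration of the low-rank idea, not the API of any specific library (such as PEFT); the layer size and rank are arbitrary, and B is initialized to zero so the adapter starts as a no-op, as in the original LoRA setup.

```python
import numpy as np

d_in, d_out, rank = 1024, 1024, 8

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))      # frozen base weight
A = rng.standard_normal((rank, d_in)) * 0.01  # trainable, small init
B = np.zeros((d_out, rank))                  # trainable, zero init -> BA = 0 at start

x = rng.standard_normal(d_in)

y_base = W @ x              # frozen model's output
y_lora = (W + B @ A) @ x    # adapted output; identical until B is trained

# Parameter savings: train A and B instead of all of W.
full_params = d_out * d_in          # 1,048,576
lora_params = rank * d_in + d_out * rank  # 16,384 (64x fewer here)
print(full_params, lora_params)
```

Because BA has rank at most `rank`, the trainable parameter count scales with `rank * (d_in + d_out)` instead of `d_in * d_out`, which is where the 10-100x savings comes from; merging BA back into W after training adds no inference latency.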