QLoRA (Quantized Low-Rank Adaptation)

A memory-efficient fine-tuning method that combines quantization with LoRA. The frozen base model's weights are quantized to 4-bit (the NF4 data type), while small low-rank adapter matrices are trained on top in higher precision; gradients are backpropagated through the quantized weights into the adapters only. Together with double quantization of the quantization constants and paged optimizers, this cuts fine-tuning memory by an order of magnitude or more compared to standard 16-bit fine-tuning, enabling fine-tuning of large models (such as 65B-parameter models) on a single consumer or workstation GPU.
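The core idea can be sketched in a few lines of NumPy: store the frozen base weight in a blockwise 4-bit quantized form, dequantize it on the fly for the forward pass, and add a trainable low-rank correction `B @ A`. This is a toy illustration only; the simple absmax quantizer below stands in for the real NF4 data type, and all names (`quantize_4bit`, `forward`, the `alpha`/`r` scaling) are assumptions for this sketch. In practice one would use libraries such as bitsandbytes and PEFT rather than hand-rolling this.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize_4bit(w, block=64):
    """Blockwise absmax quantization to 16 signed levels.

    A simplified stand-in for NF4: each block of `block` values is
    scaled by its absolute max and rounded to integers in [-7, 7].
    """
    flat = w.ravel().astype(np.float32)
    pad = (-len(flat)) % block
    flat = np.concatenate([flat, np.zeros(pad, dtype=np.float32)])
    blocks = flat.reshape(-1, block)
    scales = np.abs(blocks).max(axis=1, keepdims=True) + 1e-12
    q = np.round(blocks / scales * 7).astype(np.int8)
    return q, scales, w.shape, pad

def dequantize_4bit(q, scales, shape, pad):
    flat = ((q.astype(np.float32) / 7) * scales).ravel()
    if pad:
        flat = flat[:-pad]
    return flat.reshape(shape)

# Frozen base weight, stored only in quantized form; just A and B are trained.
d_out, d_in, r = 32, 64, 4
W = rng.standard_normal((d_out, d_in)).astype(np.float32)
qW = quantize_4bit(W)

A = (0.01 * rng.standard_normal((r, d_in))).astype(np.float32)
B = np.zeros((d_out, r), dtype=np.float32)  # B starts at zero: adapter delta is 0 at init
alpha = 16.0

def forward(x):
    # Dequantize the base weight on the fly, then add the low-rank LoRA update.
    W_deq = dequantize_4bit(*qW)
    return W_deq @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in).astype(np.float32)
y = forward(x)
```

Only `A` and `B` (here 4 x 64 + 32 x 4 values) would receive gradients, while the 4-bit base weight stays frozen, which is where the memory savings come from.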