Reinforcement Learning from Human Feedback (RLHF)

A technique used to align AI models, especially LLMs, more closely with human preferences and instructions. Human annotators compare or rate model outputs; these preference judgments are used to train a reward model, and the LLM is then fine-tuned with reinforcement learning (commonly PPO) to maximize that learned reward. The goal is typically to improve helpfulness and to reduce harmful or biased responses.
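
As a rough illustration of the first stage, the sketch below trains a toy reward model on pairwise human preferences using a Bradley-Terry style loss. It is a minimal example under simplifying assumptions: responses are represented as fixed-size embeddings and the batch is random data, whereas in practice the reward model is usually a full LLM with a scalar head trained on real comparison data.

```python
import torch
import torch.nn as nn

# Toy reward model: maps a fixed-size "response embedding" to a scalar reward.
# (Assumption for illustration; real reward models score full prompt+response text.)
class RewardModel(nn.Module):
    def __init__(self, embed_dim: int = 128):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(embed_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, response_embedding: torch.Tensor) -> torch.Tensor:
        return self.scorer(response_embedding).squeeze(-1)  # one scalar reward per response


def preference_loss(reward_chosen: torch.Tensor, reward_rejected: torch.Tensor) -> torch.Tensor:
    # Pairwise (Bradley-Terry) loss: push the reward of the human-preferred
    # response above the reward of the rejected one.
    return -torch.nn.functional.logsigmoid(reward_chosen - reward_rejected).mean()


# One training step on a batch of preference pairs (random placeholders here).
model = RewardModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

chosen = torch.randn(32, 128)    # embeddings of responses humans preferred
rejected = torch.randn(32, 128)  # embeddings of responses humans rejected

loss = preference_loss(model(chosen), model(rejected))
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f"preference loss: {loss.item():.4f}")
```

In the second stage, the LLM (the policy) is updated with an RL algorithm such as PPO to generate responses that the trained reward model scores highly, usually with a KL penalty against the original model to keep outputs from drifting too far.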