Attention Mechanism

A key component of Transformer models (such as those used in LLMs). Attention lets the model weigh the importance of different parts of the input sequence when processing each token, which is crucial for capturing context and long-range dependencies. Concretely, each token is projected into query, key, and value vectors; a token's output is a weighted sum of the value vectors, with weights computed by comparing its query against every key.
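
As a minimal sketch of the idea (not from the original entry), the core scaled dot-product attention used in Transformers can be written in a few lines of NumPy. The function name and the toy data are illustrative assumptions:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Illustrative sketch: weights each value vector by query-key similarity."""
    d_k = Q.shape[-1]
    # Similarity of every query to every key, scaled to stabilize the softmax.
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax over keys: each row of weights sums to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Output for each token is a weighted average of the value vectors.
    return weights @ V, weights

# Toy example: 3 tokens with 4-dimensional representations (random data).
rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(3, 4))
output, weights = scaled_dot_product_attention(Q, K, V)
print(output.shape)   # (3, 4) — one output vector per token
```

In self-attention, Q, K, and V are all derived from the same input sequence (as in the toy example above), which is how each token can attend to every other token in the sequence.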