Multi‑Query Attention (MQA)
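An attention variant in which all query heads share a single key/value head instead of each head having its own. This shrinks the key/value cache by roughly a factor of the head count and speeds up autoregressive decoding, which is typically limited by memory bandwidth.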
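To make the sharing concrete, here is a minimal NumPy sketch of causal multi-query attention. It is an illustration under stated assumptions, not a reference implementation: the function names (`multi_query_attention`, `kv_cache_elements`), the weight shapes, and the omission of batching and an output projection are all choices made for this example.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_query_attention(x, w_q, w_k, w_v, n_heads):
    """Causal multi-query attention for one sequence (hypothetical helper).

    x:   (seq, d_model) input activations
    w_q: (d_model, n_heads * d_head) per-head query projections
    w_k: (d_model, d_head) single key projection shared by all heads
    w_v: (d_model, d_head) single value projection shared by all heads
    """
    seq = x.shape[0]
    d_head = w_k.shape[1]

    # Queries keep one slice per head: (n_heads, seq, d_head).
    q = (x @ w_q).reshape(seq, n_heads, d_head).transpose(1, 0, 2)

    # Keys and values are computed once and reused by every head.
    k = x @ w_k  # (seq, d_head)
    v = x @ w_v  # (seq, d_head)

    # Broadcasting applies the shared keys to each head's queries.
    scores = q @ k.T / np.sqrt(d_head)  # (n_heads, seq, seq)

    # Causal mask: position i may attend only to positions <= i.
    causal = np.tril(np.ones((seq, seq), dtype=bool))
    scores = np.where(causal, scores, -np.inf)

    weights = softmax(scores, axis=-1)
    out = weights @ v  # (n_heads, seq, d_head)

    # Concatenate heads back into (seq, n_heads * d_head).
    return out.transpose(1, 0, 2).reshape(seq, n_heads * d_head)


def kv_cache_elements(seq_len, n_kv_heads, d_head):
    # Keys plus values stored per layer for one sequence.
    return 2 * seq_len * n_kv_heads * d_head


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seq, d_model, n_heads, d_head = 5, 16, 4, 8
    x = rng.standard_normal((seq, d_model))
    w_q = rng.standard_normal((d_model, n_heads * d_head))
    w_k = rng.standard_normal((d_model, d_head))
    w_v = rng.standard_normal((d_model, d_head))

    out = multi_query_attention(x, w_q, w_k, w_v, n_heads)
    print(out.shape)

    # Multi-head attention caches n_heads K/V heads; MQA caches one.
    mha = kv_cache_elements(seq, n_heads, d_head)
    mqa = kv_cache_elements(seq, 1, d_head)
    print(mha // mqa)
```

The only structural difference from standard multi-head attention in this sketch is that `w_k` and `w_v` project to a single head, so the cached keys and values are `n_heads` times smaller. GQA sits between the two by sharing each key/value head across a group of query heads.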
Related terms
GQA
FlashAttention