PagedAttention

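A memory management technique for the attention key-value (KV) cache, introduced with vLLM, that stores each sequence's cache in fixed-size blocks that need not be contiguous in memory, much like paging in operating-system virtual memory. Because blocks are allocated on demand rather than reserved up front for the maximum sequence length, less memory is wasted to fragmentation and more requests fit in a batch, which enables high‑throughput serving.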
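
The block-table idea behind this technique can be illustrated with a minimal sketch. It is a toy allocator, not vLLM's implementation: the class `ToyPagedKVCache` and its methods are hypothetical names chosen for illustration, and it only tracks which physical block each token slot maps to, without storing any tensors.

```python
class ToyPagedKVCache:
    """Toy block allocator for a KV cache (hypothetical, not vLLM's API)."""

    def __init__(self, num_blocks, block_size):
        self.block_size = block_size
        # Pool of physical block ids that are currently unused.
        self.free_blocks = list(range(num_blocks))
        # Per-sequence block table: logical block index -> physical block id.
        self.block_tables = {}
        # Number of tokens stored so far for each sequence.
        self.seq_lens = {}

    def append_token(self, seq_id):
        """Reserve a slot for one new token; return (physical_block, offset)."""
        n = self.seq_lens.get(seq_id, 0)
        table = self.block_tables.setdefault(seq_id, [])
        # A new block is needed only when the last one is full (or none exists).
        if n % self.block_size == 0:
            if not self.free_blocks:
                raise MemoryError("no free KV blocks")
            table.append(self.free_blocks.pop())
        self.seq_lens[seq_id] = n + 1
        return table[n // self.block_size], n % self.block_size

    def free(self, seq_id):
        """Return all blocks of a finished sequence to the free pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.seq_lens.pop(seq_id, None)


if __name__ == "__main__":
    cache = ToyPagedKVCache(num_blocks=4, block_size=2)
    for _ in range(3):
        print("seq a ->", cache.append_token("a"))
    print("block table for a:", cache.block_tables["a"])
    cache.free("a")
    print("free blocks after release:", len(cache.free_blocks))
```

In this sketch a sequence of three tokens with a block size of two occupies two blocks, and the blocks return to the shared pool as soon as the sequence finishes, so other requests can reuse them immediately.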