Prompt Caching

Reusing precomputed representations of a prompt or prompt prefix to reduce latency and cost when the same context is repeated or shared across requests.
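As a rough illustration of the idea, the sketch below keeps the state reached after each token prefix and, on a new request, recomputes only the tokens that follow the longest prefix already seen. Everything in it is an assumption made for illustration: `PrefixCache`, `encode`, and the checksum-style `_step` are hypothetical names, not part of any real library, and the integer "state" merely stands in for whatever a real system would cache (typically model-internal state such as attention key/value tensors).

```python
from typing import Dict, Sequence, Tuple


class PrefixCache:
    """Toy prefix cache (hypothetical): maps a token prefix to the state reached after it."""

    def __init__(self) -> None:
        # prefix (as a tuple of tokens) -> state after consuming that prefix
        self._states: Dict[Tuple[str, ...], int] = {}
        # counts how many tokens required the simulated "expensive" step
        self.tokens_computed = 0

    def _step(self, prev_state: int, token: str) -> int:
        # Stand-in for an expensive per-token computation. The result depends
        # on the whole preceding prefix through prev_state, which is why only
        # an exact prefix match can be reused.
        self.tokens_computed += 1
        return (prev_state * 31 + sum(ord(c) for c in token)) % 1_000_003

    def encode(self, tokens: Sequence[str]) -> Tuple[int, int]:
        """Return (final_state, number_of_tokens_reused_from_cache)."""
        toks = tuple(tokens)
        start, state = 0, 0
        # Look for the longest prefix whose state is already cached.
        for n in range(len(toks), 0, -1):
            cached = self._states.get(toks[:n])
            if cached is not None:
                start, state = n, cached
                break
        # Compute only the uncached suffix, caching each new prefix as we go.
        for i in range(start, len(toks)):
            state = self._step(state, toks[i])
            self._states[toks[: i + 1]] = state
        return state, start


# Two requests that share the same six-token system prompt.
system = "You are a helpful assistant .".split()
cache = PrefixCache()
_, reused_first = cache.encode(system + ["Hello"])     # nothing cached yet
_, reused_second = cache.encode(system + ["Goodbye"])  # shared prefix is reused
print(reused_first, reused_second, cache.tokens_computed)
```

In this toy setup the second request skips the six shared tokens and pays only for its one new token. The sketch stores an entry for every prefix length, which is simple but memory-hungry; it is meant to show the lookup-then-extend pattern, not a production design.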