Batching (LLM Serving)

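Combining multiple inference requests into a single model forward pass so the accelerator processes them in parallel, improving throughput and reducing cost per request, usually at the price of some added latency while a batch fills.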
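
The idea can be sketched in a few lines of Python. This is a minimal illustration of simple fixed-size batching, not any particular serving framework's implementation. The names make_batches, pad_batch, max_batch_size, and pad_id are hypothetical, and the prompts are assumed to be already tokenized into lists of integer ids.

```python
from typing import List


def make_batches(prompts: List[List[int]], max_batch_size: int) -> List[List[List[int]]]:
    """Group tokenized prompts, in arrival order, into batches of at most max_batch_size."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    return [
        prompts[i:i + max_batch_size]
        for i in range(0, len(prompts), max_batch_size)
    ]


def pad_batch(batch: List[List[int]], pad_id: int = 0) -> List[List[int]]:
    """Right-pad every prompt to the longest length in the batch.

    pad_id=0 is an assumption; the real padding id depends on the tokenizer.
    """
    width = max(len(p) for p in batch)
    return [p + [pad_id] * (width - len(p)) for p in batch]


# Three pending requests of different lengths, batched two at a time.
pending = [[11], [21, 22], [31, 32, 33]]
batches = [pad_batch(b) for b in make_batches(pending, max_batch_size=2)]
```

Each padded batch would then go through the model in one forward pass instead of one pass per request. Padding is needed here only because the prompts in a batch must share a common length to form a rectangular input.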