Quantization
Reducing the numerical precision of a model's weights and/or activations (e.g., FP16 → INT8) to lower the memory footprint and increase inference speed, often with minimal quality loss.
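As a rough illustration of the idea, the sketch below applies symmetric per-tensor quantization to a small list of float weights, mapping them to integer codes in the INT8 range and back. It uses only the Python standard library. The function names `quantize_int8` and `dequantize_int8` and the sample weights are hypothetical, chosen for this example only. Real frameworks typically add calibration, per-channel scales, or zero points.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization of floats to integer codes in [-127, 127]."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # The scale maps the largest magnitude to 127; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the symmetric INT8 range.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the integer codes."""
    return [c * scale for c in codes]


# Hypothetical sample weights, for illustration only.
weights = [0.5, -1.27, 0.003, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

# Worst-case rounding error; for in-range values it is at most half a scale step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each code fits in one byte instead of the two bytes FP16 needs, which is where the memory saving comes from. The small `max_error` corresponds to the "minimal quality loss" in the definition.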
Related terms
Inference
Latency
Throughput