CLIP

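Contrastive Language-Image Pretraining - a multimodal model from OpenAI that embeds images and text in a shared vector space, trained contrastively so that matching image-caption pairs score higher than mismatched ones. This enables vision-language applications such as zero-shot image classification and image-text retrieval.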
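
To make the "contrastive" part of the name concrete, here is a minimal sketch of a symmetric contrastive loss over a batch of paired image and text embeddings, plus a nearest-label lookup of the kind used for zero-shot classification. It is an illustration under stated assumptions, not CLIP's actual implementation: the function names are hypothetical, the embeddings are plain lists standing in for encoder outputs, and the temperature is fixed at 0.07 rather than learned.

```python
import math

def normalize(v):
    # Scale a vector to unit length so dot products become cosine similarities.
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def similarity_matrix(image_embs, text_embs):
    # Entry [i][j] is the cosine similarity of image i and text j.
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    return [[sum(a * b for a, b in zip(i, t)) for t in txts] for i in imgs]

def cross_entropy_diagonal(logits):
    # Mean cross-entropy where the correct class for row k is column k.
    total = 0.0
    for k, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[k]
    return total / len(logits)

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    # Assumption: pair k of the batch is (image_embs[k], text_embs[k]).
    # The temperature is a fixed illustrative value here.
    sims = similarity_matrix(image_embs, text_embs)
    logits = [[s / temperature for s in row] for row in sims]
    transposed = [list(col) for col in zip(*logits)]
    # Average the image-to-text and text-to-image directions.
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(transposed))

def zero_shot_classify(image_emb, label_embs):
    # Return the index of the label embedding most similar to the image.
    sims = similarity_matrix([image_emb], label_embs)[0]
    return max(range(len(sims)), key=sims.__getitem__)
```

The loss is low when each image is most similar to its own caption and high when the pairs are mismatched. Zero-shot classification reuses the same similarity: each candidate label is embedded as text and the closest one is picked.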

Related terms