Vision-Language Model (VLM)
AI models that can process and understand both images and text, enabling tasks such as image captioning, visual question answering, and multimodal reasoning. Examples include GPT-4V (GPT-4 with Vision), Claude 3, and Gemini Pro Vision.
Related terms
Multimodal AI
Computer Vision
LLM