Multimodal AI

AI systems that can process, understand, and generate information across multiple types of data (modalities), such as text, images, audio, and video, often combining several in a single input or output. Examples include GPT-4o and Gemini.