VQA (Visual Question Answering)

The task of answering natural‑language questions about the content of an image or video.