Zeroshot Video Question Answer
2 papers with code • 3 benchmarks • 3 datasets
This task has no description! Would you like to contribute one?
Most implemented papers
Zero-Shot Video Question Answering via Frozen Bidirectional Language Models
Manual annotation of question and answers for videos, however, is tedious and prohibits scalability.
MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens
This paper introduces MiniGPT4-Video, a multimodal Large Language Model (LLM) designed specifically for video understanding.