Open Vocabulary Action Recognition

2 papers with code • 2 benchmarks • 2 datasets

Open Vocabulary Action Recognition (OVAR) aims to generalize beyond the predefined set of actions seen during training. The actions (verbs or verb-object pairs) are provided as textual queries during inference, and no prior knowledge about them is assumed during training.
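To make the inference-time setup concrete, the following is a minimal sketch of matching a video against textual action queries. It assumes, as is common in CLIP-style approaches but not stated in the task description, that a video encoder and a text encoder map into a shared embedding space and that the prediction is the query with the highest cosine similarity. The function names and the toy vectors are hypothetical stand-ins for real encoder outputs.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def recognize_action(video_embedding, query_embeddings):
    """Return the textual query whose embedding is closest to the video.

    query_embeddings maps each action string to its text embedding. The
    label set is supplied at inference time, so it may contain actions
    that never appeared during training.
    """
    scores = {
        action: cosine_similarity(video_embedding, embedding)
        for action, embedding in query_embeddings.items()
    }
    best_action = max(scores, key=scores.get)
    return best_action, scores


# Toy, hand-written vectors standing in for encoder outputs (assumption).
video_embedding = [0.9, 0.1, 0.3]
query_embeddings = {
    "cut onion": [1.0, 0.0, 0.2],
    "open fridge": [0.0, 1.0, 0.1],
    "pour milk": [0.1, 0.2, 1.0],
}

best_action, scores = recognize_action(video_embedding, query_embeddings)
print(best_action)
```

Because the candidate actions are plain strings passed in at call time, changing the vocabulary only means changing the dictionary of queries; nothing about the scoring function is tied to a fixed label set.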

Most implemented papers

Opening the Vocabulary of Egocentric Actions

dibschat/openvocab-egoAR • NeurIPS 2023

Given a set of verbs and objects observed during training, the goal is to generalize the verbs to an open vocabulary of actions with seen and novel objects.

Rethinking CLIP-based Video Learners in Cross-Domain Open-Vocabulary Action Recognition

kunyulin/xov-action • 3 Mar 2024

To evaluate how well CLIP-based video learners generalize to video domains unseen during training, we establish a CROSS-domain Open-Vocabulary Action recognition benchmark named XOV-Action, and conduct a comprehensive evaluation of five state-of-the-art CLIP-based video learners under various types of domain gaps.