1 code implementation • 14 Nov 2021 • Arulkumar Subramaniam, Jayesh Vaidya, Muhammed Abdul Majeed Ameen, Athira Nambiar, Anurag Mittal
In this work, we propose to utilize the common rationale that a sequence of video frames capture a set of common objects and interactions between them, thus a notion of co-segmentation between the video frame features may equip the model with the ability to automatically focus on task-specific salient regions and improve the underlying task's performance in an end-to-end manner.