Video Panoptic Segmentation Network, or VPSNet, is a model for video panoptic segmentation. Built on top of UPSNet, a method for image panoptic segmentation, VPSNet takes an additional reference frame to correlate temporal information at two levels: pixel-level fusion and object-level tracking. To pick up complementary feature points from the reference frame, a flow-based feature map alignment module is introduced, along with an asymmetric attention block that computes similarities between the target and reference features and fuses them into a single-frame representation. Additionally, to associate object instances across time, an object track head is added, which learns the correspondence between instances in the target and reference frames based on their RoI feature similarity.
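The object-level tracking idea above can be sketched as follows: each detected instance is represented by a pooled RoI feature vector, and target-frame instances are matched to reference-frame instances by feature similarity. This is only a minimal illustration, not VPSNet's actual track head; the function name, the use of cosine similarity, and the greedy argmax matching are simplifying assumptions.

```python
import numpy as np

def associate_instances(target_feats, ref_feats):
    """Match target-frame instances to reference-frame instances by
    cosine similarity of their pooled RoI feature vectors.

    NOTE: hypothetical sketch of RoI-similarity-based tracking, not
    the trained track head used in VPSNet.

    target_feats: (N, D) array, one row per target-frame instance
    ref_feats:    (M, D) array, one row per reference-frame instance
    Returns: list of length N giving, for each target instance, the
             index of its best-matching reference instance.
    """
    # L2-normalize so the dot product equals cosine similarity
    t = target_feats / np.linalg.norm(target_feats, axis=1, keepdims=True)
    r = ref_feats / np.linalg.norm(ref_feats, axis=1, keepdims=True)
    sim = t @ r.T                       # (N, M) similarity matrix
    return sim.argmax(axis=1).tolist()  # greedy best match per target

# Toy example: two target instances, three reference instances
tgt = np.array([[1.0, 0.0], [0.0, 1.0]])
ref = np.array([[0.0, 1.0], [0.9, 0.1], [0.5, 0.5]])
print(associate_instances(tgt, ref))  # → [1, 0]
```

In the real model the similarity scores would be learned jointly with the detection heads rather than computed as raw cosine distances, and matching would also account for new and disappearing instances.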
Source: Video Panoptic Segmentation
| Task | Papers | Share |
|---|---|---|
| Instance Segmentation | 1 | 14.29% |
| Panoptic Segmentation | 1 | 14.29% |
| Semantic Segmentation | 1 | 14.29% |
| Video Instance Segmentation | 1 | 14.29% |
| Video Panoptic Segmentation | 1 | 14.29% |
| Video Recognition | 1 | 14.29% |
| Video Semantic Segmentation | 1 | 14.29% |