ITOP (Invariant-Top View Dataset)

Introduced by Haque et al. in Towards Viewpoint Invariant 3D Human Pose Estimation

The ITOP dataset consists of 40K training and 10K testing depth images for each of the front-view and top-view tracks. This dataset contains depth images with 20 actors who perform 15 sequences each and is recorded by two Asus Xtion Pro cameras. The ground-truth of this dataset is the 3D coordinates of 15 body joints.

Source: V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation from a Single Depth Map

Homepage