The ITOP dataset consists of 40K training and 10K testing depth images for each of the front-view and top-view tracks. This dataset contains depth images with 20 actors who perform 15 sequences each and is recorded by two Asus Xtion Pro cameras. The ground-truth of this dataset is the 3D coordinates of 15 body joints.
Source: V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation from a Single Depth MapPaper | Code | Results | Date | Stars |
---|