no code implementations • ICCV 2023 • Sangwon Kim, Dasom Ahn, Byoung Chul Ko
The 3D deformable transformer consists of three attention modules: 3D deformability, local joint stride, and temporal stride attention.
Ranked #8 on Action Recognition on NTU RGB+D
no code implementations • WACV 2023 • Dasom Ahn, Sangwon Kim, Hyunsu Hong, Byoung Chul Ko
In action recognition, combining spatio-temporal video and skeleton features can improve recognition performance, but doing so requires a separate model and a balanced feature representation for the cross-modal data.
Ranked #1 on Action Recognition on Penn Action