Enhanced dynamic sign language recognition using SlowFast networks
In this paper, we use the SlowFast networks developed by the Facebook research team to improve the accuracy of dynamic sign language recognition. First, we prepared the Word-Level American Sign Language (WLASL) dataset so that each sign can be treated as an action class. We then initialized our model with the weights of the pre-trained SLOWFAST_8×8_R50 model provided in the official PySlowFast GitHub repository, fine-tuned it on the WLASL dataset, and performed a parameter sweep to adapt it to the dynamic sign language recognition task. With this transfer learning approach, we set a new state of the art on the WLASL300 dataset (300 words, i.e., 300 classes). Top-1 accuracy improved from 56.14% to 79.34%, a gain of 23.20 percentage points over the previous state of the art reported in the WLASL paper using an I3D model, and top-5 accuracy improved from 79.94% to 90.31%.
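For readers unfamiliar with the metrics, the following is a minimal sketch of how top-k accuracy and a gain in percentage points can be computed. It is not the evaluation code used in this work: the function names `top_k_accuracy` and `gain_in_points` and the toy scores are hypothetical, chosen only for illustration. Only the four accuracy figures are taken from the abstract.

```python
from typing import Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int], k: int) -> float:
    """Percentage of samples whose true class is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return 100.0 * hits / len(labels)


def gain_in_points(before: float, after: float) -> float:
    """Absolute difference between two accuracies, in percentage points."""
    return round(after - before, 2)


# Hypothetical toy data: 4 samples, 3 classes (not from the WLASL dataset).
toy_scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
toy_labels = [1, 1, 2, 2]

toy_top1 = top_k_accuracy(toy_scores, toy_labels, k=1)
toy_top2 = top_k_accuracy(toy_scores, toy_labels, k=2)

# Accuracy figures quoted in the abstract (I3D baseline vs. SlowFast).
top1_gain = gain_in_points(56.14, 79.34)
top5_gain = gain_in_points(79.94, 90.31)

if __name__ == "__main__":
    print(f"toy top-1: {toy_top1:.1f}%, toy top-2: {toy_top2:.1f}%")
    print(f"top-1 gain: {top1_gain:.2f} points, top-5 gain: {top5_gain:.2f} points")
```

The gains are absolute differences between two accuracies, so they are expressed in percentage points rather than as relative percentages.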