| TASK | DATASET | MODEL | METRIC NAME | METRIC VALUE | GLOBAL RANK |
|---|---|---|---|---|---|
| Self-Supervised Audio Classification | ESC-50 | AVID | Top-1 Accuracy | 89.2 | # 3 |
| Audio Classification | ESC-50 | AVID | Top-1 Accuracy | 89.2 | # 17 |
| Self-Supervised Action Recognition | HMDB51 | AVID+CMA (Modified R2+1D-18 on Kinetics) | Top-1 Accuracy | 60.8 | # 25 |
| Self-Supervised Action Recognition | HMDB51 | AVID+CMA (Modified R2+1D-18 on Kinetics) | Pre-Training Dataset | Kinetics400 (Video+Audio) | # 1 |
| Self-Supervised Action Recognition | HMDB51 | AVID+CMA (Modified R2+1D-18 on Kinetics) | Frozen | false | # 1 |
| Self-Supervised Action Recognition | HMDB51 | AVID (Modified R2+1D-18 on Audioset) | Top-1 Accuracy | 64.1 | # 20 |
| Self-Supervised Action Recognition | HMDB51 | AVID (Modified R2+1D-18 on Audioset) | Pre-Training Dataset | Audioset (Video+Audio) | # 1 |
| Self-Supervised Action Recognition | HMDB51 | AVID (Modified R2+1D-18 on Audioset) | Frozen | false | # 1 |
| Self-Supervised Action Recognition | HMDB51 | AVID+CMA (Modified R2+1D-18 on Audioset) | Top-1 Accuracy | 64.7 | # 16 |
| Self-Supervised Action Recognition | HMDB51 | AVID+CMA (Modified R2+1D-18 on Audioset) | Pre-Training Dataset | Audioset (Video+Audio) | # 1 |
| Self-Supervised Action Recognition | HMDB51 | AVID+CMA (Modified R2+1D-18 on Audioset) | Frozen | false | # 1 |
| Self-Supervised Action Recognition | HMDB51 | AVID (Modified R2+1D-18 on Kinetics) | Top-1 Accuracy | 59.9 | # 27 |
| Self-Supervised Action Recognition | HMDB51 | AVID (Modified R2+1D-18 on Kinetics) | Pre-Training Dataset | Kinetics400 (Video+Audio) | # 1 |
| Self-Supervised Action Recognition | HMDB51 | AVID (Modified R2+1D-18 on Kinetics) | Frozen | false | # 1 |
| Self-Supervised Action Recognition | HMDB51 (finetuned) | AVID | Top-1 Accuracy | 64.7 | # 8 |
| Self-Supervised Action Recognition | UCF101 | AVID (Modified R2+1D-18 on Kinetics) | 3-fold Accuracy | 86.9 | # 27 |
| Self-Supervised Action Recognition | UCF101 | AVID (Modified R2+1D-18 on Kinetics) | Pre-Training Dataset | Kinetics400 (Audio+Video) | # 1 |
| Self-Supervised Action Recognition | UCF101 | AVID (Modified R2+1D-18 on Kinetics) | Frozen | false | # 1 |
| Self-Supervised Action Recognition | UCF101 | AVID+CMA (Modified R2+1D-18 on Kinetics) | 3-fold Accuracy | 87.5 | # 26 |
| Self-Supervised Action Recognition | UCF101 | AVID+CMA (Modified R2+1D-18 on Kinetics) | Pre-Training Dataset | Kinetics400 (Audio+Video) | # 1 |
| Self-Supervised Action Recognition | UCF101 | AVID+CMA (Modified R2+1D-18 on Kinetics) | Frozen | false | # 1 |
| Self-Supervised Action Recognition | UCF101 | AVID (Modified R2+1D-18 on Audioset) | 3-fold Accuracy | 91.0 | # 21 |
| Self-Supervised Action Recognition | UCF101 | AVID (Modified R2+1D-18 on Audioset) | Pre-Training Dataset | Audioset (Audio+Video) | # 1 |
| Self-Supervised Action Recognition | UCF101 | AVID (Modified R2+1D-18 on Audioset) | Frozen | false | # 1 |
| Self-Supervised Action Recognition | UCF101 | AVID+CMA (Modified R2+1D-18 on Audioset) | 3-fold Accuracy | 91.5 | # 18 |
| Self-Supervised Action Recognition | UCF101 | AVID+CMA (Modified R2+1D-18 on Audioset) | Pre-Training Dataset | Audioset (Audio+Video) | # 1 |
| Self-Supervised Action Recognition | UCF101 | AVID+CMA (Modified R2+1D-18 on Audioset) | Frozen | false | # 1 |
| Self-Supervised Action Recognition | UCF101 (finetuned) | AVID | 3-fold Accuracy | 91.5 | # 7 |