Self-supervised Pretraining of Visual Features in the Wild

Recently, self-supervised learning methods like MoCo, SimCLR, BYOL and SwAV have narrowed the gap with supervised methods. These results have been achieved in a controlled environment, namely the highly curated ImageNet dataset. However, the premise of self-supervised learning is that it can learn from any random image and from any unbounded dataset. In this work, we explore whether self-supervision lives up to this expectation by training large models on random, uncurated images with no supervision. Our final SElf-supERvised (SEER) model, a RegNetY with 1.3B parameters trained on 1B random images with 512 GPUs, achieves 84.2% top-1 accuracy, surpassing the best self-supervised pretrained model by 1% and confirming that self-supervised learning works in a real-world setting. Interestingly, we also observe that self-supervised models are good few-shot learners, achieving 77.9% top-1 with access to only 10% of ImageNet. Code: https://github.com/facebookresearch/vissl
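SEER is pretrained with the SwAV objective: two augmented views of each image are scored against a set of learnable prototypes, each view's scores are converted into balanced soft cluster assignments ("codes") via the Sinkhorn-Knopp algorithm, and each view is trained to predict the other view's codes. The sketch below is a minimal, hedged illustration of that swapped-prediction loss in NumPy; the function names, the toy dimensions (8 samples, 16 prototypes), and the use of random score matrices in place of real network outputs are all assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def sinkhorn(scores, eps=0.05, n_iters=3):
    """Sinkhorn-Knopp: turn a (batch x prototypes) score matrix into
    soft cluster assignments whose rows and columns are balanced."""
    Q = np.exp(scores / eps).T              # prototypes x batch
    Q /= Q.sum()
    K, B = Q.shape
    for _ in range(n_iters):
        Q /= Q.sum(axis=1, keepdims=True)   # balance prototype marginals
        Q /= K
        Q /= Q.sum(axis=0, keepdims=True)   # balance sample marginals
        Q /= B
    return (Q * B).T                        # batch x prototypes; each row sums to 1

def swapped_prediction_loss(scores_a, scores_b, temp=0.1):
    """SwAV-style loss: predict view B's codes from view A's scores
    (and vice versa) via cross-entropy on softmaxed scores."""
    q_a, q_b = sinkhorn(scores_a), sinkhorn(scores_b)
    def xent(targets, scores):
        p = np.exp(scores / temp)
        p /= p.sum(axis=1, keepdims=True)
        return -np.mean(np.sum(targets * np.log(p + 1e-12), axis=1))
    return 0.5 * (xent(q_b, scores_a) + xent(q_a, scores_b))

# Toy stand-ins for prototype scores of two augmented views of 8 images.
rng = np.random.default_rng(0)
scores1 = rng.normal(size=(8, 16))
scores2 = scores1 + 0.1 * rng.normal(size=(8, 16))
loss = swapped_prediction_loss(scores1, scores2)
```

In the real setup the scores come from a RegNetY trunk followed by a projection head, and the equipartition constraint enforced by Sinkhorn-Knopp is what prevents all images from collapsing onto a single prototype.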


Results from the Paper


| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Semi-Supervised Image Classification | ImageNet - 10% labeled data | SEER Small (RegNetY-128GF) | Top 1 Accuracy | 76.7% | #15 |
| Semi-Supervised Image Classification | ImageNet - 10% labeled data | SEER Large (RegNetY-256GF) | Top 1 Accuracy | 77.9% | #11 |
| Semi-Supervised Image Classification | ImageNet - 1% labeled data | SEER Large (RegNetY-256GF) | Top 1 Accuracy | 60.5% | #31 |
| Semi-Supervised Image Classification | ImageNet - 1% labeled data | SEER Small (RegNetY-128GF) | Top 1 Accuracy | 57.5% | #36 |
| Self-Supervised Image Classification | ImageNet (finetuned) | SEER (RegNetY-256GF) | Number of Params | 1.3B | #50 |
| Self-Supervised Image Classification | ImageNet (finetuned) | SEER (RegNetY-256GF) | Top 1 Accuracy | 84.2% | #33 |
| Self-Supervised Image Classification | ImageNet (finetuned) | SEER (RegNetY-128GF) | Number of Params | 693M | #5 |
| Self-Supervised Image Classification | ImageNet (finetuned) | SEER (RegNetY-128GF) | Top 1 Accuracy | 83.8% | #42 |
| Image Classification | Places205 | RegNetY-128GF (Supervised) | Top 1 Accuracy | 62.7 | #9 |
| Image Classification | Places205 | SEER | Top 1 Accuracy | 66.0 | #6 |
