Sparse R-CNN: End-to-End Object Detection with Learnable Proposals

We present Sparse R-CNN, a purely sparse method for object detection in images. Existing work on object detection relies heavily on dense object candidates, such as $k$ anchor boxes pre-defined on every grid cell of an image feature map of size $H\times W$. In our method, however, a fixed sparse set of $N$ learned object proposals is provided to the object recognition head to perform classification and localization. By replacing $HWk$ (up to hundreds of thousands) hand-designed object candidates with $N$ (e.g. 100) learnable proposals, Sparse R-CNN completely avoids all effort related to object candidate design and many-to-one label assignment. More importantly, final predictions are output directly, without non-maximum suppression post-processing. Sparse R-CNN demonstrates accuracy, run-time and training convergence performance on par with well-established detector baselines on the challenging COCO dataset, e.g., achieving 45.0 AP with the standard $3\times$ training schedule and running at 22 fps using a ResNet-50 FPN model. We hope our work will inspire a rethinking of the convention of dense priors in object detectors. The code is available at: https://github.com/PeizeSun/SparseR-CNN.
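To make the $HWk$ versus $N$ comparison concrete, the following is a minimal sketch that counts dense anchor candidates over a feature pyramid and compares the total with a sparse proposal set. The input resolution (800 x 1333), the pyramid strides (8 to 128), and $k = 9$ anchors per location are illustrative assumptions, not values taken from the abstract; only $N = 100$ comes from the text above. The helper `dense_candidates` is a hypothetical name and is not part of the released code.

```python
import math


def dense_candidates(height, width, strides, k):
    """Count dense anchors: k boxes at every cell of each H x W feature map."""
    total = 0
    for stride in strides:
        # Feature map size at this pyramid level (rounded up, as padding would give).
        h = math.ceil(height / stride)
        w = math.ceil(width / stride)
        total += h * w * k
    return total


# Assumed settings for illustration only (not stated in the abstract).
IMAGE_H, IMAGE_W = 800, 1333
STRIDES = (8, 16, 32, 64, 128)
K = 9

# Sparse proposal count from the abstract ("e.g. 100").
N = 100

dense = dense_candidates(IMAGE_H, IMAGE_W, STRIDES, K)
print(f"dense candidates: {dense}")
print(f"sparse proposals: {N}")
print(f"reduction factor: {dense / N:.0f}x")
```

Under these assumed settings the dense count lands in the hundreds of thousands, which matches the order of magnitude quoted in the abstract. Other anchor configurations would give different totals.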

Venue: CVPR 2021

Results from the Paper


| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| 2D Object Detection | CeyMo | Sparse R-CNN | mAP | 47.3 | #5 |
| Object Detection | COCO minival | Sparse R-CNN (ResNet-101, learnable proposals, random crop aug, FPN) | box AP | 45.6 | #104 |
| | | | AP50 | 64.6 | #38 |
| | | | AP75 | 49.5 | #31 |
| | | | AP_S | 28.3 | #25 |
| | | | AP_M | 48.3 | #25 |
| | | | AP_L | 61.6 | #25 |
| Object Detection | COCO minival | Sparse R-CNN (ResNet-50, learnable proposals, random crop aug, FPN) | box AP | 44.5 | #117 |
| | | | AP50 | 63.4 | #50 |
| | | | AP75 | 48.2 | #39 |
| | | | AP_S | 26.9 | #32 |
| | | | AP_M | 47.2 | #34 |
| | | | AP_L | 59.5 | #32 |
| Object Detection | COCO minival | Sparse R-CNN (ResNet-101, FPN) | box AP | 43.5 | #125 |
| | | | AP50 | 62.1 | #61 |
| | | | AP75 | 47.2 | #44 |
| | | | AP_S | 26.1 | #40 |
| | | | AP_M | 46.3 | #41 |
| | | | AP_L | 59.7 | #29 |
| Object Detection | COCO minival | Sparse R-CNN (ResNet-50, FPN) | box AP | 42.3 | #140 |
| | | | AP50 | 61.2 | #68 |
| | | | AP75 | 45.7 | #57 |
| | | | AP_S | 26.7 | #34 |
| | | | AP_M | 44.6 | #54 |
| | | | AP_L | 57.6 | #46 |
| 2D Object Detection | SARDet-100K | Sparse R-CNN | box mAP | 38.1 | #12 |

Methods