Bottom-up Object Detection by Grouping Extreme and Center Points

With the advent of deep learning, object detection drifted from a bottom-up to a top-down recognition problem. State-of-the-art algorithms enumerate a near-exhaustive list of object locations and classify each as object or not. In this paper, we show that bottom-up approaches still perform competitively. We detect four extreme points (top-most, left-most, bottom-most, right-most) and one center point of objects using a standard keypoint estimation network. We group the five keypoints into a bounding box if they are geometrically aligned. Object detection is then a purely appearance-based keypoint estimation problem, without region classification or implicit feature learning. The proposed method performs on par with state-of-the-art region-based detection methods, with a bounding box AP of 43.7% on COCO test-dev. In addition, our estimated extreme points directly span a coarse octagonal mask, with a COCO Mask AP of 18.9%, much better than the Mask AP of vanilla bounding boxes. Extreme point guided segmentation further improves this to 34.6% Mask AP.
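The grouping step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the keypoints have already been extracted as lists of (x, y) coordinates, and it replaces any heatmap lookup with a simple distance check against detected centers. The function name `group_extreme_points`, the `Box` type, and the pixel tolerance `tol` are hypothetical and chosen only for illustration.

```python
from itertools import product
from typing import List, NamedTuple, Tuple

Point = Tuple[float, float]  # (x, y) in image coordinates, y grows downward


class Box(NamedTuple):
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def group_extreme_points(
    tops: List[Point],
    lefts: List[Point],
    bottoms: List[Point],
    rights: List[Point],
    centers: List[Point],
    tol: float = 2.0,  # hypothetical tolerance in pixels, not a value from the paper
) -> List[Box]:
    """Keep a combination of four extreme points only if some detected
    center lies close to the geometric center of the box they span."""
    boxes: List[Box] = []
    # Brute-force enumeration of every top/left/bottom/right combination.
    for t, l, b, r in product(tops, lefts, bottoms, rights):
        # Reject geometrically impossible combinations:
        # the top must not be below the bottom, the left not right of the right.
        if t[1] > b[1] or l[0] > r[0]:
            continue
        # Geometric center of the box spanned by the four extreme points.
        cx = (l[0] + r[0]) / 2.0
        cy = (t[1] + b[1]) / 2.0
        # Accept the box if any detected center is within the tolerance.
        if any(abs(cx - px) <= tol and abs(cy - py) <= tol for px, py in centers):
            boxes.append(Box(l[0], t[1], r[0], b[1]))
    return boxes
```

Calling the function with one candidate per extreme-point class and one center near the middle of the spanned box yields a single box; with no nearby center, the combination is discarded. The enumeration is exhaustive over all four lists, so this sketch is only practical when each list holds a small number of candidates.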

CVPR 2019

Datasets


Results from the Paper


| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Object Detection | COCO minival | ExtremeNet (Hourglass-104, multi-scale) | box AP | 43.3 | #128 |
| | | | AP50 | 59.6 | #79 |
| | | | AP75 | 46.8 | #49 |
| | | | APS | 25.7 | #44 |
| | | | APM | 46.6 | #39 |
| | | | APL | 59.4 | #35 |
| Object Detection | COCO minival | ExtremeNet (Hourglass-104, single-scale) | box AP | 40.3 | #164 |
| | | | AP50 | 55.1 | #100 |
| | | | AP75 | 43.7 | #73 |
| | | | APS | 21.6 | #73 |
| | | | APM | 44.0 | #58 |
| | | | APL | 56.1 | #53 |
| Object Detection | COCO test-dev | ExtremeNet (Hourglass-104, single-scale) | box mAP | 40.2 | #190 |
| | | | AP50 | 55.5 | #152 |
| | | | AP75 | 43.2 | #137 |
| | | | APS | 20.4 | #133 |
| | | | APM | 43.2 | #123 |
| | | | APL | 53.1 | #118 |
| | | | Hardware Burden | 180G | #1 |
| | | | Operations per network pass | None | #1 |
| Object Detection | COCO test-dev | ExtremeNet (Hourglass-104, multi-scale) | box mAP | 43.7 | #151 |
| | | | AP50 | 60.5 | #124 |
| | | | AP75 | 47.0 | #100 |
| | | | APS | 24.1 | #102 |
| | | | APM | 46.9 | #87 |
| | | | APL | 57.6 | #74 |
| | | | Hardware Burden | 180G | #1 |
| | | | Operations per network pass | None | #1 |

Methods