3 code implementations • CVPR 2022 • Youngwan Lee, Jonghee Kim, Jeff Willette, Sung Ju Hwang
While Convolutional Neural Networks (CNNs) have been the dominant architectures for such tasks, recently introduced Vision Transformers (ViTs) aim to replace them as backbones.
Ranked #38 on Instance Segmentation on COCO minival