1 code implementation • ECCV 2020 • Zeyu Wang, Berthy Feng, Karthik Narasimhan, Olga Russakovsky
We find that modern captioning systems return higher likelihoods for incorrect distractor sentences compared to ground truth captions, and that evaluation metrics like SPICE can be 'topped' using simple captioning systems relying on object detectors.