NAACL 2021 • Hiba Ahsan, Nikita Bhalla, Daivat Bhatt, Kaivankumar Shah
In this work, we propose altering AoANet, a state-of-the-art image captioning model, to leverage the text detected in the image as an input feature.
Image Captioning