Search Results for author: Zhixuan Shen

Found 1 paper, 1 paper with code

Adversarial Training with OCR Modality Perturbation for Scene-Text Visual Question Answering

1 code implementation • 14 Mar 2024 • Zhixuan Shen, Haonan Luo, Sijia Li, Tianrui Li

Scene-Text Visual Question Answering (ST-VQA) aims to understand scene text in images and answer questions related to the text content.

Optical Character Recognition (OCR) +2
