1 code implementation • 20 Dec 2023 • Zhangbin Li, Dan Guo, Jinxing Zhou, Jing Zhang, Meng Wang
These selected pairs are constrained to have larger similarity values than the mismatched pairs.
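The constraint described above can be illustrated with a margin-based ranking penalty. This is only a minimal sketch under stated assumptions: the excerpt does not say which similarity measure or loss is used, so the cosine similarity, the hinge form, the `margin` value, and all function and variable names here are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def margin_ranking_penalty(matched_sim, mismatched_sims, margin=0.2):
    """Hinge penalty that is zero only when the matched similarity exceeds
    every mismatched similarity by at least `margin` (hypothetical value)."""
    return sum(max(0.0, margin - (matched_sim - s)) for s in mismatched_sims)


# Toy embeddings, invented for illustration only.
anchor = [1.0, 0.0]
matched = [0.9, 0.1]
mismatched = [[0.0, 1.0], [-1.0, 0.2]]

matched_sim = cosine_similarity(anchor, matched)
mismatched_sims = [cosine_similarity(anchor, m) for m in mismatched]
penalty = margin_ranking_penalty(matched_sim, mismatched_sims)
```

In this toy case the matched pair already scores well above both mismatched pairs, so the penalty is zero. A matched similarity that falls below a mismatched one would produce a positive penalty that a training objective could minimize.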
Audio-visual Question Answering • Audio-Visual Question Answering (AVQA)