no code implementations • 8 May 2024 • Huy Quang Ung, Hao Niu, Minh-Son Dao, Shinya Wada, Atsunori Minamikawa
Traffic predictions play a crucial role in intelligent transportation systems.
no code implementations • 27 Sep 2023 • Yanan Wang, Donghuo Zeng, Shinya Wada, Satoshi Kurihara
In this work, to achieve multimodal transfer learning that is both efficient and high-performing, we propose VideoAdviser, a video knowledge distillation method that transfers multimodal knowledge of video-enhanced prompts from a multimodal fundamental model (teacher) to a specific modal fundamental model (student).
no code implementations • ICCV 2023 • Yanan Wang, Michihiro Yasunaga, Hongyu Ren, Shinya Wada, Jure Leskovec
Visual question answering (VQA) requires systems to perform concept-level reasoning by unifying unstructured (e.g., the context in question and answer; "QA context") and structured (e.g., knowledge graph for the QA context and scene; "concept graph") multimodal knowledge.