no code implementations • 3 Mar 2024 • Zhende Song, Chenchen Wang, Jiamu Sheng, Chi Zhang, Gang Yu, Jiayuan Fan, Tao Chen
The development of multimodal models has marked a significant step forward in how machines understand videos.
Video Understanding