Search Results for author: Enxin Song

Found 3 papers, 2 papers with code

MovieChat+: Question-aware Sparse Memory for Long Video Question Answering

1 code implementation • 26 Apr 2024 • Enxin Song, Wenhao Chai, Tian Ye, Jenq-Neng Hwang, Xi Li, Gaoang Wang

Recently, integrating video foundation models and large language models to build a video understanding system can overcome the limitations of specific pre-defined vision tasks.

2k Question Answering +2

Devil in the Number: Towards Robust Multi-modality Data Filter

no code implementations • 24 Sep 2023 • Yichen Xu, Zihan Xu, Wenhao Chai, Zhonghan Zhao, Enxin Song, Gaoang Wang

In order to appropriately filter multi-modality data sets on a web-scale, it becomes crucial to employ suitable filtering methods to boost performance and reduce training costs.
