1 code implementation • 8 Apr 2024 • Bo He, Hengduo Li, Young Kyun Jang, Menglin Jia, Xuefei Cao, Ashish Shah, Abhinav Shrivastava, Ser-Nam Lim
However, existing LLM-based large multimodal models (e.g., Video-LLaMA, VideoChat) can only take in a limited number of frames for short video understanding.
Ranked #1 on Video Classification on COIN
1 code implementation • 26 Mar 2024 • Junke Wang, Dongdong Chen, Chong Luo, Bo He, Lu Yuan, Zuxuan Wu, Yu-Gang Jiang
The core of video understanding tasks, such as recognition, captioning, and tracking, is to automatically detect objects or actions in a video and analyze their temporal evolution.
2 code implementations • 13 Nov 2023 • Junke Wang, Lingchen Meng, Zejia Weng, Bo He, Zuxuan Wu, Yu-Gang Jiang
Existing visual instruction tuning methods typically prompt large language models with textual descriptions to generate instruction-following data.
Ranked #35 on Visual Question Answering on MM-Vet
no code implementations • ICCV 2023 • Nirat Saini, Hanyu Wang, Archana Swaminathan, Vinoj Jayasundara, Bo He, Kamal Gupta, Abhinav Shrivastava
Recognizing and generating object-state compositions has been a challenging task, especially when generalizing to unseen compositions.
no code implementations • CVPR 2023 • Bo He, Xitong Yang, Hanyu Wang, Zuxuan Wu, Hao Chen, Shuaiyi Huang, Yixuan Ren, Ser-Nam Lim, Abhinav Shrivastava
Implicit neural representations (INR) have gained increasing attention in representing 3D scenes and images, and have been recently applied to encode videos (e.g., NeRV, E-NeRV).
1 code implementation • CVPR 2023 • Bo He, Jun Wang, JieLin Qiu, Trung Bui, Abhinav Shrivastava, Zhaowen Wang
The goal of multimodal summarization is to extract the most important information from different modalities to form output summaries.
Ranked #3 on Supervised Video Summarization on SumMe
Extractive Text Summarization • Supervised Video Summarization
no code implementations • 18 Nov 2022 • Hao Chen, Matt Gwilliam, Bo He, Ser-Nam Lim, Abhinav Shrivastava
We match the performance of NeRV, a state-of-the-art implicit neural representation, on the reconstruction task for frames seen during training, while far surpassing it for frames that are skipped during training (unseen images).
1 code implementation • 15 Aug 2022 • Shuaiyi Huang, Luyu Yang, Bo He, Songyang Zhang, Xuming He, Abhinav Shrivastava
In this paper, we aim to address the challenge of label sparsity in semantic correspondence by enriching supervision signals from sparse keypoint annotations.
no code implementations • 24 May 2022 • Bo He, Xiang Song, Vincent Gao, Christos Faloutsos
It outperforms the lightgbm2 by up to 34 pcp ROC-AUC in a cold start case when a new seller sells a new product.
1 code implementation • CVPR 2022 • Bo He, Xitong Yang, Le Kang, Zhiyu Cheng, Xin Zhou, Abhinav Shrivastava
Without the boundary information of action segments, existing methods mostly rely on multiple instance learning (MIL), where the predictions of unlabeled instances (i.e., video snippets) are supervised by classifying labeled bags (i.e., untrimmed videos).
3 code implementations • NeurIPS 2021 • Hao Chen, Bo He, Hanyu Wang, Yixuan Ren, Ser-Nam Lim, Abhinav Shrivastava
In contrast, with NeRV, we can use any neural network compression method as a proxy for video compression, and achieve comparable performance to traditional frame-based video compression approaches (H.264, HEVC, etc.).
Ranked #6 on Video Reconstruction on UVG
2 code implementations • 28 Jun 2021 • Xin Zhou, Le Kang, Zhiyu Cheng, Bo He, Jingyu Xin
With rapidly evolving internet technologies and emerging tools, sports-related videos generated online are increasing at an unprecedentedly fast pace.
no code implementations • 14 Apr 2021 • Jie Huang, Rongshun Juan, Randy Gomez, Keisuke Nakamura, Qixin Sha, Bo He, Guangliang Li
Deep reinforcement learning (DRL) has achieved great successes in many simulated tasks.
no code implementations • 15 Dec 2020 • Bo He, Xitong Yang, Zuxuan Wu, Hao Chen, Ser-Nam Lim, Abhinav Shrivastava
To this end, we introduce Global Temporal Attention (GTA), which performs global temporal attention on top of spatial attention in a decoupled manner.
no code implementations • 10 Jan 2020 • Qilei Zhang, Jinying Lin, Qixin Sha, Bo He, Guangliang Li
In this paper, we propose a deep interactive reinforcement learning method for AUV path following that combines the advantages of deep reinforcement learning and interactive RL.
no code implementations • 18 Apr 2019 • Guangliang Li, Randy Gomez, Keisuke Nakamura, Jinying Lin, Qilei Zhang, Bo He
Our results show that learning from demonstration can allow a TAMER agent to learn a roughly optimal policy up to the deepest search and encourage the agent to explore along the optimal path.
no code implementations • 23 Sep 2014 • Bo Han, Bo He, Tingting Sun, Mengmeng Ma, Amaury Lendasse
By employing hierarchical feature selection, we can compress the scale and dimension of the global dictionary, which directly reduces the computational cost of the sparse representation in which our approach is strongly rooted.
no code implementations • 13 Aug 2014 • Yang Liu, Bo He, Diya Dong, Yue Shen, Tianhong Yan, Rui Nian, Amaury Lendasse
Second, an adaptive selective ensemble framework for online learning is designed to balance the robustness and complexity of the algorithm.
no code implementations • 9 Aug 2014 • Bo Han, Bo He, Rui Nian, Mengmeng Ma, Shujing Zhang, Minghui Li, Amaury Lendasse
Extreme learning machine (ELM), a neural network algorithm, has shown good performance, such as fast speed and simple structure, but weak robustness is an unavoidable defect of the original ELM for blended data.
no code implementations • 9 Aug 2014 • Bo Han, Bo He, Mengmeng Ma, Tingting Sun, Tianhong Yan, Amaury Lendasse
It becomes a potential framework for solving the robustness issue of ELM for high-dimensional blended data in the future.