Search Results for author: Bo He

Found 20 papers, 8 papers with code

MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

1 code implementation • 8 Apr 2024 • Bo He, Hengduo Li, Young Kyun Jang, Menglin Jia, Xuefei Cao, Ashish Shah, Abhinav Shrivastava, Ser-Nam Lim

However, existing LLM-based large multimodal models (e.g., Video-LLaMA, VideoChat) can only take in a limited number of frames for short video understanding.

Ranked #1 on Video Classification on COIN

Question Answering • Video Captioning • +4

OmniVid: A Generative Framework for Universal Video Understanding

1 code implementation • 26 Mar 2024 • Junke Wang, Dongdong Chen, Chong Luo, Bo He, Lu Yuan, Zuxuan Wu, Yu-Gang Jiang

The core of video understanding tasks, such as recognition, captioning, and tracking, is to automatically detect objects or actions in a video and analyze their temporal evolution.

Action Recognition • Decoder • +5

To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning

2 code implementations • 13 Nov 2023 • Junke Wang, Lingchen Meng, Zejia Weng, Bo He, Zuxuan Wu, Yu-Gang Jiang

Existing visual instruction tuning methods typically prompt large language models with textual descriptions to generate instruction-following data.

Ranked #35 on Visual Question Answering on MM-Vet

Instruction Following • Visual Question Answering

Chop & Learn: Recognizing and Generating Object-State Compositions

no code implementations • ICCV 2023 • Nirat Saini, Hanyu Wang, Archana Swaminathan, Vinoj Jayasundara, Bo He, Kamal Gupta, Abhinav Shrivastava

Recognizing and generating object-state compositions has been a challenging task, especially when generalizing to unseen compositions.

Action Recognition • Image Generation • +1

Towards Scalable Neural Representation for Diverse Videos

no code implementations • CVPR 2023 • Bo He, Xitong Yang, Hanyu Wang, Zuxuan Wu, Hao Chen, Shuaiyi Huang, Yixuan Ren, Ser-Nam Lim, Abhinav Shrivastava

Implicit neural representations (INR) have gained increasing attention in representing 3D scenes and images, and have been recently applied to encode videos (e.g., NeRV, E-NeRV).

Action Recognition • Video Compression

Align and Attend: Multimodal Summarization with Dual Contrastive Losses

1 code implementation • CVPR 2023 • Bo He, Jun Wang, JieLin Qiu, Trung Bui, Abhinav Shrivastava, Zhaowen Wang

The goal of multimodal summarization is to extract the most important information from different modalities to form output summaries.

Ranked #3 on Supervised Video Summarization on SumMe

Extractive Text Summarization • Supervised Video Summarization

CNeRV: Content-adaptive Neural Representation for Visual Data

no code implementations • 18 Nov 2022 • Hao Chen, Matt Gwilliam, Bo He, Ser-Nam Lim, Abhinav Shrivastava

We match the performance of NeRV, a state-of-the-art implicit neural representation, on the reconstruction task for frames seen during training, while far surpassing it on frames that are skipped during training (unseen images).

Data Compression • Decoder

Learning Semantic Correspondence with Sparse Annotations

1 code implementation • 15 Aug 2022 • Shuaiyi Huang, Luyu Yang, Bo He, Songyang Zhang, Xuming He, Abhinav Shrivastava

In this paper, we aim to address the challenge of label sparsity in semantic correspondence by enriching supervision signals from sparse keypoint annotations.

Denoising • Semantic correspondence

ColdGuess: A General and Effective Relational Graph Convolutional Network to Tackle Cold Start Cases

no code implementations • 24 May 2022 • Bo He, Xiang Song, Vincent Gao, Christos Faloutsos

It outperforms the lightgbm2 by up to 34 pcp ROC-AUC in a cold start case when a new seller sells a new product.

ASM-Loc: Action-aware Segment Modeling for Weakly-Supervised Temporal Action Localization

1 code implementation • CVPR 2022 • Bo He, Xitong Yang, Le Kang, Zhiyu Cheng, Xin Zhou, Abhinav Shrivastava

Without the boundary information of action segments, existing methods mostly rely on multiple instance learning (MIL), where the predictions of unlabeled instances (i.e., video snippets) are supervised by classifying labeled bags (i.e., untrimmed videos).

Ranked #5 on Weakly Supervised Action Localization on ActivityNet-1.3

Weakly Supervised Temporal Action Localization

NeRV: Neural Representations for Videos

3 code implementations • NeurIPS 2021 • Hao Chen, Bo He, Hanyu Wang, Yixuan Ren, Ser-Nam Lim, Abhinav Shrivastava

In contrast, with NeRV, we can use any neural network compression method as a proxy for video compression, and achieve comparable performance to traditional frame-based video compression approaches (H.264, HEVC, etc.).

Ranked #6 on Video Reconstruction on UVG

Denoising • Neural Network Compression • +3

Feature Combination Meets Attention: Baidu Soccer Embeddings and Transformer based Temporal Detection

2 code implementations • 28 Jun 2021 • Xin Zhou, Le Kang, Zhiyu Cheng, Bo He, Jingyu Xin

With rapidly evolving internet technologies and emerging tools, sports-related videos generated online are increasing at an unprecedentedly fast pace.

Action Recognition • Action Spotting • +3

GAN-Based Interactive Reinforcement Learning from Demonstration and Human Evaluative Feedback

no code implementations • 14 Apr 2021 • Jie Huang, Rongshun Juan, Randy Gomez, Keisuke Nakamura, Qixin Sha, Bo He, Guangliang Li

Deep reinforcement learning (DRL) has achieved great success in many simulated tasks.

Imitation Learning • reinforcement-learning • +1

GTA: Global Temporal Attention for Video Action Understanding

no code implementations • 15 Dec 2020 • Bo He, Xitong Yang, Zuxuan Wu, Hao Chen, Ser-Nam Lim, Abhinav Shrivastava

To this end, we introduce Global Temporal Attention (GTA), which performs global temporal attention on top of spatial attention in a decoupled manner.

Action Recognition • Action Understanding • +1

Deep Interactive Reinforcement Learning for Path Following of Autonomous Underwater Vehicle

no code implementations • 10 Jan 2020 • Qilei Zhang, Jinying Lin, Qixin Sha, Bo He, Guangliang Li

In this paper, we propose a deep interactive reinforcement learning method for path following of an AUV by combining the advantages of deep reinforcement learning and interactive RL.

reinforcement-learning • Reinforcement Learning (RL)

Improving Interactive Reinforcement Agent Planning with Human Demonstration

no code implementations • 18 Apr 2019 • Guangliang Li, Randy Gomez, Keisuke Nakamura, Jinying Lin, Qilei Zhang, Bo He

Our results show that learning from demonstration can allow a TAMER agent to learn a roughly optimal policy up to the deepest search and encourage the agent to explore along the optimal path.

reinforcement-learning • Reinforcement Learning (RL)

HSR: L1/2 Regularized Sparse Representation for Fast Face Recognition using Hierarchical Feature Selection

no code implementations • 23 Sep 2014 • Bo Han, Bo He, Tingting Sun, Mengmeng Ma, Amaury Lendasse

By employing hierarchical feature selection, we can compress the scale and dimension of the global dictionary, which directly reduces the computational cost of the sparse representation on which our approach is based.

Face Recognition • feature selection • +1

Robust OS-ELM with a novel selective ensemble based on particle swarm optimization

no code implementations • 13 Aug 2014 • Yang Liu, Bo He, Diya Dong, Yue Shen, Tianhong Yan, Rui Nian, Amaury Lendasse

Second, an adaptive selective ensemble framework for online learning is designed to balance the robustness and complexity of the algorithm.

General Classification

LARSEN-ELM: Selective Ensemble of Extreme Learning Machines using LARS for Blended Data

no code implementations • 9 Aug 2014 • Bo Han, Bo He, Rui Nian, Mengmeng Ma, Shujing Zhang, Minghui Li, Amaury Lendasse

Extreme learning machine (ELM), a neural network algorithm, has shown good performance, such as fast speed and simple structure, but weak robustness is an unavoidable defect of the original ELM for blended data.

RMSE-ELM: Recursive Model based Selective Ensemble of Extreme Learning Machines for Robustness Improvement

no code implementations • 9 Aug 2014 • Bo Han, Bo He, Mengmeng Ma, Tingting Sun, Tianhong Yan, Amaury Lendasse

It becomes a potential framework for solving the robustness issue of ELM for high-dimensional blended data in the future.
