Search Results for author: HongYu Zhou

Found 22 papers, 8 papers with code

OpenStreetView-5M: The Many Roads to Global Visual Geolocation

1 code implementation • 29 Apr 2024 • Guillaume Astruc, Nicolas Dufour, Ioannis Siglidis, Constantin Aronssohn, Nacim Bouia, Stephanie Fu, Romain Loiseau, Van Nguyen Nguyen, Charles Raude, Elliot Vincent, Lintao XU, HongYu Zhou, Loic Landrieu

Determining the location of an image anywhere on Earth is a complex visual task, which makes it particularly relevant for evaluating computer vision algorithms.

Memorization

HUGS: Holistic Urban 3D Scene Understanding via Gaussian Splatting

no code implementations • 19 Mar 2024 • HongYu Zhou, Jiahao Shao, Lu Xu, Dongfeng Bai, Weichao Qiu, Bingbing Liu, Yue Wang, Andreas Geiger, Yiyi Liao

Holistic understanding of urban scenes based on RGB images is a challenging yet important problem.

Novel View Synthesis • Scene Understanding

Safe Non-Stochastic Control of Control-Affine Systems: An Online Convex Optimization Approach

no code implementations • 28 Sep 2023 • HongYu Zhou, Yichen Song, Vasileios Tzoumas

We study how to safely control nonlinear control-affine systems that are corrupted with bounded non-stochastic noise, i.e., noise that is unknown a priori and that is not necessarily governed by a stochastic model.

Collision Avoidance

DreamLLM: Synergistic Multimodal Comprehension and Creation

1 code implementation • 20 Sep 2023 • Runpei Dong, Chunrui Han, Yuang Peng, Zekun Qi, Zheng Ge, Jinrong Yang, Liang Zhao, Jianjian Sun, HongYu Zhou, Haoran Wei, Xiangwen Kong, Xiangyu Zhang, Kaisheng Ma, Li Yi

This paper presents DreamLLM, a learning framework that first achieves versatile Multimodal Large Language Models (MLLMs) empowered with frequently overlooked synergy between multimodal comprehension and creation.

Ranked #1 on Visual Question Answering on MMBench (GPT-3.5 score metric)

multimodal generation • Visual Question Answering • +2

313 stars

Safe Non-Stochastic Control of Linear Dynamical Systems

1 code implementation • 23 Aug 2023 • HongYu Zhou, Vasileios Tzoumas

We study the problem of safe control of linear dynamical systems corrupted with non-stochastic noise, and provide an algorithm that guarantees (i) zero constraint violation of convex time-varying constraints, and (ii) bounded dynamic regret, i.e., bounded suboptimality against an optimal clairvoyant controller that knows the future noise a priori.

Collision Avoidance

ChatSpot: Bootstrapping Multimodal LLMs via Precise Referring Instruction Tuning

no code implementations • 18 Jul 2023 • Liang Zhao, En Yu, Zheng Ge, Jinrong Yang, Haoran Wei, HongYu Zhou, Jianjian Sun, Yuang Peng, Runpei Dong, Chunrui Han, Xiangyu Zhang

Based on precise referring instruction, we propose ChatSpot, a unified end-to-end multimodal large language model that supports diverse forms of interactivity including mouse clicks, drag-and-drop, and drawing boxes, which provides a more flexible and seamless interactive experience.

Instruction Following • Language Modelling • +1

GMM: Delving into Gradient Aware and Model Perceive Depth Mining for Monocular 3D Detection

no code implementations • 30 Jun 2023 • Weixin Mao, Jinrong Yang, Zheng Ge, Lin Song, HongYu Zhou, Tiezheng Mao, Zeming Li, Osamu Yoshie

In light of the success of sample mining techniques in 2D object detection, we propose a simple yet effective mining strategy for improving depth perception in 3D object detection.

3D Object Detection • Depth Estimation • +3

Exploring Recurrent Long-term Temporal Fusion for Multi-view 3D Perception

no code implementations • 10 Mar 2023 • Chunrui Han, Jinrong Yang, Jianjian Sun, Zheng Ge, Runpei Dong, HongYu Zhou, Weixin Mao, Yuang Peng, Xiangyu Zhang

In this paper, we explore an embarrassingly simple long-term recurrent fusion strategy built upon the LSS-based methods and find it already able to enjoy the merits from both sides, i.e., rich long-term information and efficient fusion pipeline.

motion prediction • object-detection • +1

A Comprehensive Survey on Multimodal Recommender Systems: Taxonomy, Evaluation, and Future Directions

2 code implementations • 9 Feb 2023 • HongYu Zhou, Xin Zhou, Zhiwei Zeng, Lingzi Zhang, Zhiqi Shen

Recommendation systems have become popular and effective tools to help users discover items of interest by modeling user preferences and item properties based on implicit interactions (e.g., purchasing and clicking).

Multimodal Recommendation

260 stars

Enhancing Dyadic Relations with Homogeneous Graphs for Multimodal Recommendation

1 code implementation • 28 Jan 2023 • HongYu Zhou, Xin Zhou, Lingzi Zhang, Zhiqi Shen

On top of the finding, we propose a model that enhances the dyadic relations by learning Dual RepresentAtions of both users and items via constructing homogeneous Graphs for multimOdal recommeNdation.

Graph Learning • Multimodal Recommendation

Efficient Online Learning with Memory via Frank-Wolfe Optimization: Algorithms with Bounded Dynamic Regret and Applications to Control

no code implementations • 2 Jan 2023 • HongYu Zhou, Zirui Xu, Vasileios Tzoumas

In this paper, we enable projection-free online learning within the framework of Online Convex Optimization with Memory (OCO-M). OCO-M captures how the history of decisions affects the current outcome by allowing the online learning loss functions to depend on both current and past decisions.

Time Series • Time Series Prediction

MatrixVT: Efficient Multi-Camera to BEV Transformation for 3D Perception

2 code implementations • ICCV 2023 • HongYu Zhou, Zheng Ge, Zeming Li, Xiangyu Zhang

This paper proposes an efficient multi-camera to Bird's-Eye-View (BEV) view transformation method for 3D perception, dubbed MatrixVT.

Ranked #2 on Bird's-Eye View Semantic Segmentation on nuScenes (IoU lane - 224x480 - 100x100 at 0.5 metric)

Autonomous Driving • Bird's-Eye View Semantic Segmentation • +2

664 stars

Online Submodular Coordination with Bounded Tracking Regret: Theory, Algorithm, and Applications to Multi-Robot Coordination

no code implementations • 26 Sep 2022 • Zirui Xu, HongYu Zhou, Vasileios Tzoumas

We are motivated by the future of autonomy that involves multiple robots coordinating in dynamic, unstructured, and adversarial environments to complete complex tasks such as target tracking, environmental mapping, and area monitoring.

PersDet: Monocular 3D Detection in Perspective Bird's-Eye-View

no code implementations • 19 Aug 2022 • HongYu Zhou, Zheng Ge, Weixin Mao, Zeming Li

To address this problem, we revisit the generation of BEV representation and propose detecting objects in perspective BEV, a new BEV representation that does not require feature sampling.

Autonomous Driving • object-detection • +1

Safe Control of Partially-Observed Linear Time-Varying Systems with Minimal Worst-Case Dynamic Regret

no code implementations • 18 Aug 2022 • HongYu Zhou, Vasileios Tzoumas

We present safe control of partially-observed linear time-varying systems in the presence of unknown and unpredictable process and measurement noise.

Bootstrap Latent Representations for Multi-modal Recommendation

2 code implementations • 13 Jul 2022 • Xin Zhou, HongYu Zhou, Yong liu, Zhiwei Zeng, Chunyan Miao, Pengwei Wang, Yuan You, Feijun Jiang

Besides the user-item interaction graph, existing state-of-the-art methods usually use auxiliary graphs (e.g., user-user or item-item relation graph) to augment the learned representations of users and/or items.

260 stars

Dense Teacher: Dense Pseudo-Labels for Semi-supervised Object Detection

2 code implementations • 6 Jul 2022 • HongYu Zhou, Zheng Ge, Songtao Liu, Weixin Mao, Zeming Li, Haiyan Yu, Jian Sun

To date, the most powerful semi-supervised object detectors (SS-OD) are based on pseudo-boxes, which need a sequence of post-processing with fine-tuned hyper-parameters.

Ranked #4 on Semi-Supervised Object Detection on COCO 100% labeled data

object-detection • Object Detection • +2

12,116 stars

Transformer for Polyp Detection

no code implementations • 14 Oct 2021 • Shijie Liu, HongYu Zhou, Xiaozhou Shi, Junwen Pan

In recent years, as the Transformer has performed increasingly well on NLP tasks, many researchers have ported the Transformer structure to vision tasks, bridging the gap between NLP and CV tasks.

Grouptron: Dynamic Multi-Scale Graph Convolutional Networks for Group-Aware Dense Crowd Trajectory Forecasting

no code implementations • 29 Sep 2021 • Rui Zhou, HongYu Zhou, Huidong Gao, Masayoshi Tomizuka, Jiachen Li, Zhuo Xu

Accurate, long-term forecasting of pedestrian trajectories in highly dynamic and interactive scenes is a long-standing challenge.

Trajectory Forecasting

RECIST-Net: Lesion detection via grouping keypoints on RECIST-based annotation

no code implementations • 19 Jul 2021 • Cong Xie, Shilei Cao, Dong Wei, HongYu Zhou, Kai Ma, Xianli Zhang, Buyue Qian, Liansheng Wang, Yefeng Zheng

Universal lesion detection in computed tomography (CT) images is an important yet challenging task due to the large variations in lesion type, size, shape, and appearance.

Computed Tomography (CT) • Lesion Detection • +1

A Guidance and Maneuvering Control System Design with Anti-collision Using Stream Functions with Vortex Flows for Autonomous Marine Vessels

no code implementations • 4 Jun 2021 • HongYu Zhou, Zhengru Ren, Mathias Marley, Roger Skjetne

Autonomous marine vessels are expected to avoid inter-vessel collisions and comply with the international regulations for safe voyages.

Collision Avoidance

High-Performance FPGA-based Accelerator for Bayesian Neural Networks

no code implementations • 12 May 2021 • Hongxiang Fan, Martin Ferianc, Miguel Rodrigues, HongYu Zhou, Xinyu Niu, Wayne Luk

Neural networks (NNs) have demonstrated their potential in a wide range of applications such as image recognition, decision making or recommendation systems.

Autonomous Vehicles • Bayesian Inference • +3
