1 code implementation • 29 Apr 2024 • Guillaume Astruc, Nicolas Dufour, Ioannis Siglidis, Constantin Aronssohn, Nacim Bouia, Stephanie Fu, Romain Loiseau, Van Nguyen Nguyen, Charles Raude, Elliot Vincent, Lintao XU, HongYu Zhou, Loic Landrieu
Determining the location of an image anywhere on Earth is a complex visual task, which makes it particularly relevant for evaluating computer vision algorithms.
no code implementations • 19 Mar 2024 • HongYu Zhou, Jiahao Shao, Lu Xu, Dongfeng Bai, Weichao Qiu, Bingbing Liu, Yue Wang, Andreas Geiger, Yiyi Liao
Holistic understanding of urban scenes based on RGB images is a challenging yet important problem.
no code implementations • 28 Sep 2023 • HongYu Zhou, Yichen Song, Vasileios Tzoumas
We study how to safely control nonlinear control-affine systems that are corrupted with bounded non-stochastic noise, i. e., noise that is unknown a priori and that is not necessarily governed by a stochastic model.
1 code implementation • 20 Sep 2023 • Runpei Dong, Chunrui Han, Yuang Peng, Zekun Qi, Zheng Ge, Jinrong Yang, Liang Zhao, Jianjian Sun, HongYu Zhou, Haoran Wei, Xiangwen Kong, Xiangyu Zhang, Kaisheng Ma, Li Yi
This paper presents DreamLLM, a learning framework that first achieves versatile Multimodal Large Language Models (MLLMs) empowered with frequently overlooked synergy between multimodal comprehension and creation.
Ranked #1 on Visual Question Answering on MMBench (GPT-3.5 score metric)
1 code implementation • 23 Aug 2023 • HongYu Zhou, Vasileios Tzoumas
We study the problem of \textit{safe control of linear dynamical systems corrupted with non-stochastic noise}, and provide an algorithm that guarantees (i) zero constraint violation of convex time-varying constraints, and (ii) bounded dynamic regret, \ie bounded suboptimality against an optimal clairvoyant controller that knows the future noise a priori.
no code implementations • 18 Jul 2023 • Liang Zhao, En Yu, Zheng Ge, Jinrong Yang, Haoran Wei, HongYu Zhou, Jianjian Sun, Yuang Peng, Runpei Dong, Chunrui Han, Xiangyu Zhang
Based on precise referring instruction, we propose ChatSpot, a unified end-to-end multimodal large language model that supports diverse forms of interactivity including mouse clicks, drag-and-drop, and drawing boxes, which provides a more flexible and seamless interactive experience.
no code implementations • 30 Jun 2023 • Weixin Mao, Jinrong Yang, Zheng Ge, Lin Song, HongYu Zhou, Tiezheng Mao, Zeming Li, Osamu Yoshie
In light of the success of sample mining techniques in 2D object detection, we propose a simple yet effective mining strategy for improving depth perception in 3D object detection.
no code implementations • 10 Mar 2023 • Chunrui Han, Jinrong Yang, Jianjian Sun, Zheng Ge, Runpei Dong, HongYu Zhou, Weixin Mao, Yuang Peng, Xiangyu Zhang
In this paper, we explore an embarrassingly simple long-term recurrent fusion strategy built upon the LSS-based methods and find it already able to enjoy the merits from both sides, i. e., rich long-term information and efficient fusion pipeline.
2 code implementations • 9 Feb 2023 • HongYu Zhou, Xin Zhou, Zhiwei Zeng, Lingzi Zhang, Zhiqi Shen
Recommendation systems have become popular and effective tools to help users discover their interesting items by modeling the user preference and item property based on implicit interactions (e. g., purchasing and clicking).
1 code implementation • 28 Jan 2023 • HongYu Zhou, Xin Zhou, Lingzi Zhang, Zhiqi Shen
On top of the finding, we propose a model that enhances the dyadic relations by learning Dual RepresentAtions of both users and items via constructing homogeneous Graphs for multimOdal recommeNdation.
no code implementations • 2 Jan 2023 • HongYu Zhou, Zirui Xu, Vasileios Tzoumas
In this paper, we enable projection-free online learning within the framework of Online Convex Optimization with Memory (OCO-M) -- OCO-M captures how the history of decisions affects the current outcome by allowing the online learning loss functions to depend on both current and past decisions.
2 code implementations • ICCV 2023 • HongYu Zhou, Zheng Ge, Zeming Li, Xiangyu Zhang
This paper proposes an efficient multi-camera to Bird's-Eye-View (BEV) view transformation method for 3D perception, dubbed MatrixVT.
Ranked #2 on Bird's-Eye View Semantic Segmentation on nuScenes (IoU lane - 224x480 - 100x100 at 0.5 metric)
no code implementations • 26 Sep 2022 • Zirui Xu, HongYu Zhou, Vasileios Tzoumas
We are motivated by the future of autonomy that involves multiple robots coordinating in dynamic, unstructured, and adversarial environments to complete complex tasks such as target tracking, environmental mapping, and area monitoring.
no code implementations • 19 Aug 2022 • HongYu Zhou, Zheng Ge, Weixin Mao, Zeming Li
To address this problem, we revisit the generation of BEV representation and propose detecting objects in perspective BEV -- a new BEV representation that does not require feature sampling.
no code implementations • 18 Aug 2022 • HongYu Zhou, Vasileios Tzoumas
We present safe control of partially-observed linear time-varying systems in the presence of unknown and unpredictable process and measurement noise.
2 code implementations • 13 Jul 2022 • Xin Zhou, HongYu Zhou, Yong liu, Zhiwei Zeng, Chunyan Miao, Pengwei Wang, Yuan You, Feijun Jiang
Besides the user-item interaction graph, existing state-of-the-art methods usually use auxiliary graphs (e. g., user-user or item-item relation graph) to augment the learned representations of users and/or items.
2 code implementations • 6 Jul 2022 • HongYu Zhou, Zheng Ge, Songtao Liu, Weixin Mao, Zeming Li, Haiyan Yu, Jian Sun
To date, the most powerful semi-supervised object detectors (SS-OD) are based on pseudo-boxes, which need a sequence of post-processing with fine-tuned hyper-parameters.
no code implementations • 14 Oct 2021 • Shijie Liu, HongYu Zhou, Xiaozhou Shi, Junwen Pan
In recent years, as the Transformer has performed increasingly well on NLP tasks, many researchers have ported the Transformer structure to vision tasks , bridging the gap between NLP and CV tasks.
no code implementations • 29 Sep 2021 • Rui Zhou, HongYu Zhou, Huidong Gao, Masayoshi Tomizuka, Jiachen Li, Zhuo Xu
Accurate, long-term forecasting of pedestrian trajectories in highly dynamic and interactive scenes is a long-standing challenge.
no code implementations • 19 Jul 2021 • Cong Xie, Shilei Cao, Dong Wei, HongYu Zhou, Kai Ma, Xianli Zhang, Buyue Qian, Liansheng Wang, Yefeng Zheng
Universal lesion detection in computed tomography (CT) images is an important yet challenging task due to the large variations in lesion type, size, shape, and appearance.
no code implementations • 4 Jun 2021 • HongYu Zhou, Zhengru Ren, Mathias Marley, Roger Skjetne
Autonomous marine vessels are expected to avoid inter-vessel collisions and comply with the international regulations for safe voyages.
no code implementations • 12 May 2021 • Hongxiang Fan, Martin Ferianc, Miguel Rodrigues, HongYu Zhou, Xinyu Niu, Wayne Luk
Neural networks (NNs) have demonstrated their potential in a wide range of applications such as image recognition, decision making or recommendation systems.