no code implementations • 19 Oct 2023 • Mengdi Xu, Peide Huang, Wenhao Yu, Shiqi Liu, Xilun Zhang, Yaru Niu, Tingnan Zhang, Fei Xia, Jie Tan, Ding Zhao
This paper investigates the feasibility of imbuing robots with the ability to creatively use tools in tasks that involve implicit physical constraints and long-term planning.
no code implementations • 17 Apr 2023 • Mengdi Xu, Yuchen Lu, Yikang Shen, Shun Zhang, Ding Zhao, Chuang Gan
To address this challenge, we propose a new framework, called Hyper-Decision Transformer (HDT), that can generalize to novel tasks from a handful of demonstrations in a data- and parameter-efficient manner.
no code implementations • 21 Jan 2023 • JieLin Qiu, William Han, Jiacheng Zhu, Mengdi Xu, Michael Rosenberg, Emerson Liu, Douglas Weber, Ding Zhao
The learned embeddings are evaluated on two downstream tasks: (1) automatic ECG diagnosis report generation, and (2) zero-shot cardiovascular disease detection.
no code implementations • 21 Oct 2022 • Mengdi Xu, Peide Huang, Yaru Niu, Visak Kumar, JieLin Qiu, Chao Fang, Kuan-Hui Lee, Xuewei Qi, Henry Lam, Bo Li, Ding Zhao
One key challenge for multi-task reinforcement learning (RL) in practice is the absence of task indicators.
no code implementations • 21 Oct 2022 • Shiqi Liu, Mengdi Xu, Peide Huang, Yongkang Liu, Kentaro Oguchi, Ding Zhao
Continual reinforcement learning aims to sequentially learn a variety of tasks, retaining the ability to perform previously encountered tasks while simultaneously developing new policies for novel tasks.
1 code implementation • 18 Oct 2022 • Peide Huang, Mengdi Xu, Jiacheng Zhu, Laixi Shi, Fei Fang, Ding Zhao
Curriculum Reinforcement Learning (CRL) aims to create a sequence of tasks, starting from easy ones and gradually progressing toward difficult ones.
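As background for the snippet above, a minimal easy-to-hard curriculum loop can be sketched as follows. This is a toy illustration only: `train_on_task`, the scalar "policy", and the hard-coded difficulty ordering are stand-ins, not the paper's CRL method (which concerns how the task sequence itself is generated).

```python
# Toy easy-to-hard curriculum (illustrative sketch, not the paper's algorithm).

def train_on_task(policy, difficulty, steps):
    """Placeholder trainer: nudges a scalar policy toward the task's target."""
    for _ in range(steps):
        policy += 0.1 * (difficulty - policy)  # toy gradient step
    return policy

def curriculum_train(task_difficulties, steps_per_task=50):
    """Warm-start each stage from the previous one, ordering tasks easy -> hard."""
    policy = 0.0
    for d in sorted(task_difficulties):
        policy = train_on_task(policy, d, steps_per_task)
    return policy

final_policy = curriculum_train([0.9, 0.2, 0.5])
```

The warm-starting across stages is the point of the sketch: each harder task begins from the policy adapted to the previous, easier one.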
no code implementations • 10 Oct 2022 • JieLin Qiu, Jiacheng Zhu, Mengdi Xu, Franck Dernoncourt, Trung Bui, Zhaowen Wang, Bo Li, Ding Zhao, Hailin Jin
Multimedia summarization with multimodal output (MSMO) is a recently explored application in language grounding.
no code implementations • 16 Sep 2022 • Mengdi Xu, Zuxin Liu, Peide Huang, Wenhao Ding, Zhepeng Cen, Bo Li, Ding Zhao
A trustworthy reinforcement learning algorithm should be competent in solving challenging real-world problems, including robustly handling uncertainties, satisfying safety constraints to avoid catastrophic failures, and generalizing to unseen scenarios during deployment.
1 code implementation • 10 Aug 2022 • William Han, JieLin Qiu, Jiacheng Zhu, Mengdi Xu, Douglas Weber, Bo Li, Ding Zhao
In addition, we provide interpretations of the performance improvement: (1) feature distribution shows the effectiveness of the alignment module for discovering and encoding the relationship between EEG and language; (2) alignment weights show the influence of different language semantics as well as EEG frequency features; (3) brain topographical maps provide an intuitive demonstration of the connectivity in the brain regions.
no code implementations • 27 Jun 2022 • Mengdi Xu, Yikang Shen, Shun Zhang, Yuchen Lu, Ding Zhao, Joshua B. Tenenbaum, Chuang Gan
Humans can leverage prior experience and learn novel tasks from a handful of demonstrations.
no code implementations • 7 Apr 2022 • JieLin Qiu, Jiacheng Zhu, Mengdi Xu, Franck Dernoncourt, Trung Bui, Zhaowen Wang, Bo Li, Ding Zhao, Hailin Jin
Multimedia summarization with multimodal output can play an essential role in real-world applications, e.g., automatically generating cover images and titles for news articles or providing introductions to online videos.
no code implementations • 19 Feb 2022 • Peide Huang, Mengdi Xu, Fei Fang, Ding Zhao
In this paper, we introduce a novel hierarchical formulation of robust RL, a general-sum Stackelberg game model called RRL-Stack, to formalize the sequential nature and provide extra flexibility for robust training.
no code implementations • 25 Jan 2022 • JieLin Qiu, Jiacheng Zhu, Mengdi Xu, Peide Huang, Michael Rosenberg, Douglas Weber, Emerson Liu, Ding Zhao
In this paper, we focus on a new method of data augmentation to solve the data imbalance problem within imbalanced ECG datasets to improve the robustness and accuracy of heart disease detection.
1 code implementation • 19 Jun 2021 • Mengdi Xu, Peide Huang, Fengpei Li, Jiacheng Zhu, Xuewei Qi, Kentaro Oguchi, Zhiyuan Huang, Henry Lam, Ding Zhao
Evaluating rare but high-stakes events is one of the main challenges in obtaining reliable reinforcement learning policies, especially in large or infinite state/action spaces, where naive testing would require a prohibitively large number of iterations to observe such events at all.
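The rare-event evaluation challenge described above is commonly attacked with importance sampling: draw samples from a proposal distribution concentrated on the failure region and reweight them by the likelihood ratio. The sketch below shows the standard textbook technique on a Gaussian toy model; the function name and setup are illustrative assumptions, not this paper's estimator.

```python
# Importance-sampling sketch for rare-event probability estimation
# (standard technique, illustrative toy model only).
import math
import random

def rare_event_prob_is(threshold, n=100_000, seed=0):
    """Estimate P(Z > threshold) for Z ~ N(0, 1) by sampling from a
    proposal N(threshold, 1) centered on the rare region, reweighting
    each hit by the likelihood ratio N(0,1)(x) / N(threshold,1)(x)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(threshold, 1.0)
        if x > threshold:
            total += math.exp(-x * x / 2 + (x - threshold) ** 2 / 2)
    return total / n

estimate = rare_event_prob_is(3.0)  # exact tail P(Z > 3) is about 1.35e-3
```

With the proposal centered at the threshold, roughly half the draws land in the failure region, so the estimator needs far fewer samples than naive Monte Carlo for the same accuracy.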
1 code implementation • 7 Feb 2021 • Jiacheng Zhu, Aritra Guha, Dat Do, Mengdi Xu, XuanLong Nguyen, Ding Zhao
We introduce a formulation of optimal transport problem for distributions on function spaces, where the stochastic map between functional domains can be partially represented in terms of an (infinite-dimensional) Hilbert-Schmidt operator mapping a Hilbert space of functions to another.
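For orientation, the classical 2-Wasserstein problem that this functional formulation generalizes can be written as follows (conventional notation, not necessarily the paper's):

```latex
% Kantorovich form over couplings \Pi(\mu,\nu) of measures on a Hilbert space H
W_2^2(\mu, \nu) = \inf_{\pi \in \Pi(\mu, \nu)} \int_{H \times H} \lVert f - g \rVert_H^2 \, d\pi(f, g)
% Monge form: seek a transport map T -- here constrained, as in the abstract,
% to a Hilbert-Schmidt operator -- with T_{\#}\mu = \nu, minimizing
% \int_H \lVert f - T f \rVert_H^2 \, d\mu(f)
```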
no code implementations • 2 Jan 2021 • Baiming Chen, Zuxin Liu, Jiacheng Zhu, Mengdi Xu, Wenhao Ding, Ding Zhao
The algorithm is evaluated in realistic safety-critical environments with non-stationary disturbances.
1 code implementation • NeurIPS 2020 • Mengdi Xu, Wenhao Ding, Jiacheng Zhu, Zuxin Liu, Baiming Chen, Ding Zhao
We propose a transition prior to account for the temporal dependencies in streaming data and update the mixture online via sequential variational inference.
1 code implementation • 11 May 2020 • Baiming Chen, Mengdi Xu, Liang Li, Ding Zhao
Action delays degrade the performance of reinforcement learning in many real-world systems.
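A standard way to handle such delays is to augment the observation with the queue of actions already chosen but not yet executed, which restores the Markov property. The wrapper below is a hedged sketch of that idea; `ToyEnv`, `DelayedActionEnv`, and their interfaces are illustrative assumptions, not the paper's delay-aware algorithm.

```python
# Sketch of action delay via an augmented observation (illustrative only).
from collections import deque

class ToyEnv:
    """Toy 1-D environment: actions shift a position; reward penalizes distance."""
    def __init__(self):
        self.pos = 0
    def step(self, action):
        self.pos += action
        return self.pos, -abs(self.pos)

class DelayedActionEnv:
    """Executes each action `delay` steps after it is chosen, and exposes the
    pending-action queue as part of the observation."""
    def __init__(self, env, delay, noop_action=0):
        self.env = env
        self.pending = deque([noop_action] * delay)
    def step(self, action):
        executed = self.pending.popleft()   # oldest queued action fires now
        self.pending.append(action)         # newly chosen action joins the queue
        obs, reward = self.env.step(executed)
        # Augmented observation: raw obs plus the pending actions.
        return (obs, tuple(self.pending)), reward

env = DelayedActionEnv(ToyEnv(), delay=2)
(obs, queued), r = env.step(1)  # a stored noop executes; the chosen action waits
```

Conditioning the policy on `(obs, queued)` rather than `obs` alone is what makes the delayed process Markov again.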
1 code implementation • 11 May 2020 • Baiming Chen, Mengdi Xu, Zuxin Liu, Liang Li, Ding Zhao
We also test the proposed algorithm in traffic scenarios that require coordination of all autonomous vehicles to show the practical value of delay-awareness.
1 code implementation • 17 Sep 2019 • Wenhao Ding, Mengdi Xu, Ding Zhao
However, most of the data is collected in safe scenarios, leading to duplicated trajectories that are easily handled by currently developed algorithms.