no code implementations • 6 Feb 2024 • Yuting Tang, Xin-Qiang Cai, Yao-Xiang Ding, Qiyu Wu, Guoqing Liu, Masashi Sugiyama
In Reinforcement Learning (RL), it is commonly assumed that an immediate reward signal is generated for each action taken by the agent, which the agent uses to maximize the cumulative reward and obtain the optimal policy.
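For concreteness, here is a minimal Python sketch of the standard per-step reward assumption this entry refers to, in which every action yields an immediate reward that is accumulated into a discounted return; the reward values and discount factor are illustrative, not taken from the paper:

```python
# Standard RL assumption: a reward arrives after every action, and the agent
# maximizes the discounted cumulative return G_0 = sum_t gamma^t * r_t.
def discounted_return(rewards, gamma=0.99):
    """Compute the discounted return of a reward sequence, working backwards."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# Illustrative per-step rewards: zero until a final success signal.
print(discounted_return([0.0, 0.0, 1.0]))  # 0.9801 with gamma=0.99
```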
1 code implementation • 26 Oct 2023 • Yifei Peng, Yu Jin, Zhexu Luo, Yao-Xiang Ding, Wang-Zhou Dai, Zhong Ren, Kun Zhou
Among the core challenges are two levels of symbol grounding problems: the first is symbol assignment, i.e., mapping latent factors of neural visual generators to semantically meaningful symbolic factors of the reasoning system by learning from limited labeled data.
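As a hedged illustration of symbol assignment (not the paper's algorithm), the sketch below fits a simple least-squares probe that maps a generator's latent codes to discrete symbols from a small labeled set; the latents, labels, and dimensions are all made up:

```python
import numpy as np

rng = np.random.default_rng(0)
latents = rng.normal(size=(20, 8))      # 20 labeled latent codes, 8 dims each
symbols = rng.integers(0, 3, size=20)   # ground-truth symbolic factor (3 classes)

# Least-squares probe: map latent dimensions to scores over the 3 symbols.
targets = np.eye(3)[symbols]            # one-hot symbol targets
w, *_ = np.linalg.lstsq(latents, targets, rcond=None)
predicted = np.argmax(latents @ w, axis=1)
print("train accuracy:", (predicted == symbols).mean())
```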
no code implementations • 17 Jun 2021 • Xin-Qiang Cai, Yao-Xiang Ding, Zi-Xuan Chen, Yuan Jiang, Masashi Sugiyama, Zhi-Hua Zhou
In many real-world imitation learning tasks, the demonstrator and the learner have to act under different observation spaces.
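A minimal sketch of that mismatch (assumed shapes, not the paper's setting): the demonstrator records full states, while the learner only observes a subset of the dimensions, so demonstrations cannot be imitated directly:

```python
import numpy as np

full_state = np.arange(6, dtype=float)   # demonstrator's 6-dim observation
visible = np.array([0, 2, 4])            # learner sees only these indices
learner_obs = full_state[visible]        # learner's 3-dim observation

print(full_state.shape, "->", learner_obs.shape)  # (6,) -> (3,)
```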
no code implementations • 9 Sep 2019 • Xin-Qiang Cai, Yao-Xiang Ding, Yuan Jiang, Zhi-Hua Zhou
One of the key issues in imitation learning is making the policy learned from limited samples generalize well over the whole state-action space.
no code implementations • NeurIPS 2018 • Yao-Xiang Ding, Zhi-Hua Zhou
In many real-world learning tasks, it is hard to directly optimize the true performance measures, while choosing the right surrogate objectives is also difficult.
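A hedged example of that gap (illustrative numbers, not the paper's setting): the true measure below is the non-differentiable 0-1 error, so training typically optimizes a smooth surrogate such as the logistic loss, and the two need not agree:

```python
import numpy as np

scores = np.array([2.0, -0.5, 1.2])   # model scores for 3 examples
labels = np.array([1, 1, -1])         # true labels in {-1, +1}

zero_one = (np.sign(scores) != labels).mean()          # true performance measure
logistic = np.log1p(np.exp(-labels * scores)).mean()   # differentiable surrogate
print(f"0-1 error: {zero_one:.3f}, logistic surrogate: {logistic:.3f}")
```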
no code implementations • 1 Sep 2016 • Yao-Xiang Ding, Zhi-Hua Zhou
One of the fundamental problems in crowdsourcing is the trade-off between the number of workers needed for high-accuracy aggregation and the budget available to pay them.
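A minimal simulation of that trade-off (an assumed setting, not the paper's protocol): each worker answers a binary question correctly with some fixed probability, and answers are aggregated by majority vote; adding workers raises accuracy but also the cost, since each worker must be paid:

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.65                                          # per-worker accuracy (assumed)
for n_workers in (1, 5, 15, 45):                  # odd counts avoid tied votes
    votes = rng.random((10_000, n_workers)) < p   # True = correct vote
    majority_correct = votes.sum(axis=1) > n_workers / 2
    print(f"{n_workers:>2} workers -> accuracy {majority_correct.mean():.3f}")
```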