1 code implementation • 23 Feb 2023 • Wenhao Li, Baoxiang Wang, Shanchao Yang, Hongyuan Zha
We propose Diverse Policy Optimization (DPO), a simple and effective RL method that models policies in structured action spaces as energy-based models (EBMs) within the probabilistic RL framework.
1 code implementation • 18 Oct 2021 • Shanchao Yang, Kaili Ma, Baoxiang Wang, Tianshu Yu, Hongyuan Zha
In this case, GNNs can barely learn useful information, which makes it prohibitively difficult to choose actions for successively rewiring edges in a reinforcement learning setting.
no code implementations • 3 Mar 2020 • Shanchao Yang, Jing Liu, Kai Wu, Mingming Li
In contrast, this paper addresses a novel problem, Time Series Conditioned Graph Generation: given an input multivariate time series, we aim to infer a target relation graph that models the underlying interrelationships among the series, with each node corresponding to one time series.