no code implementations • 12 Dec 2023 • Hongyue Fan, Jingjie Ni, Fangfei Li
We address three questions: 1) finding control policies that achieve reachability with maximum probability under a fixed and, in particular, a varying finite time horizon; 2) leveraging prior knowledge to solve question 1) with faster convergence when the time horizon varies; and 3) proposing an enhanced Q-learning (QL) method that efficiently addresses the aforementioned questions for large-scale probabilistic Boolean control networks (PBCNs).
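
To make question 1) concrete, the following is a minimal sketch of finite-horizon maximum-probability reachability on a toy model. It is not the enhanced QL method of the paper: it uses plain backward induction with a fully known transition model, which is only feasible for small networks. The 4-state, 2-input transition table `P`, the target set `TARGET`, and the function name `max_reach_prob` are all hypothetical, and "reachability" is assumed here to mean entering the target set within at most T steps.

```python
# Hypothetical toy model standing in for a 2-node PBCN (4 states, 1 binary input).
# P[u][s] is the probability distribution over next states when input u
# is applied in state s. These numbers are invented for illustration.
P = {
    0: [[0.5, 0.5, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0]],
    1: [[1.0, 0.0, 0.0, 0.0],
        [0.0, 0.5, 0.0, 0.5],
        [0.5, 0.0, 0.5, 0.0],
        [0.0, 0.0, 0.0, 1.0]],
}
TARGET = {3}  # assumed target set


def max_reach_prob(P, target, horizon):
    """Backward induction for the maximum probability of reaching `target`
    within `horizon` steps. Returns (value, policy): value[s] is that
    probability from state s, and policy[k][s] is the input to apply in
    state s at step k (None for states already in the target)."""
    n = len(next(iter(P.values())))
    # With zero steps left, only states already in the target count.
    value = [1.0 if s in target else 0.0 for s in range(n)]
    policy = []
    for _ in range(horizon):
        new_value, step_policy = [], []
        for s in range(n):
            if s in target:
                new_value.append(1.0)
                step_policy.append(None)
                continue
            # Expected continuation value for each input; keep the best.
            scores = [(sum(p * v for p, v in zip(P[u][s], value)), u)
                      for u in sorted(P)]
            best_v = max(v for v, _ in scores)
            best_u = next(u for v, u in scores if v == best_v)
            new_value.append(best_v)
            step_policy.append(best_u)
        value = new_value
        # Each pass adds one more remaining step, so it belongs at the front.
        policy.insert(0, step_policy)
    return value, policy


if __name__ == "__main__":
    for T in (1, 2, 3):
        print(T, max_reach_prob(P, TARGET, T)[0])
```

The returned policy is time-dependent, and the values change with T, which is why a varying horizon is harder than a fixed one. A model-free learner such as QL would estimate these quantities from sampled transitions instead of reading `P` directly.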