Abstract
Adaptive critic (AC) methods have common roots as generalisations of dynamic programming for neural reinforcement learning approaches. Since they approximate the dynamic programming solutions, they are potentially suitable for learning in noisy, non-linear and non-stationary environments. In this study, a novel probabilistic dual heuristic programming (DHP)-based AC controller is proposed. Distinct to current approaches, the proposed probabilistic (DHP) AC method takes uncertainties of forward model and inverse controller into consideration. Therefore, it is suitable for deterministic and stochastic control problems characterised by functional uncertainty. Theoretical development of the proposed method is validated by analytically evaluating the correct value of the cost function which satisfies the Bellman equation in a linear quadratic control problem. The target value of the probabilistic critic network is then calculated and shown to be equal to the analytically derived correct value. Full derivation of the Riccati solution for this non-standard stochastic linear quadratic control problem is also provided. Moreover, the performance of the proposed probabilistic controller is demonstrated on linear and non-linear control examples.
Original language | English |
---|---|
Pages (from-to) | 227-239 |
Number of pages | 13 |
Journal | International Journal of Systems Science |
Volume | 41 |
Issue number | 2 |
Early online date | 12 Nov 2009 |
DOIs | |
Publication status | Published - 1 Feb 2010 |
Keywords
- Adaptive critic
- Dual heuristic programming
- Functional uncertainty