Probabilistic dual heuristic programming-based adaptive critic

Randa Herzallah*

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

Abstract

Adaptive critic (AC) methods have common roots as generalisations of dynamic programming for neural reinforcement learning approaches. Because they approximate dynamic programming solutions, they are potentially suitable for learning in noisy, non-linear and non-stationary environments. In this study, a novel probabilistic dual heuristic programming (DHP)-based AC controller is proposed. Unlike current approaches, the proposed probabilistic DHP-based AC method takes the uncertainties of the forward model and the inverse controller into consideration. It is therefore suitable for deterministic and stochastic control problems characterised by functional uncertainty. The theoretical development of the proposed method is validated by analytically evaluating the correct value of the cost function that satisfies the Bellman equation in a linear quadratic control problem. The target value of the probabilistic critic network is then calculated and shown to equal the analytically derived correct value. A full derivation of the Riccati solution for this non-standard stochastic linear quadratic control problem is also provided. Moreover, the performance of the proposed probabilistic controller is demonstrated on linear and non-linear control examples.
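The validation idea in the abstract can be illustrated on the standard deterministic, undiscounted linear quadratic problem, which is simpler than the non-standard stochastic problem treated in the article. The sketch below is not the article's derivation: the system matrices, the function names `solve_riccati` and `dhp_target`, and the test state are all hypothetical choices made for illustration. It iterates the discrete-time Riccati recursion to obtain the cost-to-go J(x) = x'Px, then checks that the usual DHP critic target (the derivative of the Bellman right-hand side with respect to the state) equals the analytical derivative 2Px.

```python
import numpy as np

def solve_riccati(A, B, Q, R, iters=2000):
    """Fixed-point iteration of the standard discrete-time Riccati recursion."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    # Recompute the gain so that it is consistent with the final P.
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K

def dhp_target(x, A, B, Q, R, P, K):
    """DHP critic target for the utility U = x'Qx + u'Ru with u = -Kx."""
    u = -K @ x
    x_next = A @ x + B @ u
    lam_next = 2.0 * P @ x_next                    # critic output at the next state
    dU_dx = 2.0 * Q @ x + (-K).T @ (2.0 * R @ u)   # total derivative of U w.r.t. x
    return dU_dx + (A - B @ K).T @ lam_next        # chain rule through the closed loop

# Hypothetical example system (a sampled double integrator); assumed, not from the article.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[0.1]])

P, K = solve_riccati(A, B, Q, R)
x = np.array([1.0, -0.5])
target = dhp_target(x, A, B, Q, R, P, K)
analytic = 2.0 * P @ x
print(np.allclose(target, analytic))
```

In this simplified setting the agreement follows from the Riccati equation itself, since P = Q + K'RK + (A - BK)'P(A - BK) at the fixed point. The article's contribution concerns the case where model and controller uncertainty enter the cost, which this sketch does not attempt to reproduce.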

Original language: English
Pages (from-to): 227-239
Number of pages: 13
Journal: International Journal of Systems Science
Volume: 41
Issue number: 2
Early online date: 12 Nov 2009
Publication status: Published - 1 Feb 2010

Keywords

  • Adaptive critic
  • Dual heuristic programming
  • Functional uncertainty
