Probabilistic dual heuristic programming-based adaptive critic

Randa Herzallah

doi:10.1080/00207720903045767

Randa Herzallah*

*Corresponding author for this work

College of Engineering and Physical Sciences

Research output: Contribution to journal › Article › peer-review

Abstract

Adaptive critic (AC) methods have common roots as generalisations of dynamic programming for neural reinforcement learning approaches. Since they approximate the dynamic programming solutions, they are potentially suitable for learning in noisy, non-linear and non-stationary environments. In this study, a novel probabilistic dual heuristic programming (DHP)-based AC controller is proposed. Distinct to current approaches, the proposed probabilistic (DHP) AC method takes uncertainties of forward model and inverse controller into consideration. Therefore, it is suitable for deterministic and stochastic control problems characterised by functional uncertainty. Theoretical development of the proposed method is validated by analytically evaluating the correct value of the cost function which satisfies the Bellman equation in a linear quadratic control problem. The target value of the probabilistic critic network is then calculated and shown to be equal to the analytically derived correct value. Full derivation of the Riccati solution for this non-standard stochastic linear quadratic control problem is also provided. Moreover, the performance of the proposed probabilistic controller is demonstrated on linear and non-linear control examples.
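The validation step described in the abstract (checking that a critic target matches the analytically derived value in a linear quadratic problem) can be illustrated with a minimal sketch. It assumes the standard deterministic scalar problem x[k+1] = a*x[k] + b*u[k] with stage cost q*x^2 + r*u^2, which is not the non-standard stochastic problem treated in the article. The function names and the numerical values are hypothetical and chosen only for illustration.

```python
import math


def solve_scalar_riccati(a, b, q, r, iters=500):
    """Iterate the scalar discrete-time Riccati recursion towards its fixed point p."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return p


def optimal_gain(a, b, r, p):
    """Feedback gain k of the optimal control law u = -k * x."""
    return a * b * p / (r + b * b * p)


def bellman_residual(a, b, q, r, p, x):
    """V(x) - [stage cost + V(x_next)] for V(x) = p * x^2 under u = -k * x."""
    k = optimal_gain(a, b, r, p)
    u = -k * x
    x_next = a * x + b * u
    return p * x * x - (q * x * x + r * u * u + p * x_next * x_next)


def dhp_critic_target(a, b, q, r, p, x):
    """DHP-style target: derivative of the Bellman right-hand side with respect to x.

    A DHP critic approximates lambda(x) = dV/dx, which equals 2 * p * x here.
    """
    k = optimal_gain(a, b, r, p)
    u = -k * x
    closed_loop = a - b * k              # dx_next/dx along the closed loop
    lam_next = 2.0 * p * closed_loop * x  # dV/dx evaluated at x_next
    # Stage-cost derivative (direct and through u = -k * x) plus propagated costate.
    return 2.0 * q * x + 2.0 * r * u * (-k) + lam_next * closed_loop


# Illustrative values only (an assumption, not taken from the article).
a, b, q, r = 1.0, 1.0, 1.0, 1.0
p = solve_scalar_riccati(a, b, q, r)
x = 0.7
analytic_costate = 2.0 * p * x
target = dhp_critic_target(a, b, q, r, p, x)
residual = bellman_residual(a, b, q, r, p, x)
```

In this deterministic sketch the critic target coincides with the analytic derivative 2*p*x once p satisfies the Riccati equation, which mirrors, in a much simpler setting, the kind of consistency check the abstract reports.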

Original language	English
Pages (from-to)	227-239
Number of pages	13
Journal	International Journal of Systems Science
Volume	41
Issue number	2
Early online date	12 Nov 2009
DOIs	https://doi.org/10.1080/00207720903045767
Publication status	Published - 1 Feb 2010

Keywords

Adaptive critic
Dual heuristic programming
Functional uncertainty

Access to Document

10.1080/00207720903045767

Cite this

@article{5cedd13dc3a247eea0deb1a17c22db97,

title = "Probabilistic dual heuristic programming-based adaptive critic",

abstract = "Adaptive critic (AC) methods have common roots as generalisations of dynamic programming for neural reinforcement learning approaches. Since they approximate the dynamic programming solutions, they are potentially suitable for learning in noisy, non-linear and non-stationary environments. In this study, a novel probabilistic dual heuristic programming (DHP)-based AC controller is proposed. Distinct to current approaches, the proposed probabilistic (DHP) AC method takes uncertainties of forward model and inverse controller into consideration. Therefore, it is suitable for deterministic and stochastic control problems characterised by functional uncertainty. Theoretical development of the proposed method is validated by analytically evaluating the correct value of the cost function which satisfies the Bellman equation in a linear quadratic control problem. The target value of the probabilistic critic network is then calculated and shown to be equal to the analytically derived correct value. Full derivation of the Riccati solution for this non-standard stochastic linear quadratic control problem is also provided. Moreover, the performance of the proposed probabilistic controller is demonstrated on linear and non-linear control examples.",

keywords = "Adaptive critic, Dual heuristic programming, Functional uncertainty",

author = "Randa Herzallah",

year = "2010",

month = feb,

day = "1",

doi = "10.1080/00207720903045767",

language = "English",

volume = "41",

pages = "227--239",

journal = "International Journal of Systems Science",

issn = "0020-7721",

publisher = "Taylor \& Francis",

number = "2",

}

TY - JOUR

T1 - Probabilistic dual heuristic programming-based adaptive critic

AU - Herzallah, Randa

PY - 2010/2/1

Y1 - 2010/2/1

N2 - Adaptive critic (AC) methods have common roots as generalisations of dynamic programming for neural reinforcement learning approaches. Since they approximate the dynamic programming solutions, they are potentially suitable for learning in noisy, non-linear and non-stationary environments. In this study, a novel probabilistic dual heuristic programming (DHP)-based AC controller is proposed. Distinct to current approaches, the proposed probabilistic (DHP) AC method takes uncertainties of forward model and inverse controller into consideration. Therefore, it is suitable for deterministic and stochastic control problems characterised by functional uncertainty. Theoretical development of the proposed method is validated by analytically evaluating the correct value of the cost function which satisfies the Bellman equation in a linear quadratic control problem. The target value of the probabilistic critic network is then calculated and shown to be equal to the analytically derived correct value. Full derivation of the Riccati solution for this non-standard stochastic linear quadratic control problem is also provided. Moreover, the performance of the proposed probabilistic controller is demonstrated on linear and non-linear control examples.

KW - Adaptive critic

KW - Dual heuristic programming

KW - Functional uncertainty

UR - http://www.scopus.com/inward/record.url?scp=77951125117&partnerID=8YFLogxK

UR - https://www.tandfonline.com/doi/abs/10.1080/00207720903045767

U2 - 10.1080/00207720903045767

DO - 10.1080/00207720903045767

M3 - Article

AN - SCOPUS:77951125117

SN - 0020-7721

VL - 41

SP - 227

EP - 239

JO - International Journal of Systems Science

JF - International Journal of Systems Science

IS - 2

ER -
