Abstract
The problem of evaluating different learning rules and other statistical estimators is analysed. A new general theory of statistical inference is developed by combining Bayesian decision theory with information geometry. It is coherent and invariant. For each sample a unique ideal estimate exists and is given by an average over the posterior. An optimal estimate within a model is given by a projection of the ideal estimate. The ideal estimate is a sufficient statistic of the posterior, so practical learning rules are functions of the ideal estimator. If the sole purpose of learning is to extract information from the data, the learning rule must also approximate the ideal estimator. This framework is applicable to both Bayesian and non-Bayesian methods, with arbitrary statistical models, and to supervised, unsupervised and reinforcement learning schemes.
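As a toy illustration of the two central statements (the ideal estimate is a posterior average, and the optimal estimate within a model is its projection), the sketch below works through a coin-flip example. Everything specific in it is an assumption made for illustration and is not taken from the paper: the grid of candidate biases, the uniform prior, the use of Kullback-Leibler divergence from the ideal estimate to the model distribution as the measure of closeness, and all function names.

```python
import math

def posterior(thetas, prior, k, n):
    """Posterior weights over a grid of coin biases after k heads in n flips."""
    unnorm = [p * t**k * (1 - t)**(n - k) for t, p in zip(thetas, prior)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def binomial_pmf(theta, m):
    """Distribution of the number of heads in m future flips with bias theta."""
    return [math.comb(m, j) * theta**j * (1 - theta)**(m - j) for j in range(m + 1)]

def ideal_estimate(thetas, post, m):
    """Posterior average of the predictive distributions (assumed KL setting)."""
    pmfs = [binomial_pmf(t, m) for t in thetas]
    return [sum(w * pmf[j] for w, pmf in zip(post, pmfs)) for j in range(m + 1)]

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def project(ideal, thetas, m):
    """Grid search for the binomial model member closest to the ideal estimate."""
    return min(thetas, key=lambda t: kl(ideal, binomial_pmf(t, m)))

# Hypothetical data: 7 heads in 10 flips; predict the next m = 2 flips.
thetas = [i / 100 for i in range(1, 100)]      # grid avoids 0 and 1
prior = [1 / len(thetas)] * len(thetas)        # uniform prior (assumption)
post = posterior(thetas, prior, k=7, n=10)
ideal = ideal_estimate(thetas, post, m=2)      # a mixture of binomials
theta_hat = project(ideal, thetas, m=2)        # best single binomial
print(ideal, theta_hat)
```

In this setting the ideal estimate is a mixture of binomials and so generally lies outside the binomial model; the projection picks the single bias whose binomial is closest to it, which here ends up near the posterior mean of the bias.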
Original language | English |
---|---|
Pages (from-to) | 28-31 |
Number of pages | 4 |
Journal | Neural Processing Letters |
Volume | 2 |
Issue number | 6 |
Publication status | Published - Dec 1995 |
Bibliographical note
Copyright of Springer Verlag. The original publication is available at www.springerlink.com
Keywords
- learning rules
- statistical estimators
- statistical inference
- decision theory
- information geometry
- Bayesian
- non-Bayesian