Bayesian invariant measurements of generalisation for discrete distributions

Huaiyu Zhu; Richard Rohwer

Research output: Preprint or Working paper › Technical report

Abstract

Neural network learning rules can be viewed as statistical estimators. They should be studied in a Bayesian framework even if they are not Bayesian estimators. Generalisation should be measured by the divergence between the true distribution and the estimated distribution. Information divergences are invariant measurements of the divergence between two distributions. The posterior average information divergence is used to measure the generalisation ability of a network. The optimal estimators for multinomial distributions with Dirichlet priors are studied in detail. This confirms that the definition is compatible with intuition. The results also show that many commonly used methods can be placed within this unified framework by assuming special priors and special divergences.
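As an illustration of the idea in the abstract, the following is a minimal sketch and not code from the report. It assumes the Kullback-Leibler divergence KL(p || q) from the true distribution p to the estimate q as the information divergence, and a symmetric Dirichlet(1, ..., 1) prior; under those assumptions the estimate that minimises the posterior average divergence is the posterior mean, which here coincides with Laplace's rule (n_i + 1) / (N + K). All function names, the example counts, and the Monte Carlo check are hypothetical choices made for this sketch.

```python
import math
import random


def posterior_mean_estimate(counts, alpha):
    """Posterior mean of multinomial probabilities under a Dirichlet(alpha) prior."""
    total = sum(counts) + sum(alpha)
    return [(n + a) / total for n, a in zip(counts, alpha)]


def sample_dirichlet(params, rng):
    """Draw one sample from Dirichlet(params) by normalising Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in params]
    norm = sum(draws)
    return [d / norm for d in draws]


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def posterior_average_kl(estimate, counts, alpha, n_samples=5000, seed=0):
    """Monte Carlo estimate of E[KL(p || estimate)] with p drawn from the posterior."""
    rng = random.Random(seed)
    # The Dirichlet prior is conjugate: the posterior is Dirichlet(counts + alpha).
    posterior = [n + a for n, a in zip(counts, alpha)]
    total = 0.0
    for _ in range(n_samples):
        total += kl_divergence(sample_dirichlet(posterior, rng), estimate)
    return total / n_samples


# Hypothetical data: three categories observed 3, 0 and 1 times.
counts = [3, 0, 1]
prior = [1.0, 1.0, 1.0]  # assumed uniform Dirichlet prior

laplace = posterior_mean_estimate(counts, prior)  # (n_i + 1) / (N + K)
uniform = [1.0 / 3.0] * 3                         # an estimate that ignores the data

score_laplace = posterior_average_kl(laplace, counts, prior)
score_uniform = posterior_average_kl(uniform, counts, prior)
print(laplace, score_laplace, score_uniform)
```

In this sketch the posterior mean should score a lower posterior average divergence than the data-ignoring uniform estimate. Letting the prior parameters shrink towards zero moves the same formula towards the relative-frequency (maximum likelihood) estimate, which is one way a commonly used method appears as a special prior.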

Original language	English
Place of Publication	Birmingham, UK
Publisher	Aston University
Number of pages	23
ISBN (Print)	NCRG/4351
Publication status	Unpublished - 31 Aug 1995

Keywords

Neural network
learning rules
Bayesian framework
distribution

Access to Document

NCRG_95_003.pdf

Cite this

@techreport{39c28c467c86441fab4c508ac9c1d8e6,
title = "Bayesian invariant measurements of generalisation for discrete distributions",
abstract = "Neural network learning rules can be viewed as statistical estimators. They should be studied in a Bayesian framework even if they are not Bayesian estimators. Generalisation should be measured by the divergence between the true distribution and the estimated distribution. Information divergences are invariant measurements of the divergence between two distributions. The posterior average information divergence is used to measure the generalisation ability of a network. The optimal estimators for multinomial distributions with Dirichlet priors are studied in detail. This confirms that the definition is compatible with intuition. The results also show that many commonly used methods can be placed within this unified framework by assuming special priors and special divergences.",
keywords = "Neural network, learning rules, Bayesian framework, distribution",
author = "Huaiyu Zhu and Richard Rohwer",
year = "1995",
month = aug,
day = "31",
language = "English",
isbn = "NCRG/4351",
publisher = "Aston University",
type = "WorkingPaper",
institution = "Aston University",
}

TY - UNPB
T1 - Bayesian invariant measurements of generalisation for discrete distributions
AU - Zhu, Huaiyu
AU - Rohwer, Richard
PY - 1995/8/31
Y1 - 1995/8/31
N2 - Neural network learning rules can be viewed as statistical estimators. They should be studied in a Bayesian framework even if they are not Bayesian estimators. Generalisation should be measured by the divergence between the true distribution and the estimated distribution. Information divergences are invariant measurements of the divergence between two distributions. The posterior average information divergence is used to measure the generalisation ability of a network. The optimal estimators for multinomial distributions with Dirichlet priors are studied in detail. This confirms that the definition is compatible with intuition. The results also show that many commonly used methods can be placed within this unified framework by assuming special priors and special divergences.
AB - Neural network learning rules can be viewed as statistical estimators. They should be studied in a Bayesian framework even if they are not Bayesian estimators. Generalisation should be measured by the divergence between the true distribution and the estimated distribution. Information divergences are invariant measurements of the divergence between two distributions. The posterior average information divergence is used to measure the generalisation ability of a network. The optimal estimators for multinomial distributions with Dirichlet priors are studied in detail. This confirms that the definition is compatible with intuition. The results also show that many commonly used methods can be placed within this unified framework by assuming special priors and special divergences.
KW - Neural network
KW - learning rules
KW - Bayesian framework
KW - distribution
M3 - Technical report
SN - NCRG/4351
BT - Bayesian invariant measurements of generalisation for discrete distributions
PB - Aston University
CY - Birmingham, UK
ER -
