Bayesian invariant measurements of generalisation for discrete distributions

Huaiyu Zhu, Richard Rohwer

    Research output: Working paper › Technical report

    Abstract

    Neural network learning rules can be viewed as statistical estimators. They should be studied in a Bayesian framework even if they are not Bayesian estimators. Generalisation should be measured by the divergence between the true distribution and the estimated distribution. Information divergences are invariant measurements of the divergence between two distributions. The posterior average information divergence is used to measure the generalisation ability of a network. The optimal estimators for multinomial distributions with Dirichlet priors are studied in detail. This confirms that the definition is compatible with intuition. The results also show that many commonly used methods can be brought under this unified framework by assuming special priors and special divergences.
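
    The following is a minimal numerical sketch of the idea in the abstract, not code from the report: assuming the divergence is the Kullback-Leibler divergence and the prior is Dirichlet, the estimator minimising the posterior average divergence is the posterior mean of the multinomial parameters, and a Monte-Carlo average over posterior draws approximates the generalisation measure. The function names, the example counts, and the choice of a uniform Dirichlet(1) prior are illustrative assumptions.

        import numpy as np

        def dirichlet_posterior_mean(counts, alpha):
            # Posterior-mean estimator under a Dirichlet(alpha) prior:
            # p_hat_i = (n_i + alpha_i) / (N + sum(alpha)).
            # This minimises the posterior expected KL(p_true || p_hat).
            counts = np.asarray(counts, dtype=float)
            alpha = np.asarray(alpha, dtype=float)
            return (counts + alpha) / (counts.sum() + alpha.sum())

        def posterior_average_kl(counts, alpha, q, n_samples=100000, seed=0):
            # Monte-Carlo estimate of the posterior average KL(p || q):
            # draw p from the Dirichlet posterior and average KL(p || q).
            rng = np.random.default_rng(seed)
            post = np.asarray(counts, dtype=float) + np.asarray(alpha, dtype=float)
            p = rng.dirichlet(post, size=n_samples)
            q = np.asarray(q, dtype=float)
            return float(np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=1)))

        # Illustrative data: counts from a 3-outcome source, uniform Dirichlet(1) prior.
        counts = np.array([7, 2, 1])
        alpha = np.ones(3)

        p_bayes = dirichlet_posterior_mean(counts, alpha)  # Laplace's rule of succession
        p_mle = counts / counts.sum()                       # maximum-likelihood estimate

        print("posterior mean (Bayes):", p_bayes)
        print("avg KL, Bayes estimate:", posterior_average_kl(counts, alpha, p_bayes))
        print("avg KL, ML estimate:   ", posterior_average_kl(counts, alpha, p_mle))

    Under these assumptions the Bayes estimate attains a lower posterior average divergence than the maximum-likelihood estimate, and the familiar estimators (maximum likelihood, Laplace's rule) appear as special cases of the prior, in line with the abstract's claim that common methods fit within the unified framework.
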
    Original language: English
    Place of publication: Birmingham, UK
    Publisher: Aston University
    Number of pages: 23
    ISBN (Print): NCRG/4351
    Publication status: Unpublished - 31 Aug 1995

    Keywords

    • Neural network
    • learning rules
    • Bayesian framework
    • distribution

    Cite this

    Zhu, H., & Rohwer, R. (1995). Bayesian invariant measurements of generalisation for discrete distributions. Birmingham, UK: Aston University.