Regularisation of mixture density networks

Lars U. Hjorth; Ian T. Nabney

doi:10.1049/cp:19991162

Computer Science Research Group

Research output: Chapter in Book/Published conference output › Conference publication

Abstract

Mixture Density Networks are a principled method for modelling conditional probability density functions that are non-Gaussian. This is achieved by modelling the conditional distribution for each pattern with a Gaussian Mixture Model whose parameters are generated by a neural network. This paper presents a novel method to introduce regularisation in this context for the special case where the mean and variance of the spherical Gaussian kernels in the mixtures are fixed to predetermined values. Guidelines for how these parameters can be initialised are given, and it is shown how to apply the evidence framework to mixture density networks to achieve regularisation. This also provides an objective stopping criterion that can replace the 'early stopping' methods that have previously been used. If the neural network used is an RBF network with fixed centres, this opens up new opportunities for improved initialisation of the network weights, which are exploited to start training relatively close to the optimum. The new method is demonstrated on two data sets. The first is a simple synthetic data set, while the second is a real-life data set, namely satellite scatterometer data used to infer the wind speed and wind direction near the ocean surface. For both data sets the regularisation method performs well in comparison with earlier published results. Ideas on how the constraint on the kernels may be relaxed to allow fully adaptable kernels are presented.
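
As a rough illustration of the fixed-kernel special case described in the abstract, the sketch below evaluates a conditional density of the form p(t | x) = sum_j pi_j(x) phi_j(t), where each phi_j is a spherical Gaussian with a predetermined centre and variance. It is not taken from the paper: the softmax mapping from network outputs to mixing coefficients is an assumption (it is the usual choice for mixture density networks), and the function names, kernel values and stand-in network outputs are all hypothetical.

```python
import math

def softmax(logits):
    """Map unconstrained network outputs to mixing coefficients that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def spherical_gaussian(t, centre, variance):
    """Density of a spherical Gaussian kernel evaluated at target vector t."""
    d = len(t)
    sq_dist = sum((ti - ci) ** 2 for ti, ci in zip(t, centre))
    norm = (2.0 * math.pi * variance) ** (d / 2.0)
    return math.exp(-sq_dist / (2.0 * variance)) / norm

def conditional_density(t, logits, centres, variances):
    """p(t | x) as a mixture of fixed kernels weighted by input-dependent coefficients."""
    mixing = softmax(logits)
    return sum(p * spherical_gaussian(t, c, v)
               for p, c, v in zip(mixing, centres, variances))

# Hypothetical fixed kernels for a one-dimensional target (assumed values).
centres = [[-1.0], [0.0], [1.0]]
variances = [0.25, 0.25, 0.25]

# Stand-in for the network outputs at some input x (assumed values).
logits = [0.0, 0.0, 0.0]

density = conditional_density([0.0], logits, centres, variances)
```

Because the kernels are fixed, only the mixing coefficients depend on the input, so the network has one output per kernel in this sketch.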

Original language	English
Title of host publication	Ninth International Conference on Artificial Neural Networks, 1999
Subtitle of host publication	ICANN 99
Publisher	IET
Pages	521-526
Number of pages	6
Volume	2
ISBN (Print)	0-85296-721-7
DOIs	https://doi.org/10.1049/cp:19991162
Publication status	Published - 1999
Event	9th International Conference on Artificial Neural Networks - Edinburgh, United Kingdom; Duration: 7 Sept 1999 → 7 Sept 1999

Publication series

Name	IET conference publications
Publisher	IET
Number	470
ISSN (Print)	0537-9989

Conference

Conference	9th International Conference on Artificial Neural Networks
Abbreviated title	ICANN 99
Country/Territory	United Kingdom
City	Edinburgh
Period	7/09/99 → 7/09/99

Bibliographical note

This paper is a postprint of a paper submitted to and accepted for publication in IET conference publications and is subject to Institution of Engineering and Technology Copyright. The copy of record is available at IET Digital Library.

Keywords

NCRG
neural nets
Bayesian regularisation
maximum likelihood estimation
mixture density networks
multivalued functions
neural networks
probability

Access to Document

10.1049/cp:19991162


Cite this

@inproceedings{fd962d145f4347adbcbde6b3e1bc13f5,

title = "Regularisation of mixture density networks",

abstract = "Mixture Density Networks are a principled method for modelling conditional probability density functions that are non-Gaussian. This is achieved by modelling the conditional distribution for each pattern with a Gaussian Mixture Model whose parameters are generated by a neural network. This paper presents a novel method to introduce regularisation in this context for the special case where the mean and variance of the spherical Gaussian kernels in the mixtures are fixed to predetermined values. Guidelines for how these parameters can be initialised are given, and it is shown how to apply the evidence framework to mixture density networks to achieve regularisation. This also provides an objective stopping criterion that can replace the `early stopping' methods that have previously been used. If the neural network used is an RBF network with fixed centres, this opens up new opportunities for improved initialisation of the network weights, which are exploited to start training relatively close to the optimum. The new method is demonstrated on two data sets. The first is a simple synthetic data set, while the second is a real-life data set, namely satellite scatterometer data used to infer the wind speed and wind direction near the ocean surface. For both data sets the regularisation method performs well in comparison with earlier published results. Ideas on how the constraint on the kernels may be relaxed to allow fully adaptable kernels are presented.",

keywords = "NCRG, neural nets, Bayesian regularisation, maximum likelihood estimation, mixture density networks, multivalued functions, neural networks, probability",

author = "Hjorth, {Lars U.} and Nabney, {Ian T.}",

note = "This paper is a postprint of a paper submitted to and accepted for publication in IET conference publications and is subject to Institution of Engineering and Technology Copyright. The copy of record is available at IET Digital Library.; 9th International Conference on Artificial Neural Networks, ICANN 99 ; Conference date: 07-09-1999 Through 07-09-1999",

year = "1999",

doi = "10.1049/cp:19991162",

language = "English",

isbn = "0-85296-721-7",

volume = "2",

series = "IET conference publications",

publisher = "IET",

number = "470",

pages = "521--526",

booktitle = "Ninth International Conference on Artificial Neural Networks, 1999",

address = "United Kingdom",

}

TY - GEN

T1 - Regularisation of mixture density networks

AU - Hjorth, Lars U.

AU - Nabney, Ian T.

N1 - This paper is a postprint of a paper submitted to and accepted for publication in IET conference publications and is subject to Institution of Engineering and Technology Copyright. The copy of record is available at IET Digital Library.

PY - 1999

Y1 - 1999

N2 - Mixture Density Networks are a principled method for modelling conditional probability density functions that are non-Gaussian. This is achieved by modelling the conditional distribution for each pattern with a Gaussian Mixture Model whose parameters are generated by a neural network. This paper presents a novel method to introduce regularisation in this context for the special case where the mean and variance of the spherical Gaussian kernels in the mixtures are fixed to predetermined values. Guidelines for how these parameters can be initialised are given, and it is shown how to apply the evidence framework to mixture density networks to achieve regularisation. This also provides an objective stopping criterion that can replace the 'early stopping' methods that have previously been used. If the neural network used is an RBF network with fixed centres, this opens up new opportunities for improved initialisation of the network weights, which are exploited to start training relatively close to the optimum. The new method is demonstrated on two data sets. The first is a simple synthetic data set, while the second is a real-life data set, namely satellite scatterometer data used to infer the wind speed and wind direction near the ocean surface. For both data sets the regularisation method performs well in comparison with earlier published results. Ideas on how the constraint on the kernels may be relaxed to allow fully adaptable kernels are presented.

AB - Mixture Density Networks are a principled method for modelling conditional probability density functions that are non-Gaussian. This is achieved by modelling the conditional distribution for each pattern with a Gaussian Mixture Model whose parameters are generated by a neural network. This paper presents a novel method to introduce regularisation in this context for the special case where the mean and variance of the spherical Gaussian kernels in the mixtures are fixed to predetermined values. Guidelines for how these parameters can be initialised are given, and it is shown how to apply the evidence framework to mixture density networks to achieve regularisation. This also provides an objective stopping criterion that can replace the 'early stopping' methods that have previously been used. If the neural network used is an RBF network with fixed centres, this opens up new opportunities for improved initialisation of the network weights, which are exploited to start training relatively close to the optimum. The new method is demonstrated on two data sets. The first is a simple synthetic data set, while the second is a real-life data set, namely satellite scatterometer data used to infer the wind speed and wind direction near the ocean surface. For both data sets the regularisation method performs well in comparison with earlier published results. Ideas on how the constraint on the kernels may be relaxed to allow fully adaptable kernels are presented.

KW - NCRG

KW - neural nets

KW - Bayesian regularisation

KW - maximum likelihood estimation

KW - mixture density networks

KW - multivalued functions

KW - neural networks

KW - probability

UR - http://digital-library.theiet.org/content/conferences/10.1049/cp_19991162

UR - http://www.scopus.com/inward/record.url?scp=0033355853&partnerID=8YFLogxK

U2 - 10.1049/cp:19991162

DO - 10.1049/cp:19991162

M3 - Conference publication

SN - 0-85296-721-7

VL - 2

T3 - IET conference publications

SP - 521

EP - 526

BT - Ninth International Conference on Artificial Neural Networks, 1999

PB - IET

T2 - 9th International Conference on Artificial Neural Networks

Y2 - 7 September 1999 through 7 September 1999

ER -
