Sequential, Bayesian geostatistics: a principled method for large data sets

Dan Cornford*, Lehel Csató, Manfred Opper

*Corresponding author for this work

Research output: Contribution to journal › Article

Abstract

The principled statistical application of Gaussian random field models used in geostatistics has historically been limited to data sets of a small size. This limitation is imposed by the requirement to store and invert the covariance matrix of all the samples to obtain a predictive distribution at unsampled locations, or to use likelihood-based covariance estimation. Various ad hoc approaches to solve this problem have been adopted, such as selecting a neighborhood region and/or a small number of observations to use in the kriging process, but these have no sound theoretical basis and it is unclear what information is being lost. In this article, we present a Bayesian method for estimating the posterior mean and covariance structures of a Gaussian random field using a sequential estimation algorithm. By imposing sparsity in a well-defined framework, the algorithm retains a subset of “basis vectors” that best represent the “true” posterior Gaussian random field model in the relative entropy sense. This allows a principled treatment of Gaussian random field models on very large data sets. The method is particularly appropriate when the Gaussian random field model is regarded as a latent variable model, which may be nonlinearly related to the observations. We show the application of the sequential, sparse Bayesian estimation in Gaussian random field models and discuss its merits and drawbacks.
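As a rough illustration of the sequential estimation idea described in the abstract, the sketch below updates the posterior mean and covariance of a Gaussian random field one observation at a time. It is not the authors' implementation: it assumes Gaussian observation noise, one-dimensional locations, and a squared-exponential covariance, and it omits the sparsity step that retains only a subset of basis vectors. The names `sq_exp` and `SequentialGP` are hypothetical and chosen only for this example.

```python
import math


def sq_exp(a, b, variance=1.0, length=1.0):
    """Squared-exponential covariance between two 1-D locations (assumed form)."""
    return variance * math.exp(-0.5 * ((a - b) / length) ** 2)


class SequentialGP:
    """Hypothetical helper: absorbs one observation at a time, no sparsification."""

    def __init__(self, kernel, noise_var):
        self.kernel = kernel
        self.noise_var = noise_var
        self.xs = []     # locations processed so far
        self.alpha = []  # coefficients of the posterior mean
        self.C = []      # coefficients of the posterior covariance (list of rows)

    def _project(self, x):
        # Prior covariances k between x and stored locations, and the product C k.
        k = [self.kernel(x, xi) for xi in self.xs]
        Ck = [sum(c * kj for c, kj in zip(row, k)) for row in self.C]
        return k, Ck

    def predict(self, x):
        """Posterior mean and variance of the latent field at location x."""
        k, Ck = self._project(x)
        mean = sum(ki * ai for ki, ai in zip(k, self.alpha))
        var = self.kernel(x, x) + sum(ki * ci for ki, ci in zip(k, Ck))
        return mean, var

    def update(self, x, y):
        """Absorb one noisy observation y at location x (Gaussian noise assumed)."""
        _, Ck = self._project(x)
        mean, var = self.predict(x)
        denom = self.noise_var + var
        q = (y - mean) / denom   # scales the correction to the mean
        r = -1.0 / denom         # scales the correction to the covariance
        s = Ck + [1.0]           # update direction, extended by the new location
        n = len(s)
        self.xs.append(x)
        self.alpha = [a + q * si for a, si in zip(self.alpha + [0.0], s)]
        grown = [row + [0.0] for row in self.C] + [[0.0] * n]
        self.C = [[grown[i][j] + r * s[i] * s[j] for j in range(n)]
                  for i in range(n)]


# Usage: process three made-up observations in turn, then predict elsewhere.
gp = SequentialGP(sq_exp, noise_var=0.1)
for x, y in [(0.0, 1.2), (1.0, 0.7), (2.5, -0.4)]:
    gp.update(x, y)
mean, var = gp.predict(1.5)
print(mean, var)
```

Under these assumptions the result after the last update coincides with the batch solution that inverts the full covariance matrix, so the sketch still grows with the number of observations. A sparse variant, as the abstract describes, would additionally bound the number of stored locations.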
Original language: English
Pages (from-to): 183-199
Number of pages: 17
Journal: Geographical Analysis
Volume: 37
Issue number: 2
Early online date: 18 Mar 2005
DOIs: 10.1111/j.1538-4632.2005.00635.x
Publication status: Published - Apr 2005

Fingerprint

geostatistics
statistical application
entropy
kriging
method
matrix

Bibliographical note

The definitive version is available at www.onlinelibrary.wiley.com. Permissions sought by third parties must be acquired from Wiley-Blackwell

Keywords

  • Gaussian random field models
  • geostatistics
  • predictive distribution
  • covariance estimation
  • sequential estimation algorithm

Cite this

Cornford, Dan; Csató, Lehel; Opper, Manfred. Sequential, Bayesian geostatistics: a principled method for large data sets. In: Geographical Analysis. 2005; Vol. 37, No. 2. pp. 183-199.
@article{f2f2fbe2e1184b05aa47d004860f83c4,
title = "Sequential, Bayesian geostatistics: a principled method for large data sets",
abstract = "The principled statistical application of Gaussian random field models used in geostatistics has historically been limited to data sets of a small size. This limitation is imposed by the requirement to store and invert the covariance matrix of all the samples to obtain a predictive distribution at unsampled locations, or to use likelihood-based covariance estimation. Various ad hoc approaches to solve this problem have been adopted, such as selecting a neighborhood region and/or a small number of observations to use in the kriging process, but these have no sound theoretical basis and it is unclear what information is being lost. In this article, we present a Bayesian method for estimating the posterior mean and covariance structures of a Gaussian random field using a sequential estimation algorithm. By imposing sparsity in a well-defined framework, the algorithm retains a subset of “basis vectors” that best represent the “true” posterior Gaussian random field model in the relative entropy sense. This allows a principled treatment of Gaussian random field models on very large data sets. The method is particularly appropriate when the Gaussian random field model is regarded as a latent variable model, which may be nonlinearly related to the observations. We show the application of the sequential, sparse Bayesian estimation in Gaussian random field models and discuss its merits and drawbacks.",
keywords = "Gaussian random field models, geostatistics, predictive distribution, covariance estimation, sequential estimation algorithm",
author = "Dan Cornford and Lehel Csat{\'o} and Manfred Opper",
note = "The definitive version is available at www.onlinelibrary.wiley.com. Permissions sought by third parties must be acquired from Wiley-Blackwell",
year = "2005",
month = "4",
doi = "10.1111/j.1538-4632.2005.00635.x",
language = "English",
volume = "37",
pages = "183--199",
journal = "Geographical Analysis",
issn = "0016-7363",
publisher = "Wiley-Blackwell",
number = "2",

}

TY - JOUR

T1 - Sequential, Bayesian geostatistics

T2 - a principled method for large data sets

AU - Cornford, Dan

AU - Csató, Lehel

AU - Opper, Manfred

N1 - The definitive version is available at www.onlinelibrary.wiley.com. Permissions sought by third parties must be acquired from Wiley-Blackwell

PY - 2005/4

Y1 - 2005/4

AB - The principled statistical application of Gaussian random field models used in geostatistics has historically been limited to data sets of a small size. This limitation is imposed by the requirement to store and invert the covariance matrix of all the samples to obtain a predictive distribution at unsampled locations, or to use likelihood-based covariance estimation. Various ad hoc approaches to solve this problem have been adopted, such as selecting a neighborhood region and/or a small number of observations to use in the kriging process, but these have no sound theoretical basis and it is unclear what information is being lost. In this article, we present a Bayesian method for estimating the posterior mean and covariance structures of a Gaussian random field using a sequential estimation algorithm. By imposing sparsity in a well-defined framework, the algorithm retains a subset of “basis vectors” that best represent the “true” posterior Gaussian random field model in the relative entropy sense. This allows a principled treatment of Gaussian random field models on very large data sets. The method is particularly appropriate when the Gaussian random field model is regarded as a latent variable model, which may be nonlinearly related to the observations. We show the application of the sequential, sparse Bayesian estimation in Gaussian random field models and discuss its merits and drawbacks.

KW - Gaussian random field models

KW - geostatistics

KW - predictive distribution

KW - covariance estimation

KW - sequential estimation algorithm

UR - http://www.scopus.com/inward/record.url?scp=17344370420&partnerID=8YFLogxK

UR - http://onlinelibrary.wiley.com/doi/10.1111/j.1538-4632.2005.00635.x/abstract

U2 - 10.1111/j.1538-4632.2005.00635.x

DO - 10.1111/j.1538-4632.2005.00635.x

M3 - Article

VL - 37

SP - 183

EP - 199

JO - Geographical Analysis

JF - Geographical Analysis

SN - 0016-7363

IS - 2

ER -