Addressing missing data in geochemistry: a non-linear approach

Research output: Contribution to journal (Article)

Abstract

Exploratory analysis of petroleum geochemical data seeks to find common patterns to help distinguish between different source rocks, oils and gases, and to explain their source, maturity and any intra-reservoir alteration. However, at the outset, one is typically faced with (a) a large matrix of samples, each with a range of molecular and isotopic properties, (b) a spatially and temporally unrepresentative sampling pattern, (c) noisy data and (d) often, a large number of missing values. This inhibits analysis using conventional statistical methods. Typically, visualisation methods like principal components analysis are used, but these methods are not easily able to deal with missing data nor can they capture non-linear structure in the data. One approach to discovering complex, non-linear structure in the data is through the use of linked plots, or brushing, while ignoring the missing data. In this paper we introduce a complementary approach based on a non-linear probabilistic model. Generative topographic mapping enables the visualisation of the effects of very many variables on a single plot, while also dealing with missing data. We show how using generative topographic mapping also provides an optimal method with which to replace missing values in two geochemical datasets, particularly where a large proportion of the data is missing.
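
As a rough illustration of the missing-value replacement described in the abstract, the sketch below fills the gaps in one sample using a mixture of isotropic Gaussians with a shared precision, which is the form of density that generative topographic mapping fits. It is not the authors' code and it skips model fitting entirely: the component centres and the precision `beta` are assumed to be given, and the function name `impute_with_mixture` and the toy numbers are hypothetical. Responsibilities are computed from the observed variables only, and each missing variable is replaced by its responsibility-weighted mean.

```python
import numpy as np

def impute_with_mixture(x, centres, beta):
    """Fill NaNs in one sample using an isotropic Gaussian mixture.

    x:       (D,) sample with np.nan marking missing values.
    centres: (K, D) component means. In a fitted GTM these would be the
             data-space images of the latent grid points; here they are
             simply assumed to be known.
    beta:    shared inverse variance (precision) of the noise model.
    """
    x = np.asarray(x, dtype=float)
    centres = np.asarray(centres, dtype=float)
    observed = ~np.isnan(x)

    # Squared distance to every centre, using the observed variables only.
    diff = centres[:, observed] - x[observed]
    sq_dist = np.sum(diff ** 2, axis=1)

    # Responsibilities under equal mixing weights.
    # Subtracting the max keeps the exponentials numerically stable.
    log_r = -0.5 * beta * sq_dist
    log_r -= log_r.max()
    r = np.exp(log_r)
    r /= r.sum()

    # Replace each missing variable by its posterior-expected value.
    filled = x.copy()
    filled[~observed] = r @ centres[:, ~observed]
    return filled, r

# Toy example: two well-separated centres, second variable missing.
centres = np.array([[0.0, 0.0, 0.0],
                    [10.0, 10.0, 10.0]])
sample = np.array([9.5, np.nan, 10.2])
filled, resp = impute_with_mixture(sample, centres, beta=1.0)
print(filled)
```

Because the observed values sit close to the second centre, almost all of the responsibility goes to it and the missing value is filled with a number very near that centre's coordinate. Observed values are left unchanged.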

Details

Original language: English
Pages (from-to): 1162-1169
Number of pages: 8
Journal: Organic Geochemistry
Volume: 39
Issue: 8
DOIs
State: Published - Aug 2008

Bibliographic note

Advances in Organic Geochemistry 2007 — Proceedings of the 23rd International Meeting on Organic Geochemistry

Keywords

• petroleum geochemical, range of molecular and isotopic properties, spatially and temporally unrepresentative sampling pattern, linked plots, brushing, non-linear probabilistic model, Generative topographic mapping
