Data visualisation with missing data: A non-linear approach

Martin Schroeder, Dan Cornford

Research output: Working paper / Technical report

Abstract

Exploratory analysis of data in all sciences seeks to find common patterns to gain insights into the structure and distribution of the data. Typically, visualisation methods like principal components analysis are used, but these methods cannot easily deal with missing data, nor can they capture non-linear structure in the data. One approach to discovering complex, non-linear structure in the data is through the use of linked plots, or brushing, while ignoring the missing data. In this technical report we discuss a complementary approach based on a non-linear probabilistic model. The generative topographic mapping enables the visualisation of the effects of very many variables on a single plot, which can incorporate far more structure than a two-dimensional principal components plot and can deal with missing data at the same time. We show that the generative topographic mapping provides an optimal method to explore the data while also replacing missing values in a dataset, particularly where a large proportion of the data is missing.
Original language: English
Place of Publication: Birmingham
Publisher: Aston University
Number of pages: 26
ISBN (Print): NCRG/2007/04
Publication status: Published - 12 Oct 2007

Keywords

  • multivariate statistics
  • generative topographic mapping deals with missing data

Cite this

Schroeder, M., & Cornford, D. (2007). Data visualisation with missing data: A non-linear approach. Birmingham: Aston University.
Schroeder, Martin ; Cornford, Dan. / Data visualisation with missing data: A non-linear approach. Birmingham : Aston University, 2007.
@techreport{472ed2d81c474f2a90d5963df396ead1,
title = "Data visualisation with missing data: A non-linear approach",
keywords = "multivariate statistics, generative topographic mapping deals with missing data",
author = "Martin Schroeder and Dan Cornford",
year = "2007",
month = "10",
day = "12",
language = "English",
isbn = "NCRG/2007/04",
publisher = "Aston University",
type = "WorkingPaper",
institution = "Aston University",
}

Schroeder, M & Cornford, D 2007 'Data visualisation with missing data: A non-linear approach' Aston University, Birmingham.

Data visualisation with missing data: A non-linear approach. / Schroeder, Martin; Cornford, Dan.

Birmingham : Aston University, 2007.

TY - UNPB
T1 - Data visualisation with missing data: A non-linear approach
AU - Schroeder, Martin
AU - Cornford, Dan
PY - 2007/10/12
Y1 - 2007/10/12
KW - multivariate statistics
KW - generative topographic mapping deals with missing data
M3 - Technical report
SN - NCRG/2007/04
BT - Data visualisation with missing data: A non-linear approach
PB - Aston University
CY - Birmingham
ER -

Schroeder M, Cornford D. Data visualisation with missing data: A non-linear approach. Birmingham: Aston University. 2007 Oct 12.