A hierarchical latent variable model for data visualization

Christopher M. Bishop; Michael E. Tipping

doi:10.1109/34.667885

Research output: Contribution to journal › Article › peer-review

Abstract

Visualization has proven to be a powerful and widely-applicable tool for the analysis and interpretation of data. Most visualization algorithms aim to find a projection from the data space down to a two-dimensional visualization space. However, for complex data sets living in a high-dimensional space it is unlikely that a single two-dimensional projection can reveal all of the interesting structure. We therefore introduce a hierarchical visualization algorithm which allows the complete data set to be visualized at the top level, with clusters and sub-clusters of data points visualized at deeper levels. The algorithm is based on a hierarchical mixture of latent variable models, whose parameters are estimated using the expectation-maximization algorithm. We demonstrate the principle of the approach first on a toy data set, and then apply the algorithm to the visualization of a synthetic data set in 12 dimensions obtained from a simulation of multi-phase flows in oil pipelines and to data in 36 dimensions derived from satellite images.

Original language	English
Pages (from-to)	281-293
Number of pages	13
Journal	IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume	20
Issue number	3
DOIs	https://doi.org/10.1109/34.667885
Publication status	Published - Mar 1998

Bibliographical note

Copyright of Institute of Electrical and Electronics Engineers (IEEE)

Keywords

Latent variables
data visualization
EM algorithm
hierarchical mixture model
density estimation
principal component analysis
factor analysis
maximum likelihood
clustering
statistics

Access to Document

10.1109/34.667885

Cite this

@article{67f144fa5cfb4181bd52657ce2461a6b,

title = "A hierarchical latent variable model for data visualization",

abstract = "Visualization has proven to be a powerful and widely-applicable tool for the analysis and interpretation of data. Most visualization algorithms aim to find a projection from the data space down to a two-dimensional visualization space. However, for complex data sets living in a high-dimensional space it is unlikely that a single two-dimensional projection can reveal all of the interesting structure. We therefore introduce a hierarchical visualization algorithm which allows the complete data set to be visualized at the top level, with clusters and sub-clusters of data points visualized at deeper levels. The algorithm is based on a hierarchical mixture of latent variable models, whose parameters are estimated using the expectation-maximization algorithm. We demonstrate the principle of the approach first on a toy data set, and then apply the algorithm to the visualization of a synthetic data set in 12 dimensions obtained from a simulation of multi-phase flows in oil pipelines and to data in 36 dimensions derived from satellite images.",

keywords = "Latent variables, data visualization, EM algorithm, hierarchical mixture model, density estimation, principal component analysis, factor analysis, maximum likelihood, clustering, statistics",

author = "Bishop, {Christopher M.} and Tipping, {Michael E.}",

note = "Copyright of Institute of Electrical and Electronics Engineers (IEEE)",

year = "1998",

month = mar,

doi = "10.1109/34.667885",

language = "English",

volume = "20",

pages = "281--293",

journal = "IEEE Transactions on Pattern Analysis and Machine Intelligence",

issn = "0162-8828",

publisher = "IEEE",

number = "3",

}

TY - JOUR

T1 - A hierarchical latent variable model for data visualization

AU - Bishop, Christopher M.

AU - Tipping, Michael E.

N1 - Copyright of Institute of Electrical and Electronics Engineers (IEEE)

PY - 1998/3

Y1 - 1998/3

N2 - Visualization has proven to be a powerful and widely-applicable tool for the analysis and interpretation of data. Most visualization algorithms aim to find a projection from the data space down to a two-dimensional visualization space. However, for complex data sets living in a high-dimensional space it is unlikely that a single two-dimensional projection can reveal all of the interesting structure. We therefore introduce a hierarchical visualization algorithm which allows the complete data set to be visualized at the top level, with clusters and sub-clusters of data points visualized at deeper levels. The algorithm is based on a hierarchical mixture of latent variable models, whose parameters are estimated using the expectation-maximization algorithm. We demonstrate the principle of the approach first on a toy data set, and then apply the algorithm to the visualization of a synthetic data set in 12 dimensions obtained from a simulation of multi-phase flows in oil pipelines and to data in 36 dimensions derived from satellite images.

AB - Visualization has proven to be a powerful and widely-applicable tool for the analysis and interpretation of data. Most visualization algorithms aim to find a projection from the data space down to a two-dimensional visualization space. However, for complex data sets living in a high-dimensional space it is unlikely that a single two-dimensional projection can reveal all of the interesting structure. We therefore introduce a hierarchical visualization algorithm which allows the complete data set to be visualized at the top level, with clusters and sub-clusters of data points visualized at deeper levels. The algorithm is based on a hierarchical mixture of latent variable models, whose parameters are estimated using the expectation-maximization algorithm. We demonstrate the principle of the approach first on a toy data set, and then apply the algorithm to the visualization of a synthetic data set in 12 dimensions obtained from a simulation of multi-phase flows in oil pipelines and to data in 36 dimensions derived from satellite images.

KW - Latent variables

KW - data visualization

KW - EM algorithm

KW - hierarchical mixture model

KW - density estimation

KW - principal component analysis

KW - factor analysis

KW - maximum likelihood

KW - clustering

KW - statistics

UR - https://ieeexplore.ieee.org/document/667885

U2 - 10.1109/34.667885

DO - 10.1109/34.667885

M3 - Article

SN - 0162-8828

VL - 20

SP - 281

EP - 293

JO - IEEE Transactions on Pattern Analysis and Machine Intelligence

JF - IEEE Transactions on Pattern Analysis and Machine Intelligence

IS - 3

ER -
