Visual data mining: integrating machine learning with information visualization

Dharmesh M. Maniyar, Ian T. Nabney

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

Today, the data available to tackle many scientific challenges is vast in quantity and diverse in nature. The exploration of heterogeneous information spaces requires suitable mining algorithms as well as effective visual interfaces. Most existing systems concentrate either on mining algorithms or on visualization techniques. Though visual methods developed in information visualization have been helpful, improved understanding of a complex, large, high-dimensional dataset requires an effective projection of that dataset onto a lower-dimensional (2D or 3D) manifold. This paper introduces a flexible visual data mining framework which combines advanced projection algorithms developed in the machine learning domain with visual techniques developed in the information visualization domain. The framework follows Shneiderman’s mantra to provide an effective user interface. The advantage of such an interface is that the user is directly involved in the data mining process. We integrate principled projection methods, such as Generative Topographic Mapping (GTM) and Hierarchical GTM (HGTM), with powerful visual techniques, such as magnification factors, directional curvatures, parallel coordinates, billboarding, and user interaction facilities, to provide an integrated visual data mining framework. Results on a real-life high-dimensional dataset from the chemoinformatics domain are also reported and discussed. Projection results of GTM are analytically compared with those of other traditional projection methods, and it is also shown that the HGTM algorithm provides additional value for large datasets. The computational complexity of these algorithms is discussed to demonstrate their suitability for the visual data mining framework.
Original language: English
Title of host publication: Workshop on Multimedia Data Mining “Merging Multimedia and Data Mining Research”
Editors: Zhongfei Zhang, Florent Masseglia, Ramesh Jain, Alberto Del Bimbo
Publisher: ACM
Pages: 143-152
Number of pages: 10
Publication status: Published - 20 Aug 2006

Fingerprint

Data mining
Learning systems
Visualization
User interfaces
Interfaces (computer)
Computational complexity

Bibliographical note

© ACM, 2006. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Workshop on Multimedia Data Mining “Merging Multimedia and Data Mining Research”, 2006, http://www.fortune.binghamton.edu/MDM2006/. MDM/KDD2006: Seventh International Workshop on Multimedia Data Mining “Merging Multimedia and Data Mining Research”, held in conjunction with the KDD conference, 20 August 2006, Philadelphia (US).

Keywords

  • visual data mining
  • data visualization
  • principled projection
  • algorithms
  • information visualization techniques

Cite this

Maniyar, D. M., & Nabney, I. T. (2006). Visual data mining: integrating machine learning with information visualization. In Z. Zhang, F. Masseglia, R. Jain, & A. Del Bimbo (Eds.), Workshop on Multimedia Data Mining “Merging Multimedia and Data Mining Research” (pp. 143-152). ACM.