Data visualization with simultaneous feature selection

Dharmesh M. Maniyar, Ian T. Nabney

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Data visualization algorithms and feature selection techniques are both widely used in bioinformatics but as distinct analytical approaches. Until now there has been no method of measuring feature saliency while training a data visualization model. We derive a generative topographic mapping (GTM) based data visualization approach which estimates feature saliency simultaneously with the training of the visualization model. The approach not only provides a better projection by modeling irrelevant features with a separate noise model but also gives feature saliency values which help the user to assess the significance of each feature. We compare the quality of projection obtained using the new approach with the projections from traditional GTM and self-organizing maps (SOM) algorithms. The results obtained on a synthetic and a real-life chemoinformatics dataset demonstrate that the proposed approach successfully identifies feature significance and provides coherent (compact) projections. © 2006 IEEE.
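
The abstract describes the idea without stating any formulas, so the following is only a minimal sketch of one common way to express feature saliency in a mixture model. It is not taken from the paper. It assumes equal-weight components with an isotropic Gaussian for relevant features, a single shared Gaussian noise model for irrelevant features, and a per-feature saliency in [0, 1] that blends the two. All names (`saliency_mixture_loglik`, `centres`, `noise_mean`, and so on) are hypothetical, and in a GTM the component centres would come from mapping a latent grid into data space rather than being passed in directly.

```python
import numpy as np


def gaussian_pdf(x, mean, var):
    """Univariate Gaussian density, broadcast elementwise over arrays."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def saliency_mixture_loglik(X, centres, var, noise_mean, noise_var, saliency):
    """Log-likelihood of X under an equal-weight mixture with feature saliency.

    Assumed shapes (illustrative, not from the paper):
      X          (N, D) data points
      centres    (K, D) component means for relevant features
      var        scalar variance shared by all components
      noise_mean (D,)   mean of the shared noise model
      noise_var  (D,)   variance of the shared noise model
      saliency   (D,)   per-feature saliency in [0, 1]
    """
    X = np.asarray(X, dtype=float)
    # Density of each feature under each component: shape (N, K, D).
    relevant = gaussian_pdf(X[:, None, :], centres[None, :, :], var)
    # Density of each feature under the shared noise model: shape (N, 1, D).
    noise = gaussian_pdf(X, noise_mean, noise_var)[:, None, :]
    # Blend the two per feature according to its saliency.
    per_feature = saliency * relevant + (1.0 - saliency) * noise
    # Product over features, then equal-weight average over components.
    per_component = per_feature.prod(axis=2)
    return float(np.log(per_component.mean(axis=1)).sum())
```

Under these assumptions, a saliency of 0 for every feature makes the likelihood independent of the component centres, which is the sense in which irrelevant features stop influencing the projection. A saliency of 1 for every feature recovers an ordinary isotropic Gaussian mixture.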

Original language: English
Title of host publication: Proceedings of the 2006 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB'06
Pages: 156-163
Number of pages: 8
DOIs: https://doi.org/10.1109/CIBCB.2006.330985
Publication status: Published - 2006
Event: 3rd symposium on Computational Intelligence in Bioinformatics and Computational Biology - Toronto, ON, Canada
Duration: 28 Sep 2006 to 29 Sep 2006

Symposium

Symposium: 3rd symposium on Computational Intelligence in Bioinformatics and Computational Biology
Abbreviated title: CIBCB '06
Country: Canada
City: Toronto, ON
Period: 28/09/06 to 29/09/06

Bibliographical note

© 2006 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Keywords

  • chemoinformatics
  • data mining
  • data visualization
  • feature selection
  • generative topographic mapping
  • unsupervised learning

Cite this

Maniyar, D. M., & Nabney, I. T. (2006). Data visualization with simultaneous feature selection. In Proceedings of the 2006 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB'06 (pp. 156-163) https://doi.org/10.1109/CIBCB.2006.330985
Maniyar, Dharmesh M.; Nabney, Ian T. / Data visualization with simultaneous feature selection. Proceedings of the 2006 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB'06. 2006. pp. 156-163
@inproceedings{9263173bd15d410fa238d404884f3090,
title = "Data visualization with simultaneous feature selection",
abstract = "Data visualization algorithms and feature selection techniques are both widely used in bioinformatics but as distinct analytical approaches. Until now there has been no method of measuring feature saliency while training a data visualization model. We derive a generative topographic mapping (GTM) based data visualization approach which estimates feature saliency simultaneously with the training of the visualization model. The approach not only provides a better projection by modeling irrelevant features with a separate noise model but also gives feature saliency values which help the user to assess the significance of each feature. We compare the quality of projection obtained using the new approach with the projections from traditional GTM and self-organizing maps (SOM) algorithms. The results obtained on a synthetic and a real-life chemoinformatics dataset demonstrate that the proposed approach successfully identifies feature significance and provides coherent (compact) projections. {\textcopyright} 2006 IEEE.",
keywords = "chemoinformatics, data mining, data visualization, feature selection, generative topographic mapping, unsupervised learning",
author = "Maniyar, {Dharmesh M.} and Nabney, {Ian T.}",
note = "{\textcopyright} 2006 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.",
year = "2006",
doi = "10.1109/CIBCB.2006.330985",
language = "English",
isbn = "1424406234",
pages = "156--163",
booktitle = "Proceedings of the 2006 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB'06",

}

Maniyar, DM & Nabney, IT 2006, Data visualization with simultaneous feature selection. in Proceedings of the 2006 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB'06. pp. 156-163, 3rd symposium on Computational Intelligence in Bioinformatics and Computational Biology, Toronto, ON, Canada, 28/09/06. https://doi.org/10.1109/CIBCB.2006.330985

TY - GEN

T1 - Data visualization with simultaneous feature selection

AU - Maniyar, Dharmesh M.

AU - Nabney, Ian T.

N1 - © 2006 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

PY - 2006

Y1 - 2006

AB - Data visualization algorithms and feature selection techniques are both widely used in bioinformatics but as distinct analytical approaches. Until now there has been no method of measuring feature saliency while training a data visualization model. We derive a generative topographic mapping (GTM) based data visualization approach which estimates feature saliency simultaneously with the training of the visualization model. The approach not only provides a better projection by modeling irrelevant features with a separate noise model but also gives feature saliency values which help the user to assess the significance of each feature. We compare the quality of projection obtained using the new approach with the projections from traditional GTM and self-organizing maps (SOM) algorithms. The results obtained on a synthetic and a real-life chemoinformatics dataset demonstrate that the proposed approach successfully identifies feature significance and provides coherent (compact) projections. © 2006 IEEE.

KW - chemoinformatics

KW - data mining

KW - data visualization

KW - feature selection

KW - generative topographic mapping

KW - unsupervised learning

UR - http://www.scopus.com/inward/record.url?scp=50249123337&partnerID=8YFLogxK

U2 - 10.1109/CIBCB.2006.330985

DO - 10.1109/CIBCB.2006.330985

M3 - Conference contribution

SN - 1424406234

SN - 9781424406234

SP - 156

EP - 163

BT - Proceedings of the 2006 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB'06

ER -

Maniyar DM, Nabney IT. Data visualization with simultaneous feature selection. In Proceedings of the 2006 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB'06. 2006. p. 156-163 https://doi.org/10.1109/CIBCB.2006.330985