Using secondary knowledge to support decision tree classification of retrospective clinical data

Dympna O'Sullivan, William Elazmeh, Szymon Wilk, Ken Farion, Stan Matwin, Wojtek Michalowski, Moravid Sehatkar

Research output: Chapter in Book/Report/Conference proceeding › Other chapter contribution

Abstract

Retrospective clinical data presents many challenges for data mining and machine learning. The transcription of patient records from paper charts and subsequent manipulation of data often results in high volumes of noise as well as a loss of other important information. In addition, such datasets often fail to represent expert medical knowledge and reasoning in any explicit manner. In this research we describe applying data mining methods to retrospective clinical data to build a prediction model for asthma exacerbation severity for pediatric patients in the emergency department. Difficulties in building such a model forced us to investigate alternative strategies for analyzing and processing retrospective data. This paper describes this process together with an approach to mining retrospective clinical data by incorporating formalized external expert knowledge (secondary knowledge sources) into the classification task. This knowledge is used to partition the data into a number of coherent sets, where each set is explicitly described in terms of the secondary knowledge source. Instances from each set are then classified in a manner appropriate for the characteristics of the particular set. We present our methodology and outline a set of experimental results that demonstrate some advantages and some limitations of our approach.
Original language: English
Title of host publication: Mining complex data: ECML/PKDD 2007 third international workshop, MCD 2007, Warsaw, Poland, September 17-21, 2007
Subtitle of host publication: revised selected papers
Editors: Zbigniew W. Ras, Shusaku Tsumoto, Djamel Zighed
Place of Publication: Berlin (DE)
Publisher: Springer
Pages: 238-251
Number of pages: 14
ISBN (Print): 3-540-68415-8, 978-3-540-68415-2
DOIs: https://doi.org/10.1007/978-3-540-68416-9_19
Publication status: Published - 2008
Event: ECML/PKDD 2007 Third International Workshop (MCD 2007) - Warsaw, Poland
Duration: 17 Sep 2007 → 21 Sep 2007
http://eric.univ-lyon2.fr/~mcd/2007/eric.univ-lyon2.fr/_mcd/2007/index6b4e.html?page=default.php

Publication series

Name: Lecture notes in computer science
Publisher: Springer
Volume: 4944
ISSN (Print): 0302-9743

Workshop

Workshop: ECML/PKDD 2007 Third International Workshop (MCD 2007)
Country: Poland
City: Warsaw
Period: 17/09/07 → 21/09/07

Fingerprint

Decision trees
Data mining
Pediatrics
Transcription
Learning systems

Cite this

O'Sullivan, D., Elazmeh, W., Wilk, S., Farion, K., Matwin, S., Michalowski, W., & Sehatkar, M. (2008). Using secondary knowledge to support decision tree classification of retrospective clinical data. In Z. W. Ras, S. Tsumoto, & D. Zighed (Eds.), Mining complex data: ECML/PKDD 2007 third international workshop, MCD 2007, Warsaw, Poland, September 17-21, 2007: revised selected papers (pp. 238-251). (Lecture notes in computer science; Vol. 4944). Berlin (DE): Springer. https://doi.org/10.1007/978-3-540-68416-9_19
O'Sullivan, Dympna ; Elazmeh, William ; Wilk, Szymon ; Farion, Ken ; Matwin, Stan ; Michalowski, Wojtek ; Sehatkar, Moravid. / Using secondary knowledge to support decision tree classification of retrospective clinical data. Mining complex data: ECML/PKDD 2007 third international workshop, MCD 2007, Warsaw, Poland, September 17-21, 2007: revised selected papers. editor / Zbigniew W. Ras ; Shusaku Tsumoto ; Djamel Zighed. Berlin (DE) : Springer, 2008. pp. 238-251 (Lecture notes in computer science; Vol. 4944).
@inbook{24a40bb47dfd4f70ab7b6cf2e8c59eaa,
title = "Using secondary knowledge to support decision tree classification of retrospective clinical data",
abstract = "Retrospective clinical data presents many challenges for data mining and machine learning. The transcription of patient records from paper charts and subsequent manipulation of data often results in high volumes of noise as well as a loss of other important information. In addition, such datasets often fail to represent expert medical knowledge and reasoning in any explicit manner. In this research we describe applying data mining methods to retrospective clinical data to build a prediction model for asthma exacerbation severity for pediatric patients in the emergency department. Difficulties in building such a model forced us to investigate alternative strategies for analyzing and processing retrospective data. This paper describes this process together with an approach to mining retrospective clinical data by incorporating formalized external expert knowledge (secondary knowledge sources) into the classification task. This knowledge is used to partition the data into a number of coherent sets, where each set is explicitly described in terms of the secondary knowledge source. Instances from each set are then classified in a manner appropriate for the characteristics of the particular set. We present our methodology and outline a set of experimental results that demonstrate some advantages and some limitations of our approach.",
author = "Dympna O'Sullivan and William Elazmeh and Szymon Wilk and Ken Farion and Stan Matwin and Wojtek Michalowski and Moravid Sehatkar",
year = "2008",
doi = "10.1007/978-3-540-68416-9_19",
language = "English",
isbn = "3-540-68415-8",
series = "Lecture notes in computer science",
volume = "4944",
publisher = "Springer",
pages = "238--251",
editor = "Ras, {Zbigniew W.} and Shusaku Tsumoto and Djamel Zighed",
booktitle = "Mining complex data: ECML/PKDD 2007 third international workshop, MCD 2007, Warsaw, Poland, September 17-21, 2007",
address = "Berlin, Germany",
}

O'Sullivan, D, Elazmeh, W, Wilk, S, Farion, K, Matwin, S, Michalowski, W & Sehatkar, M 2008, Using secondary knowledge to support decision tree classification of retrospective clinical data. in ZW Ras, S Tsumoto & D Zighed (eds), Mining complex data: ECML/PKDD 2007 third international workshop, MCD 2007, Warsaw, Poland, September 17-21, 2007: revised selected papers. Lecture notes in computer science, vol. 4944, Springer, Berlin (DE), pp. 238-251, ECML/PKDD 2007 Third International Workshop (MCD 2007), Warsaw, Poland, 17/09/07. https://doi.org/10.1007/978-3-540-68416-9_19

Using secondary knowledge to support decision tree classification of retrospective clinical data. / O'Sullivan, Dympna; Elazmeh, William; Wilk, Szymon; Farion, Ken; Matwin, Stan; Michalowski, Wojtek; Sehatkar, Moravid.

Mining complex data: ECML/PKDD 2007 third international workshop, MCD 2007, Warsaw, Poland, September 17-21, 2007: revised selected papers. ed. / Zbigniew W. Ras; Shusaku Tsumoto; Djamel Zighed. Berlin (DE) : Springer, 2008. p. 238-251 (Lecture notes in computer science; Vol. 4944).

TY - CHAP

T1 - Using secondary knowledge to support decision tree classification of retrospective clinical data

AU - O'Sullivan, Dympna

AU - Elazmeh, William

AU - Wilk, Szymon

AU - Farion, Ken

AU - Matwin, Stan

AU - Michalowski, Wojtek

AU - Sehatkar, Moravid

PY - 2008

Y1 - 2008

N2 - Retrospective clinical data presents many challenges for data mining and machine learning. The transcription of patient records from paper charts and subsequent manipulation of data often results in high volumes of noise as well as a loss of other important information. In addition, such datasets often fail to represent expert medical knowledge and reasoning in any explicit manner. In this research we describe applying data mining methods to retrospective clinical data to build a prediction model for asthma exacerbation severity for pediatric patients in the emergency department. Difficulties in building such a model forced us to investigate alternative strategies for analyzing and processing retrospective data. This paper describes this process together with an approach to mining retrospective clinical data by incorporating formalized external expert knowledge (secondary knowledge sources) into the classification task. This knowledge is used to partition the data into a number of coherent sets, where each set is explicitly described in terms of the secondary knowledge source. Instances from each set are then classified in a manner appropriate for the characteristics of the particular set. We present our methodology and outline a set of experimental results that demonstrate some advantages and some limitations of our approach.

UR - http://www.scopus.com/inward/record.url?scp=44649141930&partnerID=8YFLogxK

UR - http://www.springerlink.com/content/4464154112l4x75h

U2 - 10.1007/978-3-540-68416-9_19

DO - 10.1007/978-3-540-68416-9_19

M3 - Other chapter contribution

AN - SCOPUS:44649141930

SN - 3-540-68415-8

SN - 978-3-540-68415-2

T3 - Lecture notes in computer science

SP - 238

EP - 251

BT - Mining complex data: ECML/PKDD 2007 third international workshop, MCD 2007, Warsaw, Poland, September 17-21, 2007

A2 - Ras, Zbigniew W.

A2 - Tsumoto, Shusaku

A2 - Zighed, Djamel

PB - Springer

CY - Berlin (DE)

ER -

O'Sullivan D, Elazmeh W, Wilk S, Farion K, Matwin S, Michalowski W et al. Using secondary knowledge to support decision tree classification of retrospective clinical data. In Ras ZW, Tsumoto S, Zighed D, editors, Mining complex data: ECML/PKDD 2007 third international workshop, MCD 2007, Warsaw, Poland, September 17-21, 2007: revised selected papers. Berlin (DE): Springer. 2008. p. 238-251. (Lecture notes in computer science; Vol. 4944). https://doi.org/10.1007/978-3-540-68416-9_19