Using secondary knowledge to support decision tree classification of retrospective clinical data

Dympna O'Sullivan, William Elazmeh, Szymon Wilk, Ken Farion, Stan Matwin, Wojtek Michalowski, Moravid Sehatkar

Research output: Chapter in Book/Published conference output › Other chapter contribution


Retrospective clinical data presents many challenges for data mining and machine learning. The transcription of patient records from paper charts and the subsequent manipulation of data often result in high volumes of noise as well as a loss of other important information. In addition, such datasets often fail to represent expert medical knowledge and reasoning in any explicit manner. We describe applying data mining methods to retrospective clinical data to build a prediction model for asthma exacerbation severity for pediatric patients in the emergency department. Difficulties in building such a model forced us to investigate alternative strategies for analyzing and processing retrospective data. This paper describes that process, together with an approach to mining retrospective clinical data that incorporates formalized external expert knowledge (secondary knowledge sources) into the classification task. This knowledge is used to partition the data into a number of coherent sets, where each set is explicitly described in terms of the secondary knowledge source. Instances from each set are then classified in a manner appropriate for the characteristics of that set. We present our methodology and outline a set of experimental results that demonstrate some advantages and some limitations of our approach.
Original language: English
Title of host publication: Mining complex data: ECML/PKDD 2007 third international workshop, MCD 2007, Warsaw, Poland, September 17-21, 2007
Subtitle of host publication: revised selected papers
Editors: Zbigniew W. Ras, Shusaku Tsumoto, Djamel Zighed
Place of Publication: Berlin (DE)
Number of pages: 14
ISBN (Print): 3-540-68415-8, 978-3-540-68415-2
Publication status: Published - 2008
Event: ECML/PKDD 2007 Third International Workshop (MCD 2007) - Warsaw, Poland
Duration: 17 Sept 2007 → 21 Sept 2007

Publication series

Name: Lecture notes in computer science
ISSN (Print): 0302-9743


Workshop: ECML/PKDD 2007 Third International Workshop (MCD 2007)

