Risk factors and prediction of very short term versus short/intermediate term post-stroke mortality: A data mining approach

Jonathan F. Easton, Christopher R. Stephens, Maia Angelova*

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

37 Citations (Scopus)

Abstract

Data mining and knowledge discovery, as an approach to examining medical data, can limit some of the inherent bias in the hypothesis assumptions found in traditional clinical data analysis. In this paper we illustrate the benefits of a data-mining-inspired approach to statistically analysing a bespoke data set, the academic multicentre randomised controlled trial UK Glucose Insulin in Stroke Trial (GIST-UK), with a view to discovering new insights distinct from the original hypotheses of the trial. We consider post-stroke mortality prediction as a function of days since stroke onset, showing that the time scales that best characterise changes in mortality risk are most naturally defined by examination of the mortality curve. We show that certain risk factors differentiate between very short term and intermediate term mortality. In particular, we show that age is highly relevant for intermediate term risk but not for very short or short term mortality. We suggest that this is due to the concept of frailty. Other risk factors are highlighted across a range of variable types, including socio-demographics, past medical histories and admission medication. Using the most statistically significant risk factors, we build predictive classification models for very short term and short/intermediate term mortality.
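As an illustration of the kind of predictive classification model the abstract refers to (the keywords list Naïve Bayes analysis), the following is a minimal sketch of a Bernoulli naive Bayes classifier over binary risk factors. It is not the authors' model: the feature names (`age_over_80`, `prior_stroke`), the smoothing parameter `alpha`, and the six toy records are all hypothetical and invented for this example, not taken from GIST-UK.

```python
import math


def train_naive_bayes(rows, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model on binary (0/1) features.

    Returns {class: (prior, [P(feature_j = 1 | class), ...])}.
    `alpha` is a Laplace smoothing constant (an assumption of this sketch).
    """
    n_features = len(rows[0])
    model = {}
    for c in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == c]
        prior = len(members) / len(rows)
        # Smoothed estimate of P(feature_j = 1 | class c).
        probs = [
            (sum(r[j] for r in members) + alpha) / (len(members) + 2 * alpha)
            for j in range(n_features)
        ]
        model[c] = (prior, probs)
    return model


def predict(model, row):
    """Return the class with the highest log-posterior for one record."""
    best, best_score = None, -math.inf
    for c, (prior, probs) in model.items():
        score = math.log(prior)
        for x, p in zip(row, probs):
            score += math.log(p if x else 1.0 - p)
        if score > best_score:
            best, best_score = c, score
    return best


# Hypothetical toy records: [age_over_80, prior_stroke]. Invented data.
rows = [[1, 1], [1, 0], [1, 1], [0, 0], [0, 1], [0, 0]]
labels = ["died", "died", "died", "survived", "survived", "survived"]

model = train_naive_bayes(rows, labels)
print(predict(model, [1, 1]))
print(predict(model, [0, 0]))
```

In a setting like the one described, a separate model of this form could be fitted for each mortality time window (very short term versus short/intermediate term), each using only the risk factors found significant for that window.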

Original language: English
Pages (from-to): 199-210
Number of pages: 12
Journal: Computers in Biology and Medicine
Volume: 54
Publication status: Published - 1 Nov 2014

Bibliographical note

Publisher Copyright:
© 2014 Published by Elsevier Ltd.

Funding

We thank Prof. C.S. Gray and Prof. A.J. Hildreth for their help and useful conversations regarding the data set. J.E. and M.A. thank UNAM for its hospitality, and C.S. thanks Northumbria University for its hospitality. J.E. thanks Dr. L. Zhang for recommendations regarding decision tree analysis. This work was supported by the MATSIQEL project (European FP7 Research Grant FP7-PEOPLE-2009-IRSES-247541), by CONACyT through its Redes Temáticas program, and by DGAPA Grant IN113414. We thank the reviewers for their useful comments, which helped improve this paper.

Funder: CONACyT
Funder: DGAPA-PAPIIT, funder number: IN113414

    Keywords

    • Data mining
    • Medical relevance
    • Mortality
    • Naïve Bayes analysis
    • Prediction
    • Risk factors
    • Stroke
