Linking cohort-based data with electronic health records: a proof-of-concept methodological study in Hong Kong

Le Gao, Miriam T. Y. Leung, Xue Li, Celine S. L. Chui, Rosa S. M. Wong, Shiu Lun Au Yeung, Edward W. W. Chan, Adrienne Y. L. Chan, Esther W. Chan, Wilfred H. S. Wong, Tatia M. C. Lee, Nirmala Rao, Yun Kwok Wing, Terry Y. S. Lum, Gabriel M. Leung, Patrick Ip, Ian C. K. Wong

Research output: Contribution to journal › Article › peer-review

Abstract

Objectives: Data linkage of cohort-based data and electronic health records (EHRs) has been practised in many countries, but in Hong Kong there is still a lack of such research. To expand the use of multisource data, we aimed to identify a feasible way of linking two cohorts with EHRs in Hong Kong.

Methods: Participants in the ‘Children of 1997’ birth cohort and the Chinese Early Development Instrument (CEDI) cohort were separated into several batches. The Hong Kong Identity Card Numbers (HKIDs) of each batch were then uploaded to the Hong Kong Clinical Data Analysis and Reporting System (CDARS) to retrieve EHRs. Because CDARS does not return HKIDs, the retrieved records were matched exactly on date of birth and sex, a combination that was unique to each participant within a batch. For mismatched cases, the raw data collected for the two cohorts were checked. After the matching, we conducted a simple descriptive analysis of attention deficit hyperactivity disorder (ADHD) information collected in the CEDI cohort via the Strengths and Weaknesses of ADHD Symptoms and Normal Behaviour Scale (SWAN) and EHRs.
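
The batching and matching idea described in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how batches were constructed, so the greedy assignment below is an assumption, and the field names (`id`, `dob`, `sex`, `diagnosis`) and the functions `assign_batches` and `link_records` are hypothetical. The returned records are represented as plain dictionaries and no CDARS interface is modelled.

```python
def assign_batches(participants):
    """Greedily place each participant in the first batch where the
    (date of birth, sex) pair is not already taken (assumed strategy)."""
    batches = []  # each batch maps (dob, sex) -> participant id
    for p in participants:
        key = (p["dob"], p["sex"])
        for batch in batches:
            if key not in batch:
                batch[key] = p["id"]
                break
        else:
            # Every existing batch already holds this key: open a new batch.
            batches.append({key: p["id"]})
    return batches


def link_records(batch, returned_records):
    """Exact-match returned records (which carry no identifier) back to
    participants in one batch using (dob, sex)."""
    linked, unmatched = {}, []
    for rec in returned_records:
        key = (rec["dob"], rec["sex"])
        if key in batch:
            linked.setdefault(batch[key], []).append(rec)
        else:
            unmatched.append(rec)  # set aside for manual checking
    return linked, unmatched


# Hypothetical example data: P1 and P2 share date of birth and sex,
# so they cannot be submitted in the same batch.
participants = [
    {"id": "P1", "dob": "1997-04-01", "sex": "F"},
    {"id": "P2", "dob": "1997-04-01", "sex": "F"},
    {"id": "P3", "dob": "1997-04-01", "sex": "M"},
]
batches = assign_batches(participants)

returned = [
    {"dob": "1997-04-01", "sex": "M", "diagnosis": "ADHD"},
    {"dob": "1990-01-01", "sex": "F", "diagnosis": "other"},
]
linked, unmatched = link_records(batches[0], returned)
```

In this sketch, records that fail the exact match are collected separately, which corresponds loosely to the mismatched cases that were checked against the raw cohort data.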

Results: In total, 3473 HKIDs in the birth cohort and 910 HKIDs in the CEDI cohort were separated into 44 and 5 batches, respectively, and then submitted to CDARS; 100% and 97% of them, respectively, were valid HKIDs. After the cohort data were checked, the match rates were confirmed to be 100% and 99.75%. In our illustration using the ADHD information in the CEDI cohort, 36 (4.47%) individuals had an ADHD–Combined score above the clinical cut-off in the SWAN survey, and 68 (8.31%) individuals had ADHD records in EHRs.

Conclusions: Using date of birth and sex as identifiable variables, we were able to link the cohort data and EHRs with high match rates. This method will assist in the generation of databases for future multidisciplinary research using both cohort data and EHRs.
Original language: English
Pages (from-to): e045868
Number of pages: 7
Journal: BMJ Open
Volume: 11
Issue number: 6
Early online date: 22 Jun 2021
Publication status: Published - 15 Feb 2022
