TY - JOUR
T1 - Differential average diversity
T2 - An efficient privacy mechanism for electronic health records
AU - Moqurrab, Syed Atif
AU - Anjum, Adeel
AU - Manzoor, Umar
AU - Nefti, Samia
AU - Ahmad, Naveed
AU - Malik, Saif Ur Rehman
PY - 2017/10
Y1 - 2017/10
N2 - Electronic Health Record (EHR) is used to measure the incremental growth of different medical conditions. The said data can also be utilized for various research purposes, such as clinical trials or epidemic control strategies. Along with the advantages, there lies a fear in publishing such data publicly, as it puts the privacy of the individuals at stake. Therefore, the question that arises is "How to publish such data that is secure and useful?" After years of research, the aforesaid question is still an open issue. To achieve the best combination of privacy and utility, several privacy definitions have been proposed. Due to the sensitivity of medical data, privacy is of utmost importance. On the other hand, if we lose the utility of medical data by applying privacy approaches, then it may lead to wrong predictions. In the said perspective, we propose a simple and computationally achievable semantic hybrid privacy definition, referred to as Range Random Sampling + Differential Average Diversity (DAD), which promises to deliver high data utility. To demonstrate the effectiveness of our proposed algorithm, we performed experimental analysis on two different datasets: (a) Hepatitis and (b) US Census Bureau. The experiments reveal that our proposed hybrid framework achieves better utility rates while preserving the privacy of the data.
AB - Electronic Health Record (EHR) is used to measure the incremental growth of different medical conditions. The said data can also be utilized for various research purposes, such as clinical trials or epidemic control strategies. Along with the advantages, there lies a fear in publishing such data publicly, as it puts the privacy of the individuals at stake. Therefore, the question that arises is "How to publish such data that is secure and useful?" After years of research, the aforesaid question is still an open issue. To achieve the best combination of privacy and utility, several privacy definitions have been proposed. Due to the sensitivity of medical data, privacy is of utmost importance. On the other hand, if we lose the utility of medical data by applying privacy approaches, then it may lead to wrong predictions. In the said perspective, we propose a simple and computationally achievable semantic hybrid privacy definition, referred to as Range Random Sampling + Differential Average Diversity (DAD), which promises to deliver high data utility. To demonstrate the effectiveness of our proposed algorithm, we performed experimental analysis on two different datasets: (a) Hepatitis and (b) US Census Bureau. The experiments reveal that our proposed hybrid framework achieves better utility rates while preserving the privacy of the data.
KW - Anonymity
KW - Classification
KW - Data Utility
KW - Electronic Health Record
KW - Privacy
KW - Semantic Privacy
UR - http://www.scopus.com/inward/record.url?scp=85030650830&partnerID=8YFLogxK
UR - https://www.ingentaconnect.com/content/asp/jmihi/2017/00000007/00000006/art00009
U2 - 10.1166/jmihi.2017.2146
DO - 10.1166/jmihi.2017.2146
M3 - Article
AN - SCOPUS:85030650830
SN - 2156-7018
VL - 7
SP - 1177
EP - 1187
JO - Journal of Medical Imaging and Health Informatics
JF - Journal of Medical Imaging and Health Informatics
IS - 6
ER -