The Australian National University

Privacy-Preserving Similar Patient Matching

Dinusha Vatsalan and Peter Christen

ANU Research School of Computer Science

This research is funded by the Australian Research Council under Discovery Project DP130101801.

Download: PPSPM Software.zip (0.6 MB)

Abstract:

The identification of similar entities represented by records in different databases has drawn considerable attention in many application areas, including the health domain. One important entity matching application, vital for quality healthcare analytics, is the identification of similar patients, known as similar patient matching. A key component of identifying similar records is calculating the similarity between the attribute (field) values of these records. Because of growing privacy and confidentiality concerns, using the actual attribute values of patient records to identify similar records across different organizations is increasingly problematic: such attributes often contain highly sensitive information, such as personal and medical details of patients. The matching therefore needs to be based on masked (encoded) values, while remaining effective and efficient enough to scale to large databases. Privacy-preserving similar patient matching (PPSPM) aims to address this problem by developing novel Bloom filter-based matching techniques for clinical data.
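To make the idea of matching on masked values concrete, the following is a minimal Python sketch of a commonly used Bloom filter encoding for string attributes. It is an illustration under stated assumptions, not the code shipped in the PPSPM software: the function names, the bigram tokenisation, the double-hashing scheme, and the parameter values (1000 bits, 20 hash functions) are all hypothetical choices made for this example.

```python
import hashlib


def qgrams(value, q=2):
    """Split a string into its set of overlapping q-grams (bigrams by default)."""
    text = value.lower()
    return {text[i:i + q] for i in range(len(text) - q + 1)}


def bloom_encode(tokens, num_bits=1000, num_hashes=20):
    """Return the set of bit positions set to 1 for the given tokens.

    Assumed scheme (not taken from the source): double hashing,
    position_i = (h1 + i * h2) mod num_bits.
    """
    positions = set()
    for token in tokens:
        data = token.encode("utf-8")
        h1 = int.from_bytes(hashlib.sha1(data).digest(), "big")
        h2 = int.from_bytes(hashlib.sha256(data).digest(), "big")
        for i in range(num_hashes):
            positions.add((h1 + i * h2) % num_bits)
    return positions


def dice_similarity(bits_a, bits_b):
    """Dice coefficient of two Bloom filters given as sets of 1-bit positions."""
    if not bits_a and not bits_b:
        return 1.0
    return 2 * len(bits_a & bits_b) / (len(bits_a) + len(bits_b))


def masked_similarity(value_a, value_b):
    """Compare two strings using only their Bloom filter encodings."""
    return dice_similarity(bloom_encode(qgrams(value_a)),
                           bloom_encode(qgrams(value_b)))
```

With this sketch, `masked_similarity("christen", "christine")` should score well above `masked_similarity("christen", "vatsalan")`, because the first pair shares most of its bigrams and therefore most of its bit positions. A real deployment would typically use keyed hash functions with a secret shared between the participating organizations; that step is omitted here.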



Highlights of PPSPM:

  • Presents a framework for privacy-preserving similar patient matching (PPSPM)
  • Novel Bloom filter-based masking and matching techniques for numerical data
  • Matching accuracy on Bloom filter-encoded data is comparable to that on unencoded data
  • Comprehensive empirical study of the proposed framework on real datasets
  • Results show the efficacy of our framework for PPSPM
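As a rough illustration of how numerical values might be masked so that nearby numbers still look similar, the sketch below hashes a value together with its neighbouring grid points into a Bloom filter. This is an assumption-laden example and not necessarily the scheme used in the paper or the PPSPM software: the names `neighbour_tokens` and `numeric_similarity`, the `step` and `num_neighbours` parameters, and the hashing details are all hypothetical.

```python
import hashlib


def neighbour_tokens(value, step=1.0, num_neighbours=5):
    """Map a number to the grid indices of itself and its neighbours.

    The value is snapped to a grid of width `step`; the token set holds the
    indices of the num_neighbours grid points on either side of it.
    """
    centre = round(value / step)
    return {str(centre + k) for k in range(-num_neighbours, num_neighbours + 1)}


def bloom_encode(tokens, num_bits=1000, num_hashes=10):
    """Return the set of bit positions set to 1 (assumed double-hashing scheme)."""
    positions = set()
    for token in tokens:
        data = token.encode("utf-8")
        h1 = int.from_bytes(hashlib.sha1(data).digest(), "big")
        h2 = int.from_bytes(hashlib.sha256(data).digest(), "big")
        for i in range(num_hashes):
            positions.add((h1 + i * h2) % num_bits)
    return positions


def dice_similarity(bits_a, bits_b):
    """Dice coefficient of two Bloom filters given as sets of 1-bit positions."""
    if not bits_a and not bits_b:
        return 1.0
    return 2 * len(bits_a & bits_b) / (len(bits_a) + len(bits_b))


def numeric_similarity(a, b, step=1.0, num_neighbours=5):
    """Compare two numbers using only their Bloom filter encodings."""
    return dice_similarity(
        bloom_encode(neighbour_tokens(a, step, num_neighbours)),
        bloom_encode(neighbour_tokens(b, step, num_neighbours)),
    )
```

In this sketch, two hypothetical readings such as 70 and 72 share most of their neighbour tokens and so most of their bit positions, whereas 70 and 90 share no tokens and overlap only through chance hash collisions. The `step` and `num_neighbours` settings control how quickly the similarity falls off as the values move apart.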


Publications:

Our initial work on PPSPM has been published in the Journal of Biomedical Informatics (JBI 2016, Volume 59, Pages 285-298), doi:10.1016/j.jbi.2015.12.004


Software:

The Python source code for the PPSPM software and the test datasets are available for download: PPSPM Software.zip (0.6 MB)


Updated: 18 June 2020 / Responsible Officer: Head of School / Page Contact: DMM Webmaster