The Australian National University

Welcome to the ANU Data Mining and Matching group.

We conduct research on a diverse range of topics, focusing on the broad area of data integration and especially record linkage (also known as data matching or entity resolution).

Projects we are working on:

  1.  Practical aspects of entity resolution
  2.  Privacy-preserving record linkage of multiple databases
  3.  Attack methods for multi-database privacy-preserving record linkage
  4.  Dynamic and temporal record linkage
  5.  Linking historical census and registry data
  6.  ...

A new discussion forum on data linkage, DLforum, and our demos GeCo and MERLIN are online!
Give them a try!

Our group:

  • Peter Christen works on different aspects of record linkage, including privacy-preserving record linkage, linking temporal and dynamic data, and linking historical birth, death, marriage and census records. He is also interested in active learning for record linkage, and how to properly evaluate record linkage quality.

  • Qing Wang works on data linkage, focusing on two questions: (a) how can data from different sources be linked in a meaningful way; and (b) how can data linkage be used to provide useful knowledge? She is working on the following data linkage projects: (1) studying provenance-aware entity resolution, which can leverage provenance information to improve the reliability of data linkage; (2) building dynamic repairing techniques for improving entity resolution classifiers over time; (3) developing knowledge-based representation, learning and reasoning techniques for data linkage.

  • Dinusha Vatsalan is working on privacy-preserving record linkage (PPRL) of multiple databases and on dynamic and temporal PPRL as part of the Australian Research Council (ARC) Discovery Projects DP130101801 and DP160101934. The aim of her research is to develop efficient and scalable algorithms for linking dynamic and temporal databases held by multiple parties (organizations). These algorithms should achieve high linkage quality in the presence of data errors while providing sufficient privacy guarantees, so that no sensitive information that could be used to infer the entities' private data is revealed among the parties or to any external party.

  • Jeffrey Fisher is working on collective entity resolution techniques in domains such as group linkage, population reconstruction and temporal entity resolution. The focus of his work is on finding solutions to problems that limit the use of many collective entity resolution techniques in practice, including poor scalability, difficulty of evaluation, and lack of appropriate training data.

  • Thilina Ranbaduge is working on privacy-preserving record linkage (PPRL) techniques for multiple parties. His main focus is to develop scalable techniques for efficient and effective indexing/blocking in multi-party PPRL.

