2019 Challenges in Patient Record Matching (and Entity Resolution)

GAO says there are problems with patient record match quality. We’ve been saying this for a while.

Here at Black Oak Analytics, we are interested in all things related to entity and identity resolution across a variety of industries. We recently came across a Pew article from January 15, 2019, “GAO Highlights Need to Better Match Patient Records”. Based on a report from the Government Accountability Office (GAO), the article says healthcare organizations need to improve matching by standardizing demographic data or by figuring out how biometric data can be used in the matching process.

We’ve been writing about the power of Big Data and matching since as far back as 2015, and we’ve made sure to stay in touch with healthcare matching trends over the years as well. That has included talking about patient care, critical components of master data management in electronic health records, measuring your match quality, issues with healthcare data quality, and more.

And we’re glad more organizations are talking about this now.

If you do enough research into patient record matching or entity resolution, you will find that one of the few case studies with hard numbers is by Just Associates, which identified a 45% duplication rate at the Texas Children’s Medical Center. The problem with this study is that while Just Associates has updated the publication date to 2015, the study itself took place in 2003. Other than this isolated study, which everyone who has researched this subject in healthcare has likely stumbled across at one point or another, it is difficult to find hard numbers around this problem. That is not surprising: if healthcare organizations have these problems, they represent a serious liability and risk to patients and to the organizations themselves.

Data quality is a big issue in healthcare, but unfortunately talking about it publicly is also admitting that it’s true. Healthcare organizations and leaders cannot afford for patients to doubt the care they are receiving, or the data the care is based on.

After the 21st Century Cures Act of 2016 went into effect, the Government Accountability Office (GAO) began producing reports specifically on reducing matching errors in patient records. Some organizations report match rates in EHRs as high as 98%, but others report match rates as low as 50% between organizations that share the same EHR vendor. This might mean a doctor isn’t looking at a patient’s complete medical history or is even using someone else’s records. Which is a problem.

Proposed solutions.

The Pew Charitable Trusts makes some recommendations for organizations looking to improve their match rates. Quickly, let’s address the two primary recommendations presented by Pew, based on GAO’s reports, in the January 2019 article:

  1. Standardize demographic data such as addresses used to match patient records.
  2. Long term, use unstructured data such as biometrics and other technologies to match data in a safe and private way.

Standardizing demographic data is a significantly more difficult task than it might sound. Healthcare organizations draw data from patient registration forms, insurance company records, clinic records, and many other types of data sources. The increased efforts of states to stand up Health Information Exchanges only add to the dirtiness of the data being (supposedly) brought together. Standardizing that data without information loss requires not only computing power, but also significant manual review and labor.
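To give a feel for what even the simplest standardization step involves, here is a minimal sketch in Python. It is not taken from the GAO or Pew material, and the abbreviation table and the `standardize_address` function are hypothetical names we made up for illustration. A real deployment would follow a full postal standard and would keep the raw value alongside the standardized one so that no information is lost.

```python
import re

# Hypothetical, deliberately tiny abbreviation table (an assumption for this
# sketch); a production system would use a complete postal standard.
STREET_ABBREVIATIONS = {
    "street": "st",
    "avenue": "ave",
    "road": "rd",
    "drive": "dr",
    "boulevard": "blvd",
    "apartment": "apt",
    "suite": "ste",
}


def standardize_address(raw: str) -> str:
    """Lowercase, drop punctuation, collapse whitespace, abbreviate common words."""
    text = raw.lower()
    text = re.sub(r"[^\w\s]", " ", text)  # every punctuation mark becomes a space
    tokens = text.split()                 # split() also collapses repeated spaces
    return " ".join(STREET_ABBREVIATIONS.get(tok, tok) for tok in tokens)


# Two spellings of the same address, as they might arrive from different sources.
from_registration = "123 Main Street, Apt. 4"
from_insurer = "123 MAIN ST APT 4"

# Keep the raw value next to the standardized one to avoid information loss.
record = {"raw": from_registration, "standardized": standardize_address(from_registration)}
same_address = record["standardized"] == standardize_address(from_insurer)
```

Even this toy version shows why manual review is still needed: any spelling, abbreviation, or typo that is not in the table passes through unchanged and the two records fail to line up.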

There is an entire industry around consumer data, but is that industry mature enough to provide accurate data to healthcare organizations, and can they afford it? The answer to both is probably no. The existing consumer data industry is designed for consumer data platforms and predominantly fuels marketing campaigns, which do not require nearly the level of quality or precision that healthcare does.

The second strategy also has some pitfalls beyond being a long-term solution. Biometric data is safe and unique to each patient, but are fingerprints and facial scans the inputs for the next wave of entity resolution for healthcare? The challenge is that requiring biometrics such as facial recognition and fingerprints carries a heavy cost of implementation and universal adoption from a hardware and infrastructure standpoint.

A real, immediately available solution.

A better entity resolution solution is likely a combination of the two options. Bringing in demographic data to match on is a viable strategy, but with more match attributes than just name and address. Matching on additional medical and other support attributes allows a higher degree of match confidence in the resulting data, at the cost of increased complexity and more pairwise comparisons. Secondary match attributes often include unstructured and semi-structured data, and the resulting match groups and complexity require increasing computational power. The problem is that the number of pairwise comparisons (comparing two records) on Big Data does not grow additively with the number of records but quadratically.

Consider the example of comparing a gender value of “M” or “F” across records. Comparing 1,000 records in your organization with a matching algorithm requires a computer to make 1,000*(999)/2 = 499,500 comparisons of two fields to see if both are “M” or “F”. However, when comparing 1 million records, 1,000,000*(999,999)/2 = 499,999,500,000 pairwise comparisons are needed for that single field alone.
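The arithmetic is easy to check for yourself. The short Python sketch below is our own illustration (the `pair_count` function name is made up for this post). It computes n*(n-1)/2 for a few record counts and confirms the formula against a brute-force enumeration on a tiny sample.

```python
from itertools import combinations


def pair_count(n: int) -> int:
    """Number of unordered record pairs among n records: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Brute-force check on a tiny sample of gender values.
sample = ["M", "F", "M", "F", "M"]
brute_force_pairs = list(combinations(sample, 2))
assert len(brute_force_pairs) == pair_count(len(sample))

# How the workload grows with the number of records.
for n in (1_000, 10_000, 1_000_000):
    print(f"{n:>9,} records -> {pair_count(n):>15,} pairwise comparisons")
```

Going from 1,000 to 1,000,000 records multiplies the record count by 1,000 but the comparison count by roughly 1,000,000, and that is for one attribute only.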

And doing these comparisons on semi-structured data requires comparing patterns in the alphanumeric strings, further increasing the complexity. That means that even organizations without Big Data are still being faced with Big Data problems.
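As a rough illustration of what comparing patterns in alphanumeric strings can look like, here is a minimal Python sketch. It is an assumption-laden example and not a description of how any particular matching product works: the identifier format and the `pattern_mask`, `normalize_id`, and `ids_match` names are all hypothetical.

```python
def pattern_mask(value: str) -> str:
    """Reduce a string to its shape: letters become 'A', digits become '9'."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)  # keep separators so formats stay distinguishable
    return "".join(out)


def normalize_id(value: str) -> str:
    """Uppercase and drop everything that is not a letter or a digit."""
    return "".join(ch for ch in value.upper() if ch.isalnum())


def ids_match(a: str, b: str) -> bool:
    """Treat two identifiers as equal if they agree after normalization."""
    return normalize_id(a) == normalize_id(b)


# The same hypothetical record number as two source systems might store it.
left, right = "MRN-00123", "mrn00123"

masks = (pattern_mask(left), pattern_mask(right))  # the two formats differ
same_identifier = ids_match(left, right)           # the underlying value does not
```

Every one of these extra steps runs once per pair of records, which is why semi-structured attributes add so much to the cost of the comparisons described above.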

At Black Oak Analytics, we use our HiPER system to provide our clients with the highest possible match quality at prices under those of traditional matching systems, while guaranteeing a faster implementation than internal systems and results backed by experts in matching and entity resolution. HiPER is Big Data enabled and able to conquer complex match problems across multiple attributes, including structured, semi-structured, and unstructured data sets.

To find out more about your company’s data and to have a discussion about our HiPER software and what it can do for you, contact us at Black Oak Analytics (info@blackoakanalytics.com) today.
