Here at Black Oak Analytics, we enjoy nothing more than a good discussion on the configurability of our HiPER software, and how it can match data without traditional attributes such as name and address. We enjoy talking about this because so many entity resolution engines and master data management solutions are restricted to matching on traditional attributes. Without name and address or a unique identifier like a customer number, bank account number, or Social Security number, many entity resolution engines cannot accurately match data.
HiPER (High Performance Entity Resolution System), on the other hand, uses data integration techniques that match based upon your business use case, and as markets change and business goals shift, so can your matching strategy. Many people think that false positive and false negative matches in their data aren’t significant, which might have been true to an extent a couple of decades ago when they were only dealing with a few thousand records. However, today is the era of Big Data, when data is growing exponentially, not additively, every year.
As data volumes grow and organizations add more data sources to their internal IT processes, entity resolution on Big Data becomes more and more difficult for organizations to handle on their own, and most prepackaged master data management and entity resolution software systems have restrictive matching strategies. These are the reasons why we enjoy a good discussion about HiPER’s configurable matching. Regardless of the type of data you need matched, HiPER can do the trick as recently demonstrated by one of our data science interns.
Data Integration on Unstructured Big Data Sets
In our update last spring, we announced that we had been chosen by the South Big Data Regional Innovation Hub (sponsored by the National Science Foundation) as a host company for their Southern Startup Internship Program in Data Science.
“The program is designed to facilitate connections between graduate students in data science related fields and entrepreneurial firms in the South, helping to strengthen academic-industrial connections and enhancing the entrepreneurial culture in the South,” according to the Southern Startup Internship Program.
Programs like these help bridge the gap between academic study and practical industry application. Aziz Eram was chosen to be the recipient of the internship, and she worked on matching Big Data over the summer. Guess what she worked on? Eram used HiPER to build entities using unstructured, non-PII fields from large volume data sets.
Under the guidance of Chief Science Officer Dr. John R. Talburt and supervised by Director of Analytics Steve Sample, Eram was originally given two main data sets: lender name data sourced from public information, and credit card transaction data from a major U.S. credit card issuer. Using specialized match comparators, she worked to create entities from large volumes of unstructured data.
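For readers curious what matching on a non-PII field like a lender name can look like, here is a minimal sketch in Python. It is not HiPER's API or Eram's actual method: the suffix list, the token-overlap (Jaccard) comparator, the 0.6 threshold, and the sample lender names are all assumptions made up for illustration.

```python
import re
from itertools import combinations

# Hypothetical list of corporate suffixes to ignore; illustrative only.
STOP_TOKENS = {"inc", "llc", "corp", "co", "na", "the"}

def tokens(name):
    """Lowercase the name, drop periods, and remove common corporate suffixes."""
    cleaned = name.lower().replace(".", "")
    words = re.findall(r"[a-z0-9]+", cleaned)
    return frozenset(w for w in words if w not in STOP_TOKENS)

def jaccard(a, b):
    """Token-set Jaccard similarity in [0, 1]; two empty sets score 0."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def resolve(records, threshold=0.6):
    """Group records into entities: link every pair whose similarity meets
    the threshold, then take the transitive closure with union-find."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    toks = [tokens(r) for r in records]
    for i, j in combinations(range(len(records)), 2):
        if jaccard(toks[i], toks[j]) >= threshold:
            parent[find(j)] = find(i)

    clusters = {}
    for i, rec in enumerate(records):
        clusters.setdefault(find(i), []).append(rec)
    return list(clusters.values())

# Made-up sample data, not taken from the internship data sets.
records = [
    "First National Bank, N.A.",
    "FIRST NATIONAL BANK",
    "Oak Street Mortgage LLC",
    "Oak Street Mortgage Co.",
    "Riverbend Credit Union",
]
entities = resolve(records)
for entity in entities:
    print(entity)
```

Comparing every pair of records, as this sketch does, is only practical for small samples. At Big Data volumes, an entity resolution system would typically narrow the candidate pairs first and use comparators tuned to the data at hand.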
Eram presented her work at the end of September for the Computing Community Consortium. Here at Black Oak Analytics, we are extremely proud of Eram’s tireless efforts because her work not only demonstrates the intersection of academic and industry research but also shows how HiPER data integration can bring ROI to your business. Click here for the full article on Eram’s work.
Black Oak, along with its HiPER software, was recently named one of the Top 100 Most Promising Big Data Solution Providers. If you or your company would like to discuss master data management or HiPER entity resolution, contact us today!