Partial Entity Resolution

The other day I had a conversation about product master data, and one of the participants, almost as an aside, mentioned a concept of a “virtual product.” More specifically, he was referring to an operational context in which a maintenance team needed to look for a type of a part to be used to replace a existing worn machine part. The curious aspect of this was that they were not looking for a specific part. Rather, they needed to describe the characteristics of the part and then see which available parts match those characteristics. If none were available, they’d either need to create a new one or search other suppliers for a matching part.

Read more

Variant Approaches to Identity Resolution and Record Matching

January 18, 2011 by · 1 Comment
Filed under: Data Analysis, Identity Resolution 

I had a set of discussions recently from representatives of different business functions and found an interesting phenomenon: although folks from almost every area of the business indicated a need for some degree of identity resolution and matching, there were different requirements, expectations, processes, and even tools/techniques in place. In some cases it seems that the matching algorithms each uses refers to different data elements, uses different scoring weights, different thresholds, and different processes for manual review of questionable matches. Altogether the result is inconsistency in matching precision.

And it is reasonable for different business functions to have different levels of precision for matching. You don’t need as strict a set of scoring thresholds for matching individuals for the purpose of marketing as you might for assuring customer privacy. But when different tools and methods are used, there is bound to be duplicative work in implementing and managing the different matching processes and rules.

To address this, it might be worth considering whether the existing approaches serve the organization in the most appropriate way. This involves performing at least these steps:

1) Document the current state of matching/identity resolution
2) Profile the data sets to determine the best data attributes for matching
3) Document each business process’s matching requirements
4) Evaluate the existing solutions and determine that the current situation is acceptable or that there is an opportunity to select one specific approach that can be used as a standard across the organization