April 30, 2021

What's in a name: How to Create an Accurate Name Matching Engine?

Are you tired of getting bad data because someone misspelled a name? We are too! Our Name Match API uses explainable AI to verify the names of people and organizations against vast databases, so that your data is correct.

Indian names, as the cliché goes, display profound unity in diversity. This diversity shows up as various additions to our names: a village name attached to the person's name, honorifics such as Srimati, Sri, or Kumar at the beginning of some names, an extra Singh here and a respectful Bhai there, among many other variations.

With the IPL craze at its peak, here is an interesting fact for you: VVS Laxman expands to Vangipurapu Venkata Sai Laxman. Ever wondered which form of his name the famous cricketer uses to open a bank account?

When names are your only unifying data point, matching similar names correctly takes on greater importance. However, their variability and complexity make name matching a uniquely challenging task. Nicknames, translation errors, multiple spellings of the same name, differences in how a name appears on different state documents, and more can all result in missed matches.

Did you know that one of the biggest impediments in achieving 100% Aadhaar-PAN linkage is the name mismatch between the Aadhaar and PAN card (Source: here)? Also, if you haven’t already noticed, the order of appearance of the first and last name on the PAN card is reversed. All of these nuances bring a nagging complexity to the name verification process.

Existing Name-Matching Models

Let’s analyse this problem in a larger context. Names of people, locations, and organizations are obscured by misspellings, aliases, nicknames, initials, and renderings in different languages. For instance, such errors typically creep in when you recite your name to a data entry operator, who notes down a phonetically identical spelling. How many times have we come across pairs like “Nutan and Nootan” or “Bharti and Bhartee”!
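To make the phonetic problem concrete, here is a minimal sketch of a phonetic key that collapses the spelling pairs above into one form. The function name `phonetic_key` and the substitution rules are assumptions chosen for illustration only; a production engine would need far richer rules.

```python
import re

# Illustrative substitution rules (an assumption, not an exhaustive list):
# long-vowel spellings are mapped to their short forms.
VOWEL_RULES = (("ee", "i"), ("oo", "u"), ("aa", "a"))


def phonetic_key(name: str) -> str:
    """Return a rough key so that similar-sounding spellings compare equal."""
    key = re.sub(r"[^a-z ]", "", name.lower())
    for pattern, replacement in VOWEL_RULES:
        key = key.replace(pattern, replacement)
    # Collapse any remaining run of a repeated character into one.
    return re.sub(r"(.)\1+", r"\1", key)


print(phonetic_key("Nootan") == phonetic_key("Nutan"))
print(phonetic_key("Bhartee") == phonetic_key("Bharti"))
```

Both comparisons come out equal because "oo" and "ee" are rewritten before the keys are compared, while genuinely different names such as Nutan and Nitin still produce different keys.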

Now let's look at how this impacts businesses. In any user verification flow, standard name matching models that triangulate Indian names across national and state ID cards such as PAN, Aadhaar, Driving Licence, and Voter ID can produce false negatives for up to 25% of users (depending on factors such as the geographical spread of the customer database).

Effectively, as many as 1 in 4 users have to go through an additional cycle of manual review.

This results in:

  • High customer drop-offs
  • Reduced customer satisfaction
  • Operational inefficiencies - increased manual verification
  • Loss of potential business to competition

Challenges with Existing Name-Matching Models

Upon analysing the performance of different matching models, we found that none of the existing solutions solves the above challenges in their entirety. Most of the issues can be summarised as:

  • Significant increase in the number of false positives and false negatives, stemming from routine algorithms written for instant results
  • Difficulty in interpreting stop words
  • Difficulty in handling data sets involving multiple keys, each formed from a different string field
  • Content-based methods do not look at co-occurrence or contextual similarities and thus fail to identify similar entities

Transparent Tuning and Customisation

An ideal algorithm should have the following characteristics:

  1. Provisions for names that sound the same
  2. Liberal rules around the order of the first name and last name
  3. Smart logic for handling honorifics, prefixes and suffixes
  4. Configurable provisions to suit the needs of specific business situations
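
As a rough illustration of characteristics 2 to 4, the sketch below strips honorifics and compares names regardless of token order, with both behaviours exposed as configuration. The names `MatchConfig`, `normalize`, and `same_name`, as well as the default honorific list, are hypothetical and not part of any HyperVerge API.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical default list; a real deployment would tune this per business.
DEFAULT_HONORIFICS = frozenset(
    {"mr", "mrs", "ms", "dr", "sri", "shri", "smt", "srimati"}
)


@dataclass(frozen=True)
class MatchConfig:
    honorifics: frozenset = DEFAULT_HONORIFICS
    ignore_order: bool = True  # treat "Gupta Bharti" like "Bharti Gupta"


def normalize(name: str, config: MatchConfig = MatchConfig()) -> List[str]:
    """Lowercase, drop punctuation and honorifics, optionally sort tokens."""
    tokens = [t.strip(".,").lower() for t in name.split()]
    tokens = [t for t in tokens if t and t not in config.honorifics]
    return sorted(tokens) if config.ignore_order else tokens


def same_name(a: str, b: str, config: MatchConfig = MatchConfig()) -> bool:
    return normalize(a, config) == normalize(b, config)


print(same_name("Dr. Bharti Gupta", "Gupta Bharti"))
print(same_name("Gupta Bharti", "Bharti Gupta", MatchConfig(ignore_order=False)))
```

With the defaults, the first comparison matches, while the second does not because strict ordering was requested. A business that treats Kumar as a prefix rather than a name could add it to `honorifics` without touching the matching logic.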

HyperVerge Name Matching API

HyperVerge’s new and improved name matching API does all of the above and a little bit more.

Akin to a chef's secret sauce, our proprietary algorithm has been calibrated using 410+ million checks. The resulting fine-tuned name matching models have increased accuracy by 15-25% for our existing clients.

HyperVerge AI solves these challenges by blending machine learning with traditional name matching techniques, such as name lists, common key, and rules, to determine a match score. This score can also consider fuzzy matches in other fields (including address and date of birth).
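
The idea of a single match score that also considers other fields can be sketched as a weighted average of per-field similarities. This is only an illustration: it swaps the machine learning component for fixed, made-up weights and uses Python's standard `difflib.SequenceMatcher` for fuzzy comparison. The field names and the `match_score` function are assumptions, not the actual API.

```python
from difflib import SequenceMatcher

# Hypothetical weights; a learned model would replace these fixed values.
WEIGHTS = {"name": 0.6, "address": 0.25, "dob": 0.15}


def similarity(a: str, b: str) -> float:
    """Fuzzy similarity in [0, 1] between two strings."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def match_score(record_a: dict, record_b: dict) -> float:
    """Weighted average similarity over the fields present in both records."""
    total = 0.0
    weight_sum = 0.0
    for field, weight in WEIGHTS.items():
        a, b = record_a.get(field), record_b.get(field)
        if not a or not b:
            continue  # skip fields missing on either side
        total += weight * similarity(a, b)
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


aadhaar = {"name": "Bhartee Gupta", "dob": "1990-04-30"}
pan = {"name": "Bharti Gupta", "dob": "1990-04-30"}
print(round(match_score(aadhaar, pan), 2))
```

Because the matching date of birth contributes to the score, the pair scores higher than the names alone would, which is the point of considering additional fields.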

We use a best-in-class phonetic algorithm that uses vowels to identify phonetic similarities between two names, such as Bhartee Gupta and Bharti Gupta. Our solution also handles challenges such as titles and honorifics (Dr., Ms., etc.), missing name components (Jean-Claude Van Damme vs Van Damme), and truncated or out-of-order name components. We follow a different strategy for each of these challenges.
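
One possible strategy for missing or truncated name components is to check whether every token of the shorter name can be found in the longer one, allowing a single letter to stand in for a full component. The sketch below is a simplified assumption of how such a check could look; `is_component_subset` is a hypothetical helper, not the proprietary logic.

```python
from typing import List


def tokens(name: str) -> List[str]:
    """Lowercased name components with surrounding punctuation removed."""
    return [t.strip(".,").lower() for t in name.split() if t.strip(".,")]


def is_component_subset(partial: str, full: str) -> bool:
    """True if each component of `partial` maps to a distinct one in `full`.

    Full components are matched exactly first; single-letter components
    are then matched as initials against whatever is left.
    """
    remaining = tokens(full)
    initials = []
    for tok in tokens(partial):
        if tok in remaining:
            remaining.remove(tok)
        elif len(tok) == 1:
            initials.append(tok)
        else:
            return False
    for initial in initials:
        hit = next((r for r in remaining if r.startswith(initial)), None)
        if hit is None:
            return False
        remaining.remove(hit)
    return True


print(is_component_subset("V. V. S. Laxman", "Vangipurapu Venkata Sai Laxman"))
print(is_component_subset("Van Damme", "Claude Van Damme"))
```

Both calls report a match. A name with a component that does not appear in the longer form, such as "Sai Kumar" against the full Laxman name, is rejected.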

Business Impact and Concluding Thoughts

We have seen accuracy improvements of up to 15%, depending on the datasets that clients provide us.

In addition to performing name (field) matching at large scale, our design for the name-matching API also met our clients' goals for both speed and accuracy.

As banal as it may sound, no single name-matching method can address all the nuances found in textual data. Most methods will accomplish 80 percent of a design solution; it is the final 20 percent that requires experience and ingenuity. With HyperVerge, you can experience a best-in-class onboarding process.

HyperVerge has enabled large organizations to safely authenticate and/or onboard millions of users over the past decade with minimal onboarding effort and turnaround time while ensuring protection against any fraudulent activity.

Large customers in telecom (Reliance Jio, Vodafone, etc.), lending (Aditya Birla Capital, L&T Financial, EarlySalary, etc.), securities (ICICI Securities, Angel Broking, Groww, etc.), payments (Razorpay), e-commerce (Swiggy) and other industries trust HyperVerge's onboarding solutions to safely onboard their users.

To speak to one of our solution experts, reach out to us at contact@hyperverge.co.
