Normalizing Email Addresses for minFraud

When you send an email address as an input to the minFraud services, you can provide it either as plain text or as an MD5 hash.

If you provide the email as an MD5 hash, normalize the address before generating the hash. Otherwise, minor, inconsequential differences (such as letter case or surrounding whitespace) could cause minFraud to treat it as a different address.
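
To illustrate why this matters, the short Python sketch below hashes the same address before and after trimming and lowercasing. The helper name md5_hex and the sample address are hypothetical, and UTF-8 encoding of the input is an assumption made for this example.

```python
import hashlib


def md5_hex(address: str) -> str:
    """Return the hex MD5 digest of the address (UTF-8 encoding assumed)."""
    return hashlib.md5(address.encode("utf-8")).hexdigest()


raw = " Test@Example.com "       # stray whitespace and mixed case
cleaned = raw.strip().lower()    # "test@example.com"

# The two digests differ even though both strings name the same mailbox.
print(md5_hex(raw) == md5_hex(cleaned))  # False
```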

Our client APIs normalize the address for you if you enable sending the MD5 hash. This is the recommended approach.

Normalizing email addresses

If you are not able to use our client APIs, you can normalize an email address yourself by following the steps below.

  1. Trim whitespace from both ends of the address.
  2. Lowercase the address.
  3. Find the local part of the email (before the @) and the domain (after the @).
  4. Apply NFC normalization to the email local part.
  5. Trim whitespace from the beginning of the domain.
  6. If the domain ends with any number of periods, trim them off.
  7. Convert international domain names (IDNs) to ASCII. For example, you can do this in Java using java.net.IDN.
  8. If the domain ends with a repetition of .com (.com.com, .com.com.com, etc.), replace it with a single .com.
  9. If the domain is gmail.com with any leading digits, replace it with gmail.com (e.g., 123gmail.com is replaced with gmail.com).
  10. Check for typos in the TLD and correct them. For a complete list of typos we correct, consult the normalization code in one of our client APIs below.
  11. Check for typos in the domain name and correct them. For a complete list of typos we correct, consult the normalization code in one of our client APIs below.
  12. If the domain is a subdomain of fastmail.com or of any of the other fastmail domains, replace the email local part with the subdomain and the domain with the fastmail domain (e.g., alias@user.fastmail.com is replaced with user@fastmail.com).
  13. If the domain has an equivalent (for example, googlemail.com is equivalent to gmail.com), replace it with the equivalent. For the list of equivalent domains we use, consult the normalization code in one of our client APIs below.
  14. Remove alias parts from the local part. For addresses at the yahoo.com domain, or other domains affiliated with Yahoo, this is everything after and including the first - character, if present. For addresses with all other domains, this is everything after and including the first + character, if present.
  15. Remove periods from gmail.com local parts.
  16. Put the local part and the domain back together to form the normalized email address.
  17. Calculate the MD5 hash of the normalized address.
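
The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the code from our client APIs. The function names are hypothetical, and the lookup tables contain only placeholder entries or the single examples named in the steps above. The sketch also assumes that the address is split at the last @, that Python's built-in idna codec is an acceptable way to convert IDNs, and that the normalized address is encoded as UTF-8 before hashing.

```python
import hashlib
import re
import unicodedata

# Placeholder tables for illustration only. The complete lists are in the
# normalization code of the client APIs; these entries are assumptions or
# the single examples named in the steps above.
TYPO_TLDS = {"con": "com"}                            # hypothetical entry
TYPO_DOMAINS = {"gmai.com": "gmail.com"}              # hypothetical entry
FASTMAIL_DOMAINS = {"fastmail.com"}                   # other fastmail domains omitted
EQUIVALENT_DOMAINS = {"googlemail.com": "gmail.com"}  # example from step 13
YAHOO_DOMAINS = {"yahoo.com"}                         # affiliated domains omitted


def normalize_email(address: str) -> str:
    address = address.strip().lower()                 # steps 1-2
    # Step 3 (assumption: split at the last "@").
    local, at, domain = address.rpartition("@")
    if not at:
        return address  # no "@" found; this sketch leaves the input as-is
    local = unicodedata.normalize("NFC", local)       # step 4
    domain = domain.lstrip().rstrip(".")              # steps 5-6
    try:                                              # step 7
        domain = domain.encode("idna").decode("ascii")
    except UnicodeError:
        pass  # assumption: keep the domain unchanged if conversion fails
    domain = re.sub(r"(?:\.com){2,}$", ".com", domain)        # step 8
    domain = re.sub(r"^\d+gmail\.com$", "gmail.com", domain)  # step 9
    name, dot, tld = domain.rpartition(".")           # step 10
    if dot and tld in TYPO_TLDS:
        domain = name + "." + TYPO_TLDS[tld]
    domain = TYPO_DOMAINS.get(domain, domain)         # step 11
    subdomain, dot, parent = domain.partition(".")    # step 12
    if dot and parent in FASTMAIL_DOMAINS:
        domain = parent
        if local:
            local = subdomain
    domain = EQUIVALENT_DOMAINS.get(domain, domain)   # step 13
    separator = "-" if domain in YAHOO_DOMAINS else "+"
    local = local.split(separator, 1)[0]              # step 14
    if domain == "gmail.com":                         # step 15
        local = local.replace(".", "")
    return f"{local}@{domain}"                        # step 16


def normalized_md5(address: str) -> str:
    # Step 17 (assumption: hash the UTF-8 bytes of the normalized address).
    return hashlib.md5(normalize_email(address).encode("utf-8")).hexdigest()


print(normalize_email(" Foo.Bar+spam@GoogleMail.com. "))  # foobar@gmail.com
print(normalize_email("alias@user.fastmail.com"))         # user@fastmail.com
```

Keeping the lookup tables as module-level constants makes it straightforward to replace the placeholder entries with the full lists from one of the client APIs.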

Examples

You can review the code in our client APIs to see how to normalize an email address in various languages.