Normalizing Email Addresses for minFraud
When providing an email address as an input to the minFraud services, you can provide it either as plain text or as an MD5 hash.
If you provide the email as an MD5 hash, it’s important that you normalize it before generating the hash. Otherwise minor, inconsequential differences could cause minFraud to consider it a different address.
Our client APIs do this for you if you enable sending the MD5 hash. This is the recommended way to do this.
Normalizing email addresses
If you are not able to use our client APIs, you can normalize an email address yourself. Below are the steps to take to do this.
- Trim whitespace from both ends of the address.
- Lowercase the address.
- Find the local part of the email (before the
@
) and the domain (after the@
). - Apply NFC normalization to the email local part.
- Trim whitespace from the beginning of the domain.
- If the domain ends with any number of periods, trim them off.
- Convert international domain names (IDNs) to ASCII. For example, you can do this in Java using java.net.IDN.
- If the domain ends with a repetition of
.com
(.com.com
,.com.com.com
, etc.), replace with a single.com
. - If the domain is
gmail.com
with any leading digits, it is replaced withgmail.com
(i.e.,123gmail.com
is replaced withgmail.com
). - Check for typos in the TLD and correct them. For a complete list of typos we correct, consult the normalization code in one of our client APIs below.
- Check for typos in the domain name and correct them. For a complete list of typos we correct, consult the normalization code in one of our client APIs below.
- If the domain is
fastmail.com
or any of the fastmail domains, replace the email local part with the subdomain (i.e.,alias@user.fastmail.com
is replaced withuser@fastmail.com
). - If the domain has an equivalent, such as
googlemail.com
togmail.com
, replace it with the equivalent. For the list of equivalent domains we use, consult the normalization code in one of our client APIs below. - Remove alias parts from the local part. For addresses at the
yahoo.com
domain, or other domains affiliated with Yahoo, this is everything after and including the first-
character, if present. For addresses with all other domains, this is everything after and including the first+
character, if present. - Remove periods from
gmail.com
local parts. - Put the local part and the domain back together to form the normalized email address.
- Calculate the MD5 hash.
Examples
You can review the code in our client APIs see how to normalize an email address in various languages.