2018/02/26 by Willem Stam.
Case study – Why typo domains can cause severe email delivery issues
In our daily email delivery monitoring activities, we recently saw the damage that typo domains can do. In this case study, simple typos in domains caused severe email delivery issues. We will recap what happened and how we solved the issue. Hopefully, other senders can avoid this kind of situation in the future.
Postmastery, we have a problem. No emails are going out!
Recently, one of our clients, a medium-sized French ESP, contacted us because email messages weren’t going out. After looking at the client’s server and our Delivery Analytics dashboard, we saw right away that all the connection slots were full and queues were growing. The client’s PowerMTA was unable to open new connections. When we listed the active SMTP connections, we saw that many of them went to a single MX and that all of those were stalled.
Email addresses ‘hijacked’ by domain typos
The MX server was pretty intriguing. Its operator had created MX records pointing to it for lots of typos of .fr domains (e.g. homail.fr, hormail.fr, otmail.fr, hoymail.fr, livr.fr). When the server received an email sent to a typo address, it sent a message to the assumed intended recipient (for example, after replacing homail.fr with hotmail.fr) offering to forward the original message in exchange for the recipient’s approval to receive promotional messages. This is quite illegal.
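Senders can catch many of these addresses before they ever reach the queue. The following is a minimal sketch, not the method we used in this case: it flags a domain as a probable typo when it is within one edit of a known provider domain. The KNOWN_DOMAINS list, the function names, and the distance threshold are all assumptions made for illustration.

```python
from typing import Optional

# Hypothetical list of legitimate mailbox-provider domains; extend as needed.
KNOWN_DOMAINS = ["hotmail.fr", "live.fr", "orange.fr", "gmail.com"]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one row at a time."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def likely_intended_domain(domain: str, max_distance: int = 1) -> Optional[str]:
    """Return the known domain this one is probably a typo of, or None."""
    domain = domain.lower()
    if domain in KNOWN_DOMAINS:
        return None  # exact match, so not a typo
    for known in KNOWN_DOMAINS:
        if edit_distance(domain, known) <= max_distance:
            return known
    return None


if __name__ == "__main__":
    for candidate in ["homail.fr", "hoymail.fr", "livr.fr", "hotmail.fr"]:
        print(candidate, "->", likely_intended_domain(candidate))
```

A check like this is best used to flag addresses for review or correction at signup time; a threshold of one edit keeps false positives low but will miss longer misspellings.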
Root cause: slow MX server
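The root cause of the issue was the poor performance of this MX server. Our client was sending campaigns for a new customer whose database contained a lot of misspelled addresses on the typo domains. Every one of those messages needed a connection to the same slow MX, the connections stalled instead of completing, and over time they occupied all available connection slots. Without any connection slots left, PowerMTA got stuck.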
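A rough back-of-the-envelope calculation shows why a single slow MX can exhaust the whole connection pool. The sketch below applies Little's law (average concurrency = arrival rate × time in system). All numbers are illustrative assumptions, not measurements from this incident, and it simplifies by assuming one message per connection.

```python
def average_open_connections(messages_per_second: float,
                             seconds_per_connection: float) -> float:
    """Little's law: average concurrency = arrival rate * time each item stays."""
    return messages_per_second * seconds_per_connection


# All numbers below are illustrative assumptions, not measurements.
MAX_CONNECTIONS = 1000      # hypothetical total number of connection slots
RATE_TO_TYPO_MX = 20.0      # hypothetical messages per second routed to the slow MX

# A responsive MX finishes each connection quickly; a stalled one holds it
# until a timeout expires.
healthy = average_open_connections(RATE_TO_TYPO_MX, seconds_per_connection=2.0)
stalled = average_open_connections(RATE_TO_TYPO_MX, seconds_per_connection=300.0)

print(f"responsive MX: ~{healthy:.0f} connections open")
print(f"stalled MX:    ~{stalled:.0f} connections wanted, {MAX_CONNECTIONS} available")
```

With the same sending rate, the only thing that changes is how long each connection is held, and that alone is enough to push demand past the limit and starve every other destination.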
Our fix: use the roll up feature in PowerMTA or bypass the MX
To solve this issue, we used the roll up feature in PowerMTA to group all domains for this MX in the same queues.
<mx-rollup-list>
    (...)
    mx MX1.[name].com [name].rollup
</mx-rollup-list>
Then we limited the number of connections to this MX to 1 per queue:
<domain [name].rollup>
    max-smtp-out 1
</domain>
This fixed the problem.
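To illustrate the idea behind the roll up, here is a small Python sketch of the grouping logic. It is a conceptual model only, not how PowerMTA is implemented: the host names, the MX_BY_DOMAIN mapping (which in practice would come from DNS lookups), the rule table, and the default limit of 20 are all hypothetical.

```python
from collections import defaultdict

# Hypothetical domain -> MX host mapping; in practice this comes from DNS lookups.
MX_BY_DOMAIN = {
    "homail.fr": "mx1.typo-host.example",
    "hormail.fr": "mx1.typo-host.example",
    "otmail.fr": "mx1.typo-host.example",
    "hotmail.fr": "mx.real-provider.example",
}

# Hypothetical rollup rules: MX host -> (queue name, max concurrent connections).
ROLLUP_RULES = {"mx1.typo-host.example": ("typo-host.rollup", 1)}
DEFAULT_MAX_CONNECTIONS = 20  # hypothetical per-domain default


def build_queues(domains):
    """Group domains into queues: by MX when a rollup rule exists, else per domain."""
    queues = defaultdict(
        lambda: {"domains": [], "max_connections": DEFAULT_MAX_CONNECTIONS}
    )
    for domain in domains:
        mx = MX_BY_DOMAIN.get(domain)
        if mx in ROLLUP_RULES:
            name, limit = ROLLUP_RULES[mx]
        else:
            name, limit = domain, DEFAULT_MAX_CONNECTIONS
        queues[name]["domains"].append(domain)
        queues[name]["max_connections"] = limit
    return dict(queues)


if __name__ == "__main__":
    result = build_queues(["homail.fr", "hormail.fr", "otmail.fr", "hotmail.fr"])
    for name, queue in result.items():
        print(name, queue)
```

Without the rule, each typo domain would get its own queue and its own connection allowance, all aimed at the same slow server. With the rule, the three typo domains share one queue with a single connection, so the slow MX can no longer consume more than that.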
But we were still wondering whether we should go a bit further. Would it be better (and more accurate) to bypass this MX and simply report messages sent to these ‘hijacked’ typo domains to our customers as bounces (categorized as bad-domain)?
In this case, the config in PowerMTA would be something like this:
<domain [name].rollup>
    reroute-to-virtual-mta bad-domain-discard
    backoff-reroute-to-virtual-mta bad-domain-discard
</domain>

<virtual-mta bad-domain-discard>
    <domain *>
        type discard
        discard-as-bounce true
    </domain>
</virtual-mta>

<bounce-category-pattern>
    ...
    /x-pmta;delivered to discard queue .*bad-domain-discard/ bad-domain
</bounce-category-pattern>
Conclusion: make sure both email analytics and expertise are in place
If you want to debug sudden severe email delivery issues, make sure you have a clear real-time view of what is happening – email delivery analytics tooling is a must-have. Further, make sure you are equipped with the right level of knowledge to analyze the issue efficiently and effectively.
For more information on how Postmastery can help, just send us a message via our website.