What happened on June 16, 2021, and how we make sure it never happens again

June 27, 2021 · written by SimpleLogin team

On June 16, 2021, we experienced significant delays in email delivery, which caused several issues. We are deeply sorry for this incident. Here is what happened, what our investigation found, and how we will prevent it from happening again.

What happened

On that date, we received emails from multiple users alerting us to delays of several hours in email delivery. The same issue was also raised in the SimpleLogin community on Reddit and on Twitter.

Investigation

In some past incidents, simple operations such as increasing server capacity or restarting all email-related programs were enough. This time, however, the server stats showed that the email delay was not related to server capacity.

We noticed that the Postfix queue was abnormally full and that flushing it did not empty it fast enough.

The logs contained many 421 Retry later SMTP responses, which suggested an issue with the rate-limiting algorithm. The logs also indicated that an abnormal number of emails were arriving for a single account.
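As an illustration of how a single overloaded recipient can be spotted, here is a minimal sketch that counts queued messages per recipient. It is not the tooling we used during the incident. It assumes a Postfix version that supports `postqueue -j` (one JSON object per line) and that each object carries a `recipients` list with an `address` field; the function name `count_queued_by_recipient` is hypothetical.

```python
import json
from collections import Counter


def count_queued_by_recipient(postqueue_json_lines):
    """Count queued messages per recipient address.

    Expects an iterable of lines as printed by `postqueue -j`
    (assumed format: one JSON object per queued message).
    """
    counts = Counter()
    for line in postqueue_json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        message = json.loads(line)
        # Assumed field names: "recipients" -> list of {"address": ...}
        for recipient in message.get("recipients", []):
            counts[recipient["address"]] += 1
    return counts


# Illustrative data only, not taken from the real incident.
sample = [
    '{"queue_id": "A1", "recipients": [{"address": "x@example.com"}]}',
    '{"queue_id": "A2", "recipients": [{"address": "x@example.com"},'
    ' {"address": "y@example.com"}]}',
]
top = count_queued_by_recipient(sample).most_common(1)
```

On a live server, the input lines could come from something like `subprocess.run(["postqueue", "-j"], capture_output=True, text=True).stdout.splitlines()`; an address with a count far above the others is a candidate for closer inspection.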

For context, we have our own rate-limiting algorithm that prevents too many emails from being forwarded to a single mailbox, which could lead to the SimpleLogin server being blocked by the mailbox service. At that time, we were using a “soft” implementation: when the limit was reached, it simply delayed delivery by returning a 421 status code to Postfix and leaving the retry to Postfix. This implementation ensures that all emails are eventually forwarded.
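The “soft” behaviour described above can be sketched roughly as follows. This is a simplified illustration, not our production code: the class `MailboxRateLimiter`, the function `handle_email`, the sliding-window approach, and the limit values are all assumptions made for the example.

```python
import time
from collections import defaultdict, deque


class MailboxRateLimiter:
    """Sliding-window limiter: at most `max_emails` per mailbox per window."""

    def __init__(self, max_emails, window_seconds, clock=time.monotonic):
        self.max_emails = max_emails
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self._recent = defaultdict(deque)  # mailbox -> forward timestamps

    def allow(self, mailbox):
        now = self.clock()
        timestamps = self._recent[mailbox]
        # Drop timestamps that have fallen out of the window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_emails:
            return False
        timestamps.append(now)
        return True


def handle_email(limiter, mailbox):
    """Return the SMTP response given to Postfix for one incoming email."""
    if limiter.allow(mailbox):
        return "250 Message accepted for delivery"
    # "Soft" limit: a 4xx reply is temporary, so Postfix keeps the
    # message in its queue and retries later instead of bouncing it.
    return "421 Retry later"
```

Because the 421 reply is temporary, every rate-limited message stays in the Postfix queue until a later retry succeeds. Nothing is lost, but a mailbox that keeps receiving more than its limit keeps adding messages to that queue.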

This implementation turned out to be harmful in this situation: the Postfix queue filled up and, as a consequence, other users’ messages were delayed.

We immediately alerted the user in question and disabled the aliases that were receiving too many emails. After that, the Postfix queue quickly emptied and email delivery became instant again.

How to make sure this issue never happens again

This incident has taught us several important lessons.

Conclusion

We are grateful for the support from our community during this incident. Although we have more and more automatic monitoring, please send us an email at hi[at]simplelogin.io, or tag our Twitter or Reddit account, whenever you notice anything abnormal.