When did the reject happen?

Earlier today I approved a comment from Mike on a post about problems at AOL from 2012. The part of the comment that caught my attention:

SMTP error from remote mail server after end of data:
521 5.2.1 : AOL will not accept delivery of this message.

Mike also mentioned that his IP reputation is good when he checks at AOL, so he doesn’t understand why mail is being blocked.
I think the big clue is “after end of data”. I would look at the full content of the mail, particularly domains and URLs, to identify what is triggering the block.
In the SMTP transaction there are only a few places the ISP can stop the transaction and each spot tells us different things about why the ISP is rejecting the message.
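As a rough illustration of those stopping points, here is a minimal Python sketch that walks one transaction a step at a time and reports the first step the receiving server refuses. The function name `find_reject_stage` and the stage labels are my own, not anything an ISP publishes. It assumes `client` behaves like an unconnected `smtplib.SMTP` object, and it really does attempt delivery if every step is accepted, so treat it as a diagnostic sketch rather than a finished tool.

```python
import smtplib


def find_reject_stage(client, host, helo_name, mail_from, rcpt_to, message, port=25):
    """Walk one SMTP transaction and report the first stage that is refused.

    `client` should behave like an unconnected smtplib.SMTP instance.
    Returns (stage, code, text); stage is None when every step is accepted.
    Network failures (timeouts, dropped connections) are not handled here.
    """
    steps = [
        ("connection", lambda: client.connect(host, port)),
        ("HELO/EHLO", lambda: client.ehlo(helo_name)),
        ("MAIL FROM", lambda: client.mail(mail_from)),
        ("RCPT TO", lambda: client.rcpt(rcpt_to)),
        # smtplib sends the DATA command, the message, and the final dot here.
        ("DATA", lambda: client.data(message)),
    ]
    for stage, step in steps:
        try:
            code, text = step()
        except smtplib.SMTPResponseException as exc:
            # smtplib raises if the DATA command itself is refused.
            code, text = exc.smtp_code, exc.smtp_error
        if code >= 400:  # 4xx and 5xx replies are refusals
            return stage, code, text
    return None, code, text
```

Against a live server you would pass `smtplib.SMTP()` as `client` and call `quit()` on it afterwards. Any object with the same five methods also works, which makes it easy to replay a rejection you have already seen in a log.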

After connection

A block after connection is a block either against the IP address or against the domain in the rDNS of the IP. IPs with no rDNS or generic rDNS can also be blocked here. Blocks here do happen, but many receivers will let the SMTP transaction continue.

After HELO/EHLO

A block after HELO/EHLO is often a block against the domain in the HELO/EHLO or against a particular HELO/EHLO. Malware and bots often have distinctive HELO/EHLO patterns and it’s common for those kinds of senders to be blocked at this point.

After Mail From

A block after Mail From is often directed at the domain in the bounce string. Some receivers do check to make sure the domain has an MX and will block if it doesn’t. Blocks don’t happen here very often.

After RCPT To

Blocks here are not always spam related. Most of the delivery failures at this point have to do with non-existent addresses.

After DATA

Blocks after DATA mean the ISP has actually seen the full content of the email. If a block comes after DATA, evaluate the full content of the message, including the recipient and their permission status, when working out what is triggering the block.
Knowing when the rejection happened is an important part of understanding why a block happened. For instance, if a block happens before DATA, you know that content isn’t relevant, because the ISP never saw the content. If a block happens before Mail From, you know it’s the IP address reputation or configuration. If a block happens after DATA, you know you need to look at the whole message.
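The same reasoning can be written down as a small lookup. This is only a sketch: the stage labels and the function names `content_was_seen` and `what_to_check` are hypothetical, and the hints simply paraphrase the sections above rather than any ISP's documented policy.

```python
# What each stopping point suggests, paraphrased from the sections above.
# Insertion order matches the order of the SMTP transaction.
WHAT_TO_CHECK = {
    "connection": "IP address reputation, rDNS domain, missing or generic rDNS",
    "HELO/EHLO": "the HELO/EHLO domain or a distinctive HELO/EHLO string",
    "MAIL FROM": "the bounce domain, including whether it has an MX",
    "RCPT TO": "the recipient address, which often simply does not exist",
    "DATA": "the whole message: content, domains, URLs, recipient, permission",
}


def content_was_seen(stage):
    """Only a rejection after DATA means the receiver saw the message content."""
    return stage == "DATA"


def what_to_check(stage):
    """Return a short hint for the stage at which the rejection happened."""
    if stage not in WHAT_TO_CHECK:
        raise ValueError(f"unknown stage: {stage!r}")
    return WHAT_TO_CHECK[stage]


if __name__ == "__main__":
    for stage in WHAT_TO_CHECK:
        seen = "content seen" if content_was_seen(stage) else "content not seen"
        print(f"{stage:<11} ({seen}): {what_to_check(stage)}")
```

For Mike's bounce, the stage is DATA, so the hint points at the whole message and not at the IP reputation he already checked.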
 

Related Posts

Pattern matching primates

Why do we see faces where there are none? Pareidolia
Why do we look at random noise and see patterns? Patternicity
Why do we think we have discovered what’s causing filtering if we change one thing and email gets through?
It’s all because we’re pattern matching primates, or as Michael Shermer puts it “people believe weird things because of our evolved need to believe nonweird things.”
Our brains are amazing and complex and filter a lot of information so we don’t have to think about it. Our brains also fill in a lot of holes. We’re primed to see patterns, even when there’s no real pattern. Our brains can, and do, lie to us all the time. For me, an important part of my Ph.D. work was learning to NOT trust what I thought I saw, and rather to effectively observe and test. Testing means setting up experiments in different ways to make it easier to not draw false conclusions.
Humans are also prone to confirmation bias, where we assign more weight to things that agree with our preconceived notions.
Take the email marketer who makes a number of changes to a campaign. They change some of the recipient targeting, they add in a couple of URLs, they restructure the mail to change the text-to-image ratio and they add the word “free” to the subject line. The mail gets filtered to the bulk folder and they immediately jump to the word “free” as the proximate cause of the filtering. They changed a lot of things but they focus on the word “free”.
Then they remove the word “free” from the subject line and all of a sudden the emails are delivering. Clearly the filter in question is blocking mail with “free” in the subject line.
Well, no. Not really. Filters are bigger and more complex than any of us can really understand. I remember a couple years ago, when a few of my close friends were working at AOL on their filter team. A couple times they related stories where the filters were doing things that not even the developers really understood.
That was a good 5 or 6 years ago, and filters have only gotten more complex and more autonomous. Google uses an artificial neural network as their spam filter. I don’t really believe that anything this complex just looks at “free” in the subject line and filters based on that.
It may be that one thing used to be responsible for filtering, but those days are long gone. Modern email filters evaluate dozens or hundreds of factors. There’s rarely one thing that causes mail to go to the bulk folder. So many variables are evaluated by filters that there’s really no way to pinpoint the EXACT thing that caused a filter to trigger. In fact, it’s usually not one thing. It could be any number of things all adding up to mean this may not be mail that should go to the inbox.
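A toy model makes the point concrete. Everything in this Python sketch is invented for illustration: the signal names come from the marketer example above, but the weights, the threshold and the simple additive scoring are assumptions of mine. Real filters are far more complex and do not publish anything like this.

```python
# Hypothetical weights for the four changes in the marketer example.
# No real filter is known to use these numbers or this simple sum.
SIGNAL_WEIGHTS = {
    "new recipient targeting": 1.5,
    "two added URLs": 1.5,
    "changed text-to-image ratio": 1.5,
    "'free' in subject line": 1.0,
}
BULK_THRESHOLD = 5.0  # hypothetical cut-off for the bulk folder


def goes_to_bulk(signals):
    """Add up the weights of the signals present and compare to the threshold."""
    return sum(SIGNAL_WEIGHTS[s] for s in signals) >= BULK_THRESHOLD


def single_removals_that_deliver(signals):
    """List every single change whose removal alone gets the mail delivered."""
    return sorted(s for s in signals if not goes_to_bulk(signals - {s}))


all_changes = set(SIGNAL_WEIGHTS)

if __name__ == "__main__":
    print("bulk with all changes:", goes_to_bulk(all_changes))
    for signal in single_removals_that_deliver(all_changes):
        print("removing this alone delivers:", signal)
```

With these made-up numbers, undoing any one of the four changes gets the mail delivered. Removing “free” works, but so would removing the URLs, so the experiment says nothing about “free” being the cause.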
There are, of course, some filters that act on a single factor. Filters that listen to p=reject requests can and do discard mail that fails authentication. Virus filters will often discard mail if they detect a virus in the mail. Filters that use blocklists will discard mail simply due to a listing on the blocklist.
Those filters address the easy mail. They leave the hard decisions to the more complex filters. Most of those filters are a lot more accurate than we are at matching patterns. We pattern-matching primates want to see patterns and so we find them.
 

Politics and Delivery

Last week I posted some deliverability advice for the DNC based on their acquisition of President Obama’s 2012 campaign database. Paul asked a question on that post that I think is worth some attention.

Do system administrators have too much power?

Yesterday, Laura brought a thread from last week to my attention, and the old-school ISP admin and mail geek in me felt the need to jump up and say something in response to Paul’s comment. My text here is all my own, and is based upon personal experience as well as those of my friends. That said, I’m not speaking on their behalf, either. 🙂
I found Paul’s use of the word ‘SysAdmin’ to be a mighty wide (and — in my experience — probably incorrect) brush to be painting with, particularly when referring to operations at ISPs with any significant number of mailboxes. My fundamental opposition to use of the term comes down to this: It’s no longer 1998.
The sort of rogue (or perhaps ‘maverick’) behavior to which you refer absolutely used to be a thing, back when a clean 56k dial-up connection was the stuff of dreams and any ISP that had gone through the trouble to figure out how to get past the 64k user limit in the UNIX password file was considered both large and technically competent. Outside of a few edge cases, I don’t know many system administrators these days who are able to (whether by policy or by access controls) — much less want to — make such unilateral deliverability decisions.
While specialization may be for insects, it’s also inevitable whenever a system grows past a certain point. When I started in the field, there were entire ISPs that were one-man shows (at least on the technical side). This simply doesn’t scale. Eventually, you start breaking things up into departments, then into services, then teams assigned to services, then parts of services assigned to teams, and back up the other side of the mountain, until you end up with a whole department whose job it is to run one component of one service.
For instance, let’s take inbound (just inbound) email. It’s not uncommon for a large ISP to have several technical teams responsible for the processing of mail being sent to their users:

Read More