Email filters and small sends

Have you heard about the Baader-Meinhoff effect?

The Baader-Meinhof effect, also known as frequency illusion, is the illusion in which a word, a name, or other thing that has recently come to one’s attention suddenly seems to appear with improbable frequency shortly afterwards (not to be confused with the recency illusion or selection bias). Baader–Meinhof effect at Wikipedia

There has to be an corollary for email. For instance, over the last week or so I’ve gotten an influx of questions about how to fix delivery for one to one email. Some have been from clients “Oh, while we’re at it… this happened.” Others have been from groups I’m associated with “I sent this message and it ended up in spam.”

The challenge is, what we do to fix delivery of bulk mail doesn’t really apply to one to one mail. The underlying theory is the same: Send mail people expect to receive and if it gets delivered to the bulk folder have them go fish it out. But when we are sending bulk mail we have a whole population of recipients to work with. When we’re sending one to one mail we only have one person to work with.

Most people don’t know what their filters do under the covers.I know we have a fairly stock install of SpamAssassin and there are some bayesian filters built into mail.app. It’s pretty easy to ID why something was filtered by SA, it tells you. But the built in filters are a black box. All I know is that they learn from what mail I mark as spam.

I can see the results. For instance, almost every time I do a password reset the “here’s your temporary password” message ends up in my spam folder. Doesn’t really matter what provider it is or how regularly I get mail from the vendor or anything like that. If I’m doing the password reset dance then 90% of the time I have to go dig the message out of my spam folder.

It’s possible that I could reset the filters built into mail.app and have this mail come into the inbox. But that will also mean I have to go retrain my filters by manually sorting through the 40 – 50 spams that get through spam assassin every day. That’s tedious and not a lot of fun and there’s no guarantee that the filters won’t re-learn that password reset style messages are spam more often than not.

“Why did this actual, real, one-to-one message go to spam?” is a question we can almost never answer. Sure, sometimes a domain reputation is bad enough or the message is in the wrong language or there’s something blindingly obvious with the content that makes it clear why the message went to spam. But those cases are not as common as we may like. Sometimes the filters just decide this mail should be delivered there.

The point here is that a lot of what we do for deliverability works for bulk, because we’re managing populations and statistics and are sending enough mail we can move the needle on machine learning. When we’re sending very small quantities of mail, then we’re relying on individual users knowing why their mail is going to bulk. Almost no one does, it’s just gotten way too complex.

Which leaves us in a position where email is unreliable for some forms of communication. I don’t think this is a permanent status. I think we’re in a period of filtering changes where folks are trying lots of things to see what works. A decade ago it was whitelists and blacklists and FBLs and paid certification. Now, it’s machine learning and recipient behaviour and individualised inbox experiences.

Filters are continually adapting to spammers. Spammers are continually adapting to filters. This competition is driving rapid evolution on both sides. It’s like punctuated equilibrium for email.

Related Posts

Do system administrators have too much power?

Yesterday, Laura brought a thread from last week to my attention, and the old-school ISP admin and mail geek in me felt the need to jump up and say something in response to Paul’s comment. My text here is all my own, and is based upon personal experience as well as those of my friends. That said, I’m not speaking on their behalf, either. 🙂
I found Paul’s use of the word ‘SysAdmin’ to be a mighty wide (and — in my experience — probably incorrect) brush to be painting with, particularly when referring to operations at ISPs with any significant number of mailboxes. My fundamental opposition to use of the term comes down to this: It’s no longer 1998.
The sort of rogue (or perhaps ‘maverick’) behavior to which you refer absolutely used to be a thing, back when a clean 56k dial-up connection was the stuff of dreams and any ISP that had gone through the trouble to figure out how to get past the 64k user limit in the UNIX password file was considered both large and technically competent. Outside of a few edge cases, I don’t know many system administrators these days who are able to (whether by policy or by access controls) — much less want to — make such unilateral deliverability decisions.
While specialization may be for insects, it’s also inevitable whenever a system grows past a certain point. When I started in the field, there were entire ISPs that were one-man shows (at least on the technical side). This simply doesn’t scale. Eventually, you start breaking things up into departments, then into services, then teams assigned to services, then parts of services assigned to teams, and back up the other side of the mountain, until you end up with a whole department whose job it is to run one component of one service.
For instance, let’s take inbound (just inbound) email. It’s not uncommon for a large ISP to have several technical teams responsible for the processing of mail being sent to their users:

Read More

Troubleshooting delivery is hard, but doable

Even for those of us who’ve been around for a while, and who have a lot of experience troubleshooting delivery problems things are getting harder. It used to be we could identify some thing about an email and if that thing was removed then the email would get to the inbox. Often this was a domain or a URL in the message that was triggering bulk foldering.
Filters aren’t so simple now. And we can’t just randomly send a list of URLs to a test account and discover which URL is causing the problem. Sure, one of the URLs could be the issue, but that’s typically in context with other things. It’s rare that I can identify the bad URLs sending mail through my own server these days.
There are also a lot more “hey, help” questions on some of the deliverability mailing lists. Most of these questions are sticky problems that don’t map well onto IP or domain reputation.
One of my long term clients recently had a bad mail that caused some warnings at Gmail.
We tried a couple of different things to try and isolate the problem, but never could discover what was triggering the warnings. Even more importantly, we weren’t getting the same results for identical tests done hours apart. After about 3 days, all the warnings went away and all their mail was back in the inbox.
It seemed that one mailing was really bad and resulted in a bad reputation, temporarily. But as the client fixed the problem and kept mailing their reputation recovered.
Deliverability troubleshooting is complicated and this flowchart sums up what it’s like.

Here at Word to the Wise, we get a lot of clients who have gone through the troubleshooting available through their ESPs and sometimes even other deliverability consultants. We get the tough cases that aren’t easy to figure out.
What we do is start from the beginning. First thing is to confirm that there aren’t technical problems, and generally we’ll find some minor problems that should be fixed, but aren’t enough to cause delivery problems. Then we look at the client’s data. How do they collect it? How do they maintain it? What are they doing that allows false addresses on their list?
Once we have a feel for their data processes, we move on to how do we fix those processes. What can we do to collect better, cleaner data in the future? How can we improve their processes so all their recipients tell the ISP that this is wanted mail?
The challenging part is what to do with existing data, but we work with clients individually to make sure that bad addresses are expunged and good addresses are kept.
Our solutions aren’t simple. They’re not easy. But for clients who listen to us and implement our recommendations it’s worth it. Their mail gets into the inbox and deliverability becomes a solved problem.

Read More

Updating the filtering model

One thing I really like about going to conferences is they’re often one of the few times I get to sit and think about the bigger email picture. Hearing other people talk about their marketing experiences, their email experiences, and their blocking experiences usually triggers big picture style thoughts.
Earlier this week I was at Activate18, hosted by Iterable. The sessions I attended were interesting and insightful. Of course, I went to the deliverability session. While listening to the presentation, I realized my previous model of email filtering needed to be updated.

Read More