Email filters and small sends

Have you heard about the Baader-Meinhof effect?

The Baader-Meinhof effect, also known as frequency illusion, is the illusion in which a word, a name, or other thing that has recently come to one’s attention suddenly seems to appear with improbable frequency shortly afterwards (not to be confused with the recency illusion or selection bias). Baader–Meinhof effect at Wikipedia

There has to be a corollary for email. For instance, over the last week or so I’ve gotten an influx of questions about how to fix delivery for one-to-one email. Some have been from clients: “Oh, while we’re at it… this happened.” Others have been from groups I’m associated with: “I sent this message and it ended up in spam.”

The challenge is that what we do to fix delivery of bulk mail doesn’t really apply to one-to-one mail. The underlying theory is the same: Send mail people expect to receive and, if it gets delivered to the bulk folder, have them go fish it out. But when we are sending bulk mail we have a whole population of recipients to work with. When we’re sending one-to-one mail we only have one person to work with.

Most people don’t know what their filters do under the covers. I know we have a fairly stock install of SpamAssassin and there are some Bayesian filters built into mail.app. It’s pretty easy to ID why something was filtered by SA: it tells you. But the built-in filters are a black box. All I know is that they learn from what mail I mark as spam.

I can see the results. For instance, almost every time I do a password reset the “here’s your temporary password” message ends up in my spam folder. Doesn’t really matter what provider it is or how regularly I get mail from the vendor or anything like that. If I’m doing the password reset dance then 90% of the time I have to go dig the message out of my spam folder.

It’s possible that I could reset the filters built into mail.app and have this mail come into the inbox. But that would also mean I have to go retrain my filters by manually sorting through the 40 to 50 spams that get through SpamAssassin every day. That’s tedious and not a lot of fun, and there’s no guarantee that the filters won’t re-learn that password reset style messages are spam more often than not.
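To make that learning behaviour a little more concrete, here is a toy sketch of a per-user Bayesian filter. To be clear, this is not how mail.app or SpamAssassin actually work internally (the built-in filters really are a black box to me). The class name, the training messages and the smoothing choice are all made up for illustration. It only shows the general idea: if the spam I mark tends to talk about passwords and resets, a legitimate reset message starts to look like spam too.

```python
import math
from collections import Counter


class ToyBayesFilter:
    """Hypothetical naive Bayes filter trained only on one user's spam/ham marks."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        # label is "spam" or "ham", i.e. what the user marked the message as
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def spam_probability(self, text):
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_score = {}
        for label in ("spam", "ham"):
            # Class prior, with add-one smoothing (an illustrative choice)
            score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                count = self.word_counts[label][word]
                # Smoothed per-word likelihood so unseen words don't zero things out
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_score[label] = score
        # Turn the two log scores into a probability of spam
        diff = log_score["ham"] - log_score["spam"]
        diff = max(min(diff, 700.0), -700.0)  # keep math.exp from overflowing
        return 1.0 / (1.0 + math.exp(diff))


# Made-up training data standing in for mail one user has sorted by hand
toy_filter = ToyBayesFilter()
for message in (
    "your account password has expired click here to reset",
    "temporary password inside verify your account now",
    "reset your password now to avoid suspension",
):
    toy_filter.train(message, "spam")
for message in (
    "lunch on thursday works for me",
    "notes from the deliverability call attached",
    "can you review the draft before friday",
):
    toy_filter.train(message, "ham")

# A legitimate reset message shares its vocabulary with the phishing-style spam
reset_score = toy_filter.spam_probability("here is your temporary password")
lunch_score = toy_filter.spam_probability("can we move lunch to friday")
print(f"password reset: {reset_score:.2f}, lunch note: {lunch_score:.2f}")
```

With this made-up training set the reset message scores as more likely spam than not, and the lunch note doesn't. Resetting the filter throws away both sets of counts, which is why retraining means sorting mail by hand again.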

“Why did this actual, real, one-to-one message go to spam?” is a question we can almost never answer. Sure, sometimes a domain reputation is bad enough or the message is in the wrong language or there’s something blindingly obvious with the content that makes it clear why the message went to spam. But those cases are not as common as we may like. Sometimes the filters just decide this mail should be delivered there.

The point here is that a lot of what we do for deliverability works for bulk, because we’re managing populations and statistics and are sending enough mail that we can move the needle on machine learning. When we’re sending very small quantities of mail, we’re relying on individual users knowing why their mail is going to bulk. Almost no one does. It’s just gotten way too complex.

Which leaves us in a position where email is unreliable for some forms of communication. I don’t think this is a permanent status. I think we’re in a period of filtering changes where folks are trying lots of things to see what works. A decade ago it was whitelists and blacklists and FBLs and paid certification. Now, it’s machine learning and recipient behaviour and individualised inbox experiences.

Filters are continually adapting to spammers. Spammers are continually adapting to filters. This competition is driving rapid evolution on both sides. It’s like punctuated equilibrium for email.

Related Posts

Content based filtering

Content filtering is often hard to explain to people, and I’m not sure I’ve yet come up with a good way to explain it.
A lot of people think content reputation is about specific words in the message. The traditional content explanation is that words like “Free” or too many exclamation points in the subject line are bad and will be filtered. But it’s not the words that are the issue; it’s that the words are often found in spam. These days filters are a lot smarter than that. They don’t just look at individual words, they look at the overall context of the message.
Even when we’re talking content filters, the content is just a way to identify mail that might cause problems. Those problems are evaluated the same way IP reputation is measured: complaints, engagement, bad addresses. But there’s a lot more to content filtering than just the engagement piece. What else is part of content evaluation?


Politics and Delivery

Last week I posted some deliverability advice for the DNC based on their acquisition of President Obama’s 2012 campaign database. Paul asked a question on that post that I think is worth some attention.
