How Spamfilters Work

AllSpammedUp has a post describing the primary techniques anti-spam filters use to identify mail as spam or not spam. While this is not sender- or delivery-focused knowledge, it is important for people sending mail to have a basic understanding of filtering mechanisms. Without that base knowledge, it’s difficult to troubleshoot problems and resolve issues.

Any anti-spam system that is worth using will contain a range of preventative measures and features that are used to determine whether an email is likely to be spam or not.  As a complete solution they can be very effective, but taken individually and their weaknesses become more apparent. […] when you combine a number of different techniques into a single system, with each technique applying a “likelihood” score to each email that is checked, the system can be quite effective.
For example, if an email is from an IP address that is not considered a likely spam source (no score increase), but contains spam-like content (score increased according to severity), and fails sender verification (increases score again), the combined “likelihood” score may reach the configured threshold for the system and cause the email to be treated as spam.

This is the concept I try to convey by using my bucket metaphor.
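
The additive scoring described in the quote can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular filter works: the check names, the weights and the threshold of 5.0 are all made up for the example.

```python
from dataclasses import dataclass

# Hypothetical threshold, chosen only for illustration.
SPAM_THRESHOLD = 5.0


@dataclass
class Message:
    source_ip_suspicious: bool  # result of an IP reputation lookup
    content_severity: float     # 0.0 (clean) to 5.0 (very spam-like)
    sender_verified: bool       # result of a sender verification check


# Each check looks at one signal and returns the points it adds.
# The point values are assumptions, not taken from a real filter.
def ip_check(msg: Message) -> float:
    return 3.0 if msg.source_ip_suspicious else 0.0


def content_check(msg: Message) -> float:
    return msg.content_severity


def sender_check(msg: Message) -> float:
    return 0.0 if msg.sender_verified else 2.5


CHECKS = (ip_check, content_check, sender_check)


def spam_score(msg: Message) -> float:
    """Add up the points from every individual check."""
    return sum(check(msg) for check in CHECKS)


def is_spam(msg: Message, threshold: float = SPAM_THRESHOLD) -> bool:
    """Treat the message as spam once the combined score reaches the threshold."""
    return spam_score(msg) >= threshold


# The case from the quote: clean IP, spam-like content, failed verification.
example = Message(source_ip_suspicious=False, content_severity=3.0, sender_verified=False)
print(spam_score(example), is_spam(example))
```

With these invented weights, no single check is enough to reach the threshold, but the content and sender verification checks together are.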

Related Posts

Poor delivery is not always about spam

There are days I think we have trained people too well to believe every delivery problem is a misplaced spam block. We also have people trained to expect near 100% immediate delivery from send to inbox.
The problem is, email isn’t 100% reliable. It’s close. Very close. But sometimes mail just fails. It’s not because the ISPs hate you. It’s sometimes not even because the mail looks like spam.
Sometimes Mail Just Fails.
One of the challenges of working in email delivery is knowing enough to be able to separate out the random delivery failures from real delivery issues.

Who is Julia and why won't she leave me alone?

There seems to be some new spam software in use. Julia <random last name> keeps telling me about her new webcam, how much she wants to date me and wants to know when I want to visit. These spams started February 1. I’ve had 179 caught by my MUA filters, and 152 caught by SpamAssassin (messages with an SA score above 7 are filtered to a special account).
This is exactly the type of pattern that causes people to write filters that years later people look at and ask why someone thought this was a reasonable marker for spam.
The good folks over at MailChimp have examined some of the scoring rules that their clients trigger. They found some “Julia” type markers. Some oddities they reported on:

Reputation: part 2

Yesterday, I posted about reputation as a combination of measurable statistics, like bounce rates and complaint rates and spamtrap hits. But some mailers who meet those reputation numbers are still seeing some delivery problems. When they ask ISPs like AOL why their mail is being put into the bulk folder or blocked, they are told that the issue is their reputation. This leads to confusion on the part of those senders because, to them, their reputation is fine. Their numbers are exactly where they were a few weeks ago when their delivery was fine.
What appears to have changed is how reputation is being calculated. AOL has actually been hinting for a while that they are looking at reputation, and even published a best practices document back in April. Based on what people are saying, some of that change has started to become sender visible.
We know that AOL and other ISPs look at engagement, and that they can actually measure engagement a lot more accurately than senders can. Senders rely on clicks and image loading to determine if a user opened an email. ISPs, particularly those who manage the email interface, can measure the user actively opening the email.
We also know that ISPs measure clicks. Not just “this is spam” or “this is not spam” clicks in the interface, but they know when a link in an email has been clicked as well.
I expect that both these measures are now a more formal and important part of the AOL reputation magic.
In addition to the clicks, I would speculate that AOL is now also looking at the number of dead addresses on a list. It is even possible they are doing something tricky like looking at the number of people who have a particular from address in their address book.
All ISPs know what percentage of a list is delivered to inactive accounts. After a long enough period of time of inactivity, mail to those accounts will be rejected. However for some period of time the accounts will be accepting mail. Sending a lot of mail to a lot of dead accounts is a sign of a mailer who is not paying attention to recipient engagement.
All ISPs with bulk folders have to know how many people have the from address in their address book. Otherwise, the mail would get delivered incorrectly. In this way, ISPs can monitor the “generic” recipient’s view of the email. Think of it as similar to hitting the “this is not spam” button preemptively.
This change in reputation at the ISPs is going to force senders to change how they think of reputation, too. Reputation is no longer all about complaints; it is about sending engaging and relevant email. The ISPs are now measuring engagement. They are measuring relevancy. They are measuring better than many senders are.
Senders cannot continue to accrete addresses on lists and continue sending email into the empty hole of an abandoned account while not taking a hit on their reputation. That empty hole is starting to hurt reputation much more than it helps reputation.
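
On the sender side, one way to act on this is to stop mailing addresses that have shown no engagement for a long time. The sketch below is a minimal illustration under stated assumptions: the function name, the shape of the input data and the 180-day window are hypothetical, and no ISP publishes a number like this.

```python
from datetime import date, timedelta
from typing import Optional


def split_by_engagement(
    last_engaged: dict[str, Optional[date]],
    today: date,
    max_idle_days: int = 180,  # assumed window, not an ISP-published value
) -> tuple[list[str], list[str]]:
    """Split a list into recently engaged addresses and idle ones.

    `last_engaged` maps each address to the date of its most recent
    open or click, or None if the address has never engaged.
    """
    cutoff = today - timedelta(days=max_idle_days)
    active, idle = [], []
    for address, last_seen in last_engaged.items():
        if last_seen is not None and last_seen >= cutoff:
            active.append(address)
        else:
            idle.append(address)
    return active, idle


# Made-up subscriber data for illustration.
subscribers = {
    "reader@example.com": date(2024, 5, 20),
    "lapsed@example.com": date(2023, 1, 3),
    "never@example.com": None,
}
active, idle = split_by_engagement(subscribers, today=date(2024, 6, 1))
print("keep mailing:", active)
print("suppress or re-engage:", idle)
```

Whether idle addresses are dropped outright or moved to a re-engagement campaign is a policy decision for the sender; the point is only that they stop receiving regular mail.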
