Email without filters

… or Find the False Positive.
Anyone sending a lot of email has complained about spam filters and false positives at some point. But most people haven’t run a mailbox with no spam filters in front of it in recent years, so don’t have much of a feel for what an unfiltered mailbox looks like, how important filters are and how difficult their job is.
I run no transaction level filters in front of my mailbox, just content filters that route mail to one of several inboxes or a junk folder, so if I want to I can look at what unfiltered email looks like. I took data from all mail that was sent to me yesterday, and put it in a format that really shows the problem filters face and especially the difficulty of spotting which mail in the junk folder is a false positive.
An inbox with no filters looks like this.

Running a spam filter against it, simply categorizing each email as spam (pink) or not-spam (green) looks like this.
 

Even with the messages categorized as spam vs not-spam it’s hard to work out which messages are important and which aren’t, let alone where the false positives might be.
If I sort the categories by hand you get this – where you can see that out of 1200 or so mails about three quarters were spam. Of the three false positives two were bulk email that I didn’t care that I didn’t receive and only one was email that I considered important.
 
 

Related Posts

Turn it all the way up to 11

I made that joke the other night and most of the folks who heard it didn’t get the reference. It made me feel just a little bit old.
Anyhow, Mickey beat me to it and posted much of what I was going to say about Ken Magill’s response to a very small quote from Neil’s guest post on expiring email headers last week.
I, too, was at that meeting, and at many other meetings where marketers and the folks that run the ISP spam filters end up in the same room. I don’t think the marketers always understand what is happening inside the postmaster and filtering desks on a day to day basis at the ISPs. Legitimate marketing? It’s a small fraction of the mail they deal with. Ken claims that marketing pays the salaries of these employees and they’d be out of a job if marketing didn’t exist. Possibly, but only in the context that they are paid to keep their employers servers up and running so that the giant promises made by the marketing team of faster downloads and better online experiences actually happen.
If there wasn’t an internet and there weren’t servers to maintain, they’d have good jobs elsewhere. They’d be building trains or designing buildings or any of the thousands of other jobs that require smart technical people.
Ken has no idea what these folks running the filters and keeping your email alive deal with on a regular basis. They deal with the utter dregs and horrors of society. They are the people dealing with unrelenting spam and virus and phishing attacks bad enough to threaten to take down their networks and the networks of everyone else. They also end up dealing with law enforcement to deal with criminals. Some of what they do is deal with is unspeakable, abuse and mistreatment of children and animals. These are the folks who stand in front of the rest of us, and make the world better for all of us.
They should be thanked for doing their job, not chastised because they’re doing what the people who pay them expect them to be doing.
Yes, recipients want the mail they want. But, y’know, I bet they really don’t want all the bad stuff that the ISPs protect against. Ken took offense at a statement that he really shouldn’t have. ISPs do check their false positive rates on filtering, and those rates are generally less than 1% of all the email that they filter. Marketers should be glad they’re such a small part of the problem. They really don’t want to be a bigger part.

Read More

Gmail and the PBL

Yesterday I wrote about the underlying philosophy of spam filtering and how different places have different philosophies that drive their filtering decisions. That post was actually triggered by a blog post I read where the author was asking why Gmail was using the PBL but instead of rejecting mail from PBL listed hosts they instead accepted and bulkfoldered the mail.
The blog post ends with a question:

Read More

Why do ISPs do that?

One of the most common things I hear is “but why does the ISP do it that way?” The generic answer for that question is: because it works for them and meets their needs. Anyone designing a mail system has to implement some sort of spam filtering and will have to accept the potential for lost mail. Even the those recipients who runs no software filtering may lose mail. Their spamfilter is the delete key and sometimes they’ll delete a real mail.
Every mailserver admin, whether managing a MTA for a corporation, an ISP or themselves inevitably looks at the question of false positives and false negatives. Some are more sensitive to false negatives and would rather block real mail than have to wade through a mailbox full of spam. Others are more sensitive to false positives and would rather deal with unfiltered spam than risk losing mail.
At the ISPs, many of these decisions aren’t made by one person, but the decisions are driven by the business philosophy, requirements and technology. The different consumer ISPs have different philosophies and these show in their spamfiltering.
Gmail, for instance, has a lot of faith in their ability to sort, classify and rank text. This is, after all, what Google does. Therefore, they accept most of the email delivered to Gmail users and then sort after the fact. This fits their technology, their available resources and their business philosophy. They leave as much filtering at the enduser level as they can.
Yahoo, on the other hand, chooses to filter mail at the MTA. While their spamfoldering algorithms are good, they don’t want to waste CPU and filtering effort on mail that they think may be spam. So, they choose to block heavily at the edge, going so far as to rate limit senders that they don’t know about the mail. Endusers are protected from malicious mail and senders have the ability to retry mail until it is accepted.
The same types of entries could be written about Hotmail or AOL. They could even be written about the various spam filter vendors and blocklists. Every company has their own way of doing things and their way reflects their underlying business philosophy.

Read More