Censoring email

It seems some mail to Apple’s iCloud has been caught in filters. Apparently, a few months ago someone sent a script to an iCloud user that contained the phrase “barely legal teen” and Apple’s filters ate it.
The amount of hysteria that I’ve seen in some places about this, though, seems excessive. One of my favorite quotes was from MacWorld, and it tells me that many of the people reporting on filtering have no idea how filters really work.

And it’s not as if there’s a lack of good, free email providers with years of spam-blocking experience: Google, Yahoo, and Microsoft all spring immediately to mind. And—as far as we know, anyway—those services aren’t “helpfully” blocking any emails without telling their users.

“As far as [you] know” isn’t very far, actually. These services block email all the time and normally don’t tell users about it. Hotmail is notorious for accepting email and then just silently dropping it on the floor. Yahoo doesn’t usually drop mail after it’s been accepted, but is very picky about what mail it accepts. About the only company mentioned that accepts everything is Gmail. And even then I know Gmail does, very rarely, block at the IP level.
Filters are complex and filters are extensive. I hate it when filters are responsible for losing legitimate mail, but it happens. I’m pretty sure, though, that outside of people testing for it, the phrase “barely legal teen” is a filter phrase that has an extremely low false positive rate.
That’s the crux of what’s useful in filters: how much bad mail does a filter stop, and how much good mail does it let through? If a particular filter catches lots of spam, and blocks only a tiny bit of real mail, it will be a useful filter. If it doesn’t catch much spam but also doesn’t block much real mail, it might be a useful filter. If it catches too much real mail, it’s not a useful filter.
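To make that tradeoff concrete, here is a minimal Python sketch that measures a single phrase filter against a hand-labeled sample. Everything in it is an assumption for illustration: the `Message` type, the `evaluate_phrase_filter` helper, the sample messages and the phrase itself are hypothetical, and real filters weigh far more than one phrase.

```python
from dataclasses import dataclass


@dataclass
class Message:
    body: str
    is_spam: bool  # ground-truth label, assumed known for this evaluation


def evaluate_phrase_filter(phrase, messages):
    """Return (catch_rate, false_positive_rate) for a simple phrase filter.

    catch_rate: fraction of spam that contains the phrase (and is blocked).
    false_positive_rate: fraction of real mail that contains the phrase.
    """
    phrase = phrase.lower()
    spam = [m for m in messages if m.is_spam]
    real = [m for m in messages if not m.is_spam]
    caught = sum(phrase in m.body.lower() for m in spam)
    lost = sum(phrase in m.body.lower() for m in real)
    catch_rate = caught / len(spam) if spam else 0.0
    false_positive_rate = lost / len(real) if real else 0.0
    return catch_rate, false_positive_rate


# Hypothetical labeled sample: two spam messages, four real ones.
sample = [
    Message("Claim your FREE MONEY NOW, click here", True),
    Message("Cheap meds shipped overnight", True),
    Message("Draft subject line was 'free money now', too spammy?", False),
    Message("Lunch on Friday?", False),
    Message("Quarterly report attached", False),
    Message("Meeting notes from Tuesday", False),
]

catch_rate, false_positive_rate = evaluate_phrase_filter("free money now", sample)
print(catch_rate, false_positive_rate)
```

A filter worth keeping has a high first number and a second number close to zero. On this tiny sample the phrase blocks one real message in four, which would be far too many in production.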
As it is, Apple and their filtering vendor have adjusted their filters such that mail with the phrase “barely legal teen” is again making it into the inbox.
I’m not really sure this is a win.

Related Posts

Hunting the Human Representative

Yesterday’s post was inspired by a number of questions I’ve fielded recently from people in the email industry. Some were clients, some were colleagues on mailing lists, but in most cases they’d found a delivery issue that they couldn’t solve and were looking for the elusive Human Representative of an ISP.
There was a time when having a contact inside an ISP was almost required to have good delivery. ISPs didn’t have very transparent systems and SMTP rejection messages weren’t very helpful to a sender. Only a very few ISPs even had postmaster pages, and the information there wasn’t always helpful.
More recently that’s changed. It’s no longer required to have a good relationship at the ISPs to get inbox delivery. I can point to a number of reasons this is the case.
ISPs have figured out that providing postmaster pages and more information in rejection messages lowers the cost of dealing with senders. As the economy has struggled ISPs have had to cut back on staff, much like every other business out there. Supporting senders turned into a money and personnel sink that they just couldn’t afford any longer.
Another big issue is the improvement in filters and processing power. Filters that relied on IP addresses and IP reputation did so for mostly technical reasons. IP addresses are the one thing that spammers couldn’t forge (mostly) and checking them could be done quickly so as not to bottleneck mail delivery. But modern fast processors allow more complex information analysis in short periods of time. Not only does this mean more granular filters, but filters can also be more dynamic. Filters block mail, but also self resolve in some set period of time. People don’t need to babysit the filters because if sender behaviour improves, then the filters automatically notice and fall off.
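One way to picture a filter that resolves itself is a reputation score that decays over time. The sketch below is an assumption about how such a filter could work, not a description of any ISP's system: the `DecayingBlock` class, the threshold and the half-life are all hypothetical.

```python
class DecayingBlock:
    """Block a sender while a complaint score stays above a threshold.

    The score halves every `half_life_s` seconds, so the block lifts by
    itself once the sender stops generating complaints.
    """

    def __init__(self, threshold=10.0, half_life_s=3600.0):
        self.threshold = threshold
        self.half_life_s = half_life_s
        self.score = 0.0
        self.last_update = 0.0

    def _decay(self, now):
        # Apply exponential decay for the time elapsed since the last update.
        elapsed = max(0.0, now - self.last_update)
        self.score *= 0.5 ** (elapsed / self.half_life_s)
        self.last_update = now

    def record_complaint(self, now, weight=1.0):
        self._decay(now)
        self.score += weight

    def is_blocked(self, now):
        self._decay(now)
        return self.score >= self.threshold


# Timestamps are passed in explicitly (seconds) to keep the example testable.
block = DecayingBlock(threshold=10.0, half_life_s=3600.0)
for _ in range(12):
    block.record_complaint(now=0.0)

print(block.is_blocked(now=0.0))     # score is 12, above the threshold
print(block.is_blocked(now=3600.0))  # one half-life later the score is 6
```

Nobody has to remove the block by hand. If the complaints stop, the score falls below the threshold and mail flows again; if they continue, the score stays high.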
Then we have authentication and the protocols now being layered on top of that. This is a technology that is benefiting everyone, but has been strongly influenced by the ISPs and employees of the ISPs. This permits ISPs to filter on more than just IP reputation, but to include specific domain reputations as well.
Another factor in the removal of the human is that there are a lot of dishonest people out there. Some of those dishonest people send mail. Some of them even found contacts inside the ISPs. Yes, there are some bad people who lied and cheated their way into filtering exceptions. These people were bad enough and caused enough problems for the ISPs and the ISP employees who were lied to that systems started to have fewer and fewer places a human could override the automatic decisions.
All of this contributes to the fact that the Human Representative is becoming a more and more elusive target. In a way that’s good, though; it levels the playing field and doesn’t give con artists and scammers better access to the inbox than honest people. It means that smaller senders have a chance to get mail to the inbox, and it means that fewer people have to make judgement calls about the filters and what mail is worthy or not. All mail is subject to the same conditions.
The Human Representative is endangered. And I think this is a good thing for email.


Everybody wins!

There was a recent question on a mailing list during a discussion of spam and delivery problems. A number of folks who work in delivery were discussing how a bad address got on a list. Someone who works on the spam blocking end of things asked: why do you care how a bad address got onto a mailing list?
Recipients usually don’t care. They just want the unsolicited mail to stop. It’s a position I have no problem with; I want the unsolicited mail to stop, too. But understanding why a particular sender is sending mail to addresses that never asked for it can be an important step in making it stop. Not by the receivers and the spam filters; they’ll just block the bad sender and move on. Or if they’re an ISP or ESP they’ll just throw the sender off for AUP violations and let the sender be somebody else’s problem.
In the broader context, though, this only changes the source of the spam. It doesn’t help the victim; the bad sender can always find another host and they will continue to mail people who never asked for that mail. And, in fairness to these senders, often they are mailing lists of mixed sources. Some of the addresses didn’t opt-in, and don’t want the mail, but a lot of addresses on their list did opt-in and do want their mail. Fixing their problem means they can mail people who want their mail. The sender is happy, the recipients are happy and the receivers are happy; everybody wins!
Everybody winning is something I can get fully behind.


Why do ISPs do that?

One of the most common things I hear is “but why does the ISP do it that way?” The generic answer for that question is: because it works for them and meets their needs. Anyone designing a mail system has to implement some sort of spam filtering and will have to accept the potential for lost mail. Even those recipients who run no software filtering may lose mail. Their spamfilter is the delete key and sometimes they’ll delete a real mail.
Every mailserver admin, whether managing an MTA for a corporation, an ISP or themselves, inevitably looks at the question of false positives and false negatives. Some are more sensitive to false negatives and would rather block real mail than have to wade through a mailbox full of spam. Others are more sensitive to false positives and would rather deal with unfiltered spam than risk losing mail.
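The choice between those two sensitivities often comes down to where a threshold is set. As a rough sketch, assume each message already has a spam score between 0 and 1 from some upstream classifier; the scores, labels and the `count_errors` helper below are hypothetical.

```python
def count_errors(scored, threshold):
    """Count filtering mistakes at a given blocking threshold.

    scored: list of (spam_score, is_spam) pairs.
    Returns (false_positives, false_negatives), where a false positive is
    real mail that was blocked and a false negative is spam that got through.
    """
    false_positives = sum(
        1 for score, is_spam in scored if score >= threshold and not is_spam
    )
    false_negatives = sum(
        1 for score, is_spam in scored if score < threshold and is_spam
    )
    return false_positives, false_negatives


# Hypothetical scored sample: three real messages, three spam messages.
scored = [
    (0.1, False), (0.4, False), (0.6, False),
    (0.5, True), (0.7, True), (0.9, True),
]

# An aggressive admin blocks at a low score and loses some real mail.
print(count_errors(scored, threshold=0.45))
# A cautious admin blocks at a high score and lets some spam through.
print(count_errors(scored, threshold=0.65))
```

Neither threshold is wrong. Each one reflects which kind of mistake the admin would rather live with.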
At the ISPs, many of these decisions aren’t made by one person, but the decisions are driven by the business philosophy, requirements and technology. The different consumer ISPs have different philosophies and these show in their spamfiltering.
Gmail, for instance, has a lot of faith in their ability to sort, classify and rank text. This is, after all, what Google does. Therefore, they accept most of the email delivered to Gmail users and then sort after the fact. This fits their technology, their available resources and their business philosophy. They leave as much filtering at the enduser level as they can.
Yahoo, on the other hand, chooses to filter mail at the MTA. While their spamfoldering algorithms are good, they don’t want to waste CPU and filtering effort on mail that they think may be spam. So, they choose to block heavily at the edge, going so far as to rate limit mail from senders that they don’t know. Endusers are protected from malicious mail and senders have the ability to retry mail until it is accepted.
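Rate limiting unknown senders at the edge can be sketched as a sliding window per sending IP. This is only an illustration of the general idea, not how Yahoo implements it: the `EdgeRateLimiter` class, the limits and the IP addresses (taken from documentation ranges) are assumptions. A "defer" here stands for a temporary SMTP failure, which a well-behaved sender responds to by retrying later.

```python
from collections import defaultdict, deque


class EdgeRateLimiter:
    """Accept mail from known senders; rate limit everyone else."""

    def __init__(self, known_senders, max_per_window=2, window_s=60.0):
        self.known = set(known_senders)
        self.max_per_window = max_per_window
        self.window_s = window_s
        # Timestamps of recently accepted messages, per unknown sender IP.
        self.recent = defaultdict(deque)

    def decide(self, sender_ip, now):
        if sender_ip in self.known:
            return "accept"
        window = self.recent[sender_ip]
        # Forget accepted messages that have aged out of the window.
        while window and now - window[0] >= self.window_s:
            window.popleft()
        if len(window) >= self.max_per_window:
            return "defer"  # temporary failure: the sender may retry later
        window.append(now)
        return "accept"


limiter = EdgeRateLimiter(known_senders={"192.0.2.1"})
unknown = "198.51.100.7"

# The third message inside the window is deferred; a later retry succeeds.
decisions = [limiter.decide(unknown, now=t) for t in (0.0, 1.0, 2.0, 61.0)]
print(decisions)
```

Deferred mail is not lost. The sender's queue holds it, and a retry after the window has passed gets through, which is why retrying until acceptance works.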
The same types of entries could be written about Hotmail or AOL. They could even be written about the various spam filter vendors and blocklists. Every company has their own way of doing things and their way reflects their underlying business philosophy.
