Content, trigger words and subject lines

There’s been quite a bit of traffic on Twitter this afternoon about a recent blog post by HubSpot identifying trigger words senders should avoid in an email subject line. A number of email experts are assuring the world that content doesn’t matter and are arguing on Twitter and in the post comments that no one will block an email because those words are in the subject line.
As usual, I think everyone else is a little bit right and a little bit wrong.
The words and phrases posted by HubSpot are pulled out of the SpamAssassin rule set. Using those words or exact phrases will cause a spam score to go up, sometimes by a little (0.5 points) and sometimes by a lot (3+ points). Most SpamAssassin installations consider anything with more than 5 points to be spam, so a 3 point score for a subject line may cause mail to be filtered.
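To make the arithmetic concrete, here is a minimal sketch of additive scoring. The patterns and point values are invented for illustration and are not actual SpamAssassin rules; only the idea (each matching rule adds points, and the total is compared to a cutoff of 5) comes from the paragraph above.

```python
import re

# Hypothetical subject-line rules and scores, for illustration only.
# These are NOT real SpamAssassin rule names or values.
SUBJECT_RULES = [
    (re.compile(r"\bfree\b", re.IGNORECASE), 0.5),
    (re.compile(r"\bact now\b", re.IGNORECASE), 1.5),
    (re.compile(r"100% guaranteed", re.IGNORECASE), 3.0),
]

SPAM_THRESHOLD = 5.0  # the 5-point cutoff mentioned above


def subject_score(subject):
    """Sum the score of every rule whose pattern appears in the subject."""
    return sum(score for pattern, score in SUBJECT_RULES if pattern.search(subject))


def is_spam(subject, other_points=0.0):
    """Flag the message when subject points plus points from the rest
    of the message add up to more than the threshold."""
    return subject_score(subject) + other_points > SPAM_THRESHOLD
```

In this sketch, a subject worth 3 points is not enough on its own to get a message filtered, but it leaves very little headroom: another 2.5 points from the body or headers pushes the message over the line.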
The folks who are outraged at the blog post, though, don’t seem to have read the article very closely. HubSpot doesn’t actually say that using trigger words will get mail blocked. What they say is a lot more reasonable than that.

Trigger words are known to cause problems and increase the chances of your email getting caught in a SPAM trap. By avoiding these words in your email subject lines, you can dramatically increase your chances of getting beyond SPAM filters.

OK, so I’m not sure about the “dramatic” part, as some of the words they list as triggers in the subject lines will also trigger scores when used in the body of the message. But the gist of the HubSpot post is not wrong. If you use too many of the words and phrases that spammers use, then your mail is going to be difficult to distinguish from spam. I don’t think this is actually controversial (although I’ve been known to be wrong…).
But some of the comments on the post go too far in the other direction and totally misrepresent reality.

Content filtering hasn’t been a big component of spam filtering algorithms for nearly a decade.

This is blatantly and demonstrably untrue. Naive content filtering hasn’t been a big component for nearly a decade, but content filtering as a whole is where the industry is going. IP-based filtering is good for some things, but content filtering allows for much finer-grained sorting and filtering. Too many spammers have created too many ways to avoid and subvert IP-based filters for them to be the full solution to protecting users.
Content matters; don’t think it doesn’t. But don’t let word lists like HubSpot’s frighten you off from crafting good subject lines.

Related Posts

Content based filters

Content based filters are incredibly complex and entire books could be written about how they work and what they look at. Of course, by the time the book was written it would be entirely obsolete. Because of their complexity, though, I am always looking for new ways to explain them to folks.
Content based filters look at a whole range of things, from the actual text in the message, to the domains, to the IP addresses those domains and URLs point to. They look at the hidden structure of an email. They look at what’s in the body of the message and what’s in the headers. There isn’t a single bit of a message that content filters ignore.
Clients usually ask me what words they should change to avoid the filters. But this isn’t the right question to ask. Usually it’s not a word that causes the problem. Let me give you a few examples of what I mean.
James H. has an example over on the Cloudmark blog of how a single missing space in an email caused delivery problems for a large company. That missing space changed a domain name in the message sufficiently to be caught by a number of filters. This is one type of content filter: one that focuses on what the message is advertising or who the beneficiary of the message is. Some of my better clients get caught by these types of filters occasionally. A website they’re linking to or a domain name they’re using in the text of the message has a bad reputation. The mail gets bulked or blocked because of that domain in the message.
One of my clients went from 100% inbox every day to random failures at different domains. Their overall inbox rate was still in the 96–98% range, but there was a definite change. The actual content of their mail hadn’t changed, but we kept looking for underlying causes. At one point we were on the phone and they mentioned their new content management system. Sure enough, the content management company had a poor reputation and the delivery problems started exactly when they started using the content management system. The tricky part of this was that the actual domains and URLs in the messages never changed; they were still clickthrough.clientdomain.example.com. But those URLs now pointed to an IP address that a lot of spammers were abusing. So there were delivery problems. We made some changes to their setup and the delivery problems went away.
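A small sketch of why that case was so hard to spot: the link text stays the same, but the address behind it changes. Everything here is assumed for illustration: the IP addresses come from the documentation range, the "bad" list is invented, and the resolver is a plain lookup table instead of real DNS so the example runs anywhere.

```python
from urllib.parse import urlparse

# Hypothetical reputation data: an invented list of abused IP addresses.
BAD_IPS = {"192.0.2.50"}


def link_is_risky(url, resolve):
    """Return True if the URL's hostname resolves to a listed IP.

    `resolve` is any function mapping a hostname to an IP address;
    a real filter would use DNS, this sketch uses a lookup table.
    """
    host = urlparse(url).hostname
    return resolve(host) in BAD_IPS


url = "http://clickthrough.clientdomain.example.com/c/123"

# Same hostname before and after the hosting change, different IP behind it.
before = {"clickthrough.clientdomain.example.com": "192.0.2.10"}.get
after = {"clickthrough.clientdomain.example.com": "192.0.2.50"}.get

print(link_is_risky(url, before))
print(link_is_risky(url, after))
```

Nothing a sender would see by reading the message changes between the two calls, which is why comparing the old and new copy of the mail turned up nothing.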
The third example is one from quite a long time ago, but illustrates a key point. A client was testing email sends through a new ESP. They were sending one-line mail through the ESP’s platform to their own email account. Their corporate spamfilter was blocking the mail. After much investigation and a bit of string pulling, I finally got to talk to an engineer at the spamfiltering company. He told me that they were blocking the mail because it “looked like spam.” When pressed, he told me they blocked anything that had a single line of text and an unsubscribe link. Once the client added a second line of text, the filtering issue went away.
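A structural rule like that one is easy to picture in code. This is only a guess at the shape of such a check, not the vendor's actual rule: it assumes a plain-text body and treats any line containing the word "unsubscribe" as the unsubscribe link.

```python
def looks_like_one_line_spam(body):
    """Flag a body that is exactly one line of text plus an unsubscribe line.

    Assumption: plain-text body, and an "unsubscribe line" is any
    non-empty line containing the word "unsubscribe".
    """
    lines = [line.strip() for line in body.splitlines() if line.strip()]
    unsub_lines = [line for line in lines if "unsubscribe" in line.lower()]
    text_lines = [line for line in lines if "unsubscribe" not in line.lower()]
    return len(unsub_lines) >= 1 and len(text_lines) == 1


one_line = "This is a test.\n\nUnsubscribe: http://esp.example.com/u/123"
two_lines = "This is a test.\nSent from the new platform.\n\nUnsubscribe: http://esp.example.com/u/123"

print(looks_like_one_line_spam(one_line))
print(looks_like_one_line_spam(two_lines))
```

Note that no word in the message matters to this check. Only the layout does, which is why adding a second line of text fixed the problem.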
These are just some of the examples of how complex content based filters are. Content is almost a misnomer for them, as they look at so many other things including layout, URLs, domains and links.


Listen to me talk about filtering, blocklists and delivery

I did an interview with Practical eCommerce a few weeks ago. The podcast and transcript are now available.
I want to thank Kerry and the rest of the staff there for the opportunity to talk email and filtering with their readers.
Happy Thanksgiving everyone in the US.


Censorship, email and politics

Spamfiltering blocks email. This is something we all know and understand. For most people (that is, everyone who doesn’t manage an email server, work in the delivery field or create spamfilters), filtering is a totally unseen process. The only time the average person notices filters is when they break. The breakage could be blocking mail they shouldn’t, or not blocking mail they should.
Yesterday, a bunch of people noticed that Yahoo was blocking mail containing references to a protest against Wall Street. This understandably upset people who were trying to use email as a communication medium. Many people decided it was Yahoo (a tool of the elites!) attempting to censor their speech and stop them from organizing a protest.
Yeah. Not so much.
Yahoo looked into it and reported that the mail had gotten caught in their spam filters. Yahoo adjusted their filters to let the mail through and all was (mostly) good.
I don’t think this is actually a sign of filters being broken. The blocked mail all contained a URL pointing to occupywallst.com. I know there was a lot of speculation about what was being blocked, but sources tell me it was the actual domain. Not the phrase, not the text, the domain.
The domain was in a lot of mostly identical mail coming out of individual email accounts. This is a current hallmark of hijacked accounts. Spammers compromise thousands of email accounts, and send a few emails out of each of them. Each email is mostly identical and points to the same URL. Just like the protest mail.
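Here is a rough sketch of that pattern as a filter might look for it: one domain showing up in mail from many different accounts, with only a handful of messages from each. The function name, the thresholds and the sample data are all made up for illustration; I have no inside knowledge of how Yahoo's filters actually implement this.

```python
from collections import defaultdict


def suspicious_domains(messages, min_senders=3, max_per_sender=2):
    """Return domains linked from many accounts, a few messages each.

    `messages` is a list of (sender, linked_domain) pairs. The default
    thresholds are arbitrary values chosen for this example.
    """
    per_domain = defaultdict(lambda: defaultdict(int))
    for sender, domain in messages:
        per_domain[domain][sender] += 1

    flagged = []
    for domain, counts in per_domain.items():
        many_senders = len(counts) >= min_senders
        few_each = all(n <= max_per_sender for n in counts.values())
        if many_senders and few_each:
            flagged.append(domain)
    return sorted(flagged)


sample = [
    ("alice@mail.example", "campaign.example"),
    ("bob@mail.example", "campaign.example"),
    ("carol@mail.example", "campaign.example"),
    ("news@shop.example", "shop.example"),
    ("news@shop.example", "shop.example"),
    ("news@shop.example", "shop.example"),
]

print(suspicious_domains(sample))
```

A check like this cannot tell the difference between thousands of hijacked accounts and thousands of real people forwarding the same link, which is exactly the situation the protest mail was in.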
There was also a lot of bulk mail being sent with that URL in it. I’ve been talking to friends who have access to traps, and they were seeing a lot of mail mentioning occupywallst.com in their traps. This isn’t surprising; political groups have some horrible hygiene. They are sloppy with acquisition, they trade names and addresses like kids trade cold germs, they never expire anything out. It’s just not how politics is played. And it’s not one party or another, it’s all of them. I’ve consulted with major names across the political spectrum, and none actually implement best practices.
As I have often said, the secret to delivery is to not have your mail look like spam. In this case, the mail looked like spam. In fact, it looked like spam that was coming from hijacked accounts as well as spam sent by large bulk mailers. I suspect there was also a high complaint rate as people sent it to friends and family who really didn’t want to hear about the protests.
To Yahoo!’s credit, though, someone on staff was on top of things. They looked into the issue and the filter was lifted within a couple of hours of the first blog post. A human intervened, overruled the algorithm and let the mail out.
I bet this is one of the few times anyone has seen that Yahoo does outbound filtering. Given it’s a politically charged situation, I can see why people assumed that Yahoo was filtering because of politics and censorship. It wasn’t, though.
More on politics, filtering and censorship.

They’re not blocking you because they hate you

It really can be your email
More on Truthout
Another perspective on the politico article
