Things Spammers Do

Much like every other day, I got some spam today. Here’s a lightly edited copy of it.
Let’s go through it and see what they did that makes it clear that it’s spam, which companies helped them out, and what you should avoid doing to avoid looking like these spammers…

Received: from [213.144.59.132] (114.sub-75-210-142.myvzw.com [75.210.142.114] by m.wordtothewise.com (Postfix) with SMTP id DEA552EAE2

This tells me it was sent from Verizon wireless network space – which means it’s almost certainly spam, as legitimate mail doesn’t come directly from cellphones or cellular access points, it comes from smarthosts. And it also tells me that the spammer is lying about who they are, claiming to be “[213.144.59.132]” when they’re really not.

X-Spam-Status: No, score=1.7 required=7.0 tests=HTML_EXTRA_CLOSE,HTML_MESSAGE, RCVD_IN_PBL,RDNS_DYNAMIC autolearn=disabled version=3.2.5

This line was added by SpamAssassin running on my mailserver. HTML_MESSAGE isn’t very interesting – it just says there was some HTML in the mail – but the others are fairly strong signs that it’s spam. HTML_EXTRA_CLOSE is one of many spamassassin rules based on the HTML content of the message being malformed in some way, suggesting it was created by badly written software such as spamware.
RCVD_IN_PBL and RDNS_DYNAMIC are both really strong signs that no email from this Verizon IP address is legitimate, but in different ways. RDNS_DYNAMIC shows that Verizon hasn’t done anything special with the IP address to suggest it might be a legitimate server – it’s in the vast wasteland of consumer IP addresses that nobody really cares about, and not somewhere you should expect legitimate mail from. RCVD_IN_PBL is much more specific – it tells us that Verizon explicitly told Spamhaus that no email should ever be emitted from here (a provider that cared about spam might actually block traffic on port 25 from that sort of space, but we’ll take what we can get). If you ever see either of these on mail, it’s spam.

From: “Tom Joelson” <Noreply234239-389512@qmail.com>

Legitimate mail would have a company name, or maybe a personal name I’d recognize in the “friendly from”. Strike one. Legitimate mail wouldn’t have the word “noreply” anywhere in it – telling your recipients you don’t want to hear from them is rather disrespectful. Strike two. Random numerics in the From field are really bad: as well as looking like you’re trying to pull a fast one they’d make it impossible for a recipient to whitelist your mail. That sort of thing is fine in the return path, as part of VERP encoding, but not in the From address that’s visible to the recipient. Strike three. Qmail.com is an asian freemail provider – legitimate bulk mail never claims to be from someone it isn’t, and is never from a freemail provider. Strike four.

… check out the attached brochure for more information …

There’s very seldom a legitimate reason to have an attachment in bulk email, for several reasons. The email should stand on it’s own, giving the recipient the information they need in a form that’s immediately visible in their mail client. Links to your web page, sure, but the mail should make sense on it’s own, with the links part of a call to action. If you’re sending out mail to existing customers it might occasionally be useful to attach a PDF copy of a catalogue or somesuch, but the content of the email should still stand on it’s own (and given the security flaws in PDF that allow it to be used as a payload for viruses I’d be wary about doing even that).

Click This Link to Stop Future Messages =
<mailto:listservices@gmx.com?subject=3DUnsubscribe%3A%20myemail@mydomain>

Sure, you should have an unsubscription link in the messages you send. But it should be to an unsubscription page on your webserver, not a mailto link that sends mail anywhere, let alone to a dubious freemail provider (I’m prepared to believe gmx.com has legitimate users, but I’ve never seen it used anywhere other than in spam). And the clumsy phrasing looks like an attempt to avoid naive content filters.
All these things told me, and would have told a decent spam filter, that this wasn’t legitimate mail. Let’s dig down further and see how the spammer tried to avoid being identified.
The attachment is an HTML document, and it’s been base64 encoded. There’s never a good reason to use base64 encoding for English language attachments, unless you consider hiding the content of your email from naive spam filters a good reason. Less naive spam filters will decode the attachment and look inside it anyway. And they might consider the dishonest use of base64 encoding a bad enough sign in itself.
We can easily decode the base64 by hand, either by using a web based decoder or from a random unix-ish commandline by typing “openssl enc -d -base64”, hitting return, pasting in the encoded text and hitting ctrl-d.
That gives us this:

<!DOCTYPE HTML PUBLIC “-//W3C//DTD HTML 4.0 Transitional//EN”>
<html><head>
<META HTTP-EQUIV=”Refresh” CONTENT=”0; URL=http://cts.vresp.com/c/?ChristianCafe/207bd98dc8/TEST/025270d267/id=11416″>
</head></body></html>

What that snippet of HTML will do when you open it is immediately redirect to the URL given in the middle. I recognize cts.vresp.com as VerticalResponse‘s clickthrough redirector, so it looks like a spammer created a test account at VerticalResponse in order to be able to abuse their redirector to hide the final destination. Naughty spammer.
If I didn’t recognize the URL as belonging to VerticalResponse, though, I’d visit the obvious webpages to see who it is. http://cts.vresp.com/ just tells me “Forbidden”. Bad.  http://vresp.com/ just says “hola”, which isn’t a good sign either. I’m not sure whether http://www.vresp.com/ is better or worse – it doesn’t mention the real company name and claims it’s “a domain that sends permission-based emails”. That’s really fishy, and looks just like many, many dedicated spammer domains. The lesson to learn is that if you use a domain in your email, then there should be a webserver at any of the related hostnames, it should tell anyone visiting it what the domain is used for, the real name of the company that’s operating it and provide a link to their corporate website.
Let’s see where the VerticalResponse redirector sends us to. This is pretty easy to do using telnet from a unix commandline or a windows command prompt. We’re looking at the URL http://cts.vresp.com/c/?ChristianCafe/207bd98dc8/TEST/025270d267/id=11416, which I’m going to split into the host “cts.vresp.com” and the path “/c/?ChristianCafe/207bd98dc8/TEST/025270d267/id=11416”. You just have to type the bits in blue, and remember to hit return twice after the Host: line.
steve@ubuntu:~$ telnet cts.vresp.com 80
Trying 74.116.90.234...
Connected to cts.vresp.com.
Escape character is '^]'.
GET /c/?ChristianCafe/207bd98dc8/TEST/025270d267/id=11416 HTTP/1.1
Host: cts.vresp.com

HTTP/1.1 302 Found
Date: Mon, 14 May 2012 18:36:58 GMT
Server: Apache
Location: http://www.christiancafe.com/guests/join/indexc.jsp?id=11416
P3P: policyref="https://cts.vresp.com/w3c/p3p.xml", CP="CAO DSP COR IVAo IVDo OUR STP PUR COM NAV"
Cache-Control: max-age=0, no-store, no-cache, must-revalidate
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html


(If you try and do this yourself you’ll discover that VerticalResponse have already shut down this redirector in response to an abuse report. Thanks, VR.)
And christiancafe.com is our spammer.
There’s more I could say about how they’re hosting their website (on Amazon EC2 Web Services, with suspiciously short DNS TTLs) but I think this is more than enough for one blog post.

Related Posts

Content, trigger words and subject lines

There’s been quite a bit of traffic on twitter this afternoon about a recent blog post by Hubspot identifying trigger words senders should avoid in an email subject line. A number of email experts are assuring the world that content doesn’t matter and are arguing on twitter and in the post comments that no one will block an email because those words are in the subject line.
As usually, I think everyone else is a little bit right and a little bit wrong.
The words and phrases posted by Hubspot are pulled out of the Spamassassin rule set. Using those words or exact phrases will cause a spam score to go up, sometimes by a little (0.5 points) and sometimes by a lot (3+ points). Most spamassassin installations consider anything with more than 5 points to be spam so a 3 point score for a subject line may cause mail to be filtered.
The folks who are outraged at the blog post, though, don’t seem to have read the article very closely. Hubspot doesn’t actually say that using trigger words will get mail blocked. What they say is a lot more reasonable than that.

Read More

Content based filters

Content based filters are incredibly complex and entire books could be written about how they work and what they look at. Of course, by the time the book was written it would be entirely obsolete. Because of their complexity, though, I am always looking for new ways to explain them to folks.
Content based filters look at a whole range of things, from the actual text in the message, to the domains, to the IP addresses those domains and URLs point to. They look at the hidden structure of an email. They look at what’s in the body of the message and what’s in the headers. There isn’t a single bit of a message that content filters ignore.
Clients usually ask me what words they should change to avoid the filters. But this isn’t the right question to ask. Usually it’s not a word that causes the problem. Let me give you a few examples of what I mean.
James H. has an example over on the Cloudmark blog of how a single missing space in an email caused delivery problems for a large company. That missing space changed a domain name in the message sufficiently to be caught by a number of filters. This is one type of content filter, that focuses on what the message is advertising or who the beneficiary of the message is. Some of my better clients get caught by these types of filters occasionally. A website they’re linking to or a domain name they’re using in the text of the message has a bad reputation. The mail gets bulked or blocked because of that domain in the message.
One of my clients went from 100% inbox every day to random failures at different domains. Their overall inbox was still in the 96 – 98% range, but there was a definite change. The actual content of their mail hadn’t changed, but we kept looking for underlying causes. At one point we were on the phone and they mentioned their new content management system. Sure enough, the content management company had a poor reputation and the delivery problems started exactly when they started using the content management. The tricky part of this was that the actual domains and URLs in the messages never changed, they were still clickthrough.clientdomain.example.com. But those URLs now pointed to an IP address that a lot of spammers were abusing. So there were delivery problems. We made some changes to their setup and the delivery problems went away.
The third example is one from quite a long time ago, but illustrates a key point. A client was testing email sends through a new ESP. They were sending one-line mail through the ESPs platform to their own email account. Their corporate spamfilter was blocking the mail. After much investigation and a bit of string pulling, I finally got to talk to an engineer at the spamfiltering company. He told me that they were blocking the mail because it “looked like spam.” When pressed, he told me they blocked anything that had a single line of text and an unsubscribe link. Once the client added a second line of text, the filtering issue went away.
These are just some of the examples of how complex content based filters are. Content is almost a misnomer for them, as they look at so many other things including layout, URLs, domains and links.

Read More

Hunting the Human Representative

Yesterday’s post was inspired by a number of questions I’ve fielded recently from people in the email industry. Some were clients, some were colleagues on mailing lists, but in most cases they’d found a delivery issue that they couldn’t solve and were looking for the elusive Human Representative of an ISP.
There was a time when having a contact inside an ISP was almost required to have good delivery. ISPs didn’t have very transparent systems and SMTP rejection messages weren’t very helpful to a sender. Only a very few ISPs even had postmaster pages, and the information there wasn’t always helpful.
More recently that’s changed. It’s no longer required to have a good relationship at the ISPs to get inbox delivery. I can point to a number of reasons this is the case.
ISPs have figured out that providing postmaster pages and more information in rejection messages lowers the cost of dealing with senders. As the economy has struggled ISPs have had to cut back on staff, much like every other business out there. Supporting senders turned into a money and personnel sink that they just couldn’t afford any longer.
Another big issue is the improvement in filters and processing power. Filters that relied on IP addresses and IP reputation did so for mostly technical reasons. IP addresses are the one thing that spammers couldn’t forge (mostly) and checking them could be done quickly so as not to bottleneck mail delivery. But modern fast processors allow more complex information analysis in short periods of time. Not only does this mean more granular filters, but filters can also be more dynamic. Filters block mail, but also self resolve in some set period of time. People don’t need to babysit the filters because if sender behaviour improves, then the filters automatically notice and fall off.
Then we have authentication and the protocols now being layered on top of that. This is a technology that is benefiting everyone, but has been strongly influenced by the ISPs and employees of the ISPs. This permits ISPs to filter on more than just IP reputation, but to include specific domain reputations as well.
Another factor in the removal of the human is that there are a lot of dishonest people out there. Some of those dishonest people send mail. Some of them even found contacts inside the ISPs. Yes, there are some bad people who lied and cheated their way into filtering exceptions. These people were bad enough and caused enough problems for the ISPs and the ISP employees who were lied to that systems started to have fewer and fewer places a human could override the automatic decisions.
All of this contributes to the fact that the Human Representative is becoming a more and more elusive target. In a way that’s good, though; it levels the playing field and doesn’t give con artists and scammers better access to the inbox than honest people. It means that smaller senders have a chance to get mail to the inbox, and it means that fewer people have to make judgement calls about the filters and what mail is worthy or not. All mail is subject to the same conditions.
The Human Representative is endangered. And I think this is a good thing for email.

Read More