Clicktracking link abuse

If you use redirection links in the emails you send out, where a click on the link goes to your server – so you can record that someone clicked – before redirecting to the real destination, then you’ve probably already thought about how they can be abused.
Redirection links are simple in concept – you include a link that points to your webserver in email that you send out, then when recipients click on it they end up at your webserver. Instead of displaying a page, though, your webserver sends what’s called a “302 redirect” to send the recipients web browser on to the real destination. How does your webserver know where to redirect to? There are several different ways, with different tradeoffs:

The simplest approach
The simplest sort of redirection link includes the final destination in the link itself – something like http://click.example.com/cnn.com/WORLD/. The webserver at click.example.com would simply strip off the first part of the link, and redirect to the remainder – cnn.com/WORLD/.
This is nice, because it’s fairly transparent to the recipient – when they hover over the link in their mail client or webmail it’ll be fairly clear where it’s going.
But it has several limitations. One is that you can’t really record very much data about the click – you know where it was redirecting to, but almost nothing else.
The bigger problem is that it’s very easy for a spammer to abuse – they can send out spam that has the link http://click.example.com/onlinepharmacy.ru/order.html, to hide their real link from spam filters, and your webserver will happily redirect recipients to go there. Or, worse, that can be used to redirect to a website hosting viruses. That can cause all sorts of problems for your reputation, up to and including having your redirection webserver blacklisted by antivirus and antiphishing organizations, meaning it’ll be blocked by many web browsers.
Add some metadata
Some of the things you might want to be able to record about a click would be which customers mail it was found in, which mailing campaign and which recipient it was sent to. This would let you do more sensible reporting and click-tracking, and also let you spot when a link is misused in some way (for example, thousands of clicks on a url that was sent to just one recipient).
That might look like http://click.example.com/123/456/789/cnn.com/WORLD/. Your webserver would strip off the first four parts, recording a click for customer 123, campaign 456 and recipient 789, then redirect to the remainder – cnn.com/WORLD/
This lets you do better reporting and is still fairly transparent to the recipient, but can still be abused in the same way.
Use a database
If you stored every link you wanted to redirect to in  a database you could simply store a unique key for each link – so you might record that key 2718 means http://cnn.com/WORLD/. Then the redirection URL might look like http://click.example.com/123/456/789/2718
This lets you do good reporting and is much more difficult for spammers to abuse (but not impossible – if the spammer signs up for a free or demo account on your system, then sends a test email to themselves, they can then reuse the links that they received in that mail).
But it’s fairly opaque to the recipient – they have no idea where the link will go. And it requires maintaining a database of every link you’ve ever used, for as long as it’s valuable (which could easily be several years if a recipient goes back to an old newsletter) and requires a database lookup for every click – which adds a fair bit of infrastructure you need to keep working 24/7 just to make links work.
Use a database and a cosmetic link
You could take the database format and add the final destination link on the end – like this http://click.example.com/123/456/789/2718/cnn.com/WORLD/ – and then just ignore everything after the url key (2718). That’ll work exactly the same way, but the final destination will be fairly transparent to the recipient.
This still can’t be abused by spammers, as if they try to use http://click.example.com/123/456/789/2718/mypharmacy.ru, it’ll still just redirect to http://cnn.com/WORLD/ as the only meaningful bit of the redirection link is the “2718“.
Cryptographically sign your links
A different approach is to record all the information you need in the link and to also add a cryptographic signature to prevent people from misusing it. This is much simpler than the word “cryptography” suggests, you just need to use a magic word (we’ll use “albatross”) and know about the md5() function.
You start off with the same destination string we used in Add some metadata – “/123/456/789/cnn.com/WORLD/“. Then you add the magic word on the end, to give “/123/456/789/cnn.com/WORLD/albatross“, and take the md5 “hash” of that. That’s some cryptographic black magic that’ll give you a string of letters and numbers that’s a “fingerprint” of that string. It’ll look something like “609a78b941bdf9f045cadcfa2e09d54c“. Then you combine that with the destination string to look like this:
http://click.example.com/609a78b941bdf9f045cadcfa2e09d54c/123/456/789/cnn.com/WORLD/
Then, when your webserver sees this link it splits it into the hash (609a78b941bdf9f045cadcfa2e09d54c) and destination string (/123/456/789/cnn.com/WORLD/). It then does exactly the same thing you did when you created the link – appends the magic word to the destination string to give “/123/456/789/cnn.com/WORLD/albatross” and takes the md5 hash of that string. If the result of that matches the hash in the link, it knows it’s a valid redirection link and it can record the click-tracking data and forward to the destination link. If the result doesn’t match it knows that the link has been tampered with, and can return an error page.
To generate the link in PHP would be something like this:

$destination = "/$customerid/$campaignid/$recipientid/$link";
$clicktrack = 'http://click.example.com/' . md5($destination . 'albatross') . $destination;

This is much cheaper to generate and validate than using a database, even a typical in-memory database.
Which to use?
Don’t use the simple approach – it’ll get abuse sooner or later and you’ll regret it. Any of the database or cryptographic approaches work just fine, though the cryptographic approach may be easier to scale up and maintain. The database approaches make it easier to disable a link, or direct it to somewhere else at a later point, in case of abuse or some other need.
What else is it good for?
You can use the same sort of approach to validate unsubscription links and VERP return paths for bounce handling. And “open tracking” using these sort of links for image URLs, if you find that a useful metric to offer.

Related Posts

Poor delivery can't be fixed with technical perfection

There are a number of different things delivery experts can do help senders improve their own delivery. Yes, I said it: senders are responsible for their delivery. ESPs, delivery consultants and deliverability experts can’t fix delivery for senders, they can only advise.
In my own work with clients, I usually start with making sure all the technical issues are correct. As almost all spam filtering is score based, and the minor scores given to things like broken authentication and header issues and formatting issues can make the difference between an email that lands in the inbox and one that doesn’t get delivered.
I don’t think I’m alone in this approach, as many of my clients come to me for help with their technical settings. In some cases, though, fixing the technical problems doesn’t fix the delivery issues. No matter how much my clients tweak their settings and attempt to avoid spamfilters by avoiding FREE!! in the subject line, or changing the background, they still can’t get mail in the inbox.
Why not? Because they’re sending mail that the recipients don’t really want, for whatever reason. There are so many ways a sender can collect an email address without actually collecting consent to send mail to that recipient. Many of the “list building” strategies mentioned by a number of experts involve getting a fig leaf of permission from recipients without actually having the recipient agree to receive mail.
Is there really any difference in permission between purchasing a list of “qualified leads” and automatically adding anyone who makes a purchase at a website to marketing lists? From the recipient’s perspective they’re still getting mail they don’t want, and all the technical perfection in the world can’t overcome the negative reputation associated with spamming.
The secret to inbox delivery: don’t send mail that looks like spam. That includes not sending mail to people who have not expressly consented to receive mail.

Read More

Troubleshooting the simple stuff

I was talking with one of my Barry pals recently and was treated to a rant regarding deliverability experts that can’t manage simple things. We’ve been having an ongoing conversation recently about the utterly stupid and annoying questions some senders ask. Last week, I was ranting about a delivery person asking what “5.7.1. Too many receipts this session” meant. This morning I got an IM.

Read More

Analysing lead-gen spam

Yesterday I showed how major companies hire hard core spammers.
Today I’m going to show you some of the technical details as to how I found that data. This is a fairly quick and shallow analysis, the sort of thing I’d typically do for a client to help them decide whether the case was worth pursuing before expending too much money and time on investigation and legal paperwork. I’ve also done it using standard command line tools that are available on pretty much any unix command line (and windows, with a little effort).
There are several questions to answer about the email in question.

Read More