Clicktracking link abuse

If you use redirection links in the emails you send out, where a click on the link goes to your server – so you can record that someone clicked – before redirecting to the real destination, then you’ve probably already thought about how they can be abused.
Redirection links are simple in concept – you include a link that points to your webserver in email that you send out, then when recipients click on it they end up at your webserver. Instead of displaying a page, though, your webserver sends what’s called a “302 redirect” to send the recipient’s web browser on to the real destination. How does your webserver know where to redirect to? There are several different ways, with different tradeoffs:

The simplest approach
The simplest sort of redirection link includes the final destination in the link itself – something like http://click.example.com/cnn.com/WORLD/. The webserver at click.example.com would simply strip off the first part of the link, and redirect to the remainder – cnn.com/WORLD/.
This is nice, because it’s fairly transparent to the recipient – when they hover over the link in their mail client or webmail it’ll be fairly clear where it’s going.
But it has several limitations. One is that you can’t really record very much data about the click – you know where it was redirecting to, but almost nothing else.
The bigger problem is that it’s very easy for a spammer to abuse – they can send out spam that has the link http://click.example.com/onlinepharmacy.ru/order.html, to hide their real link from spam filters, and your webserver will happily redirect recipients to go there. Or, worse, that can be used to redirect to a website hosting viruses. That can cause all sorts of problems for your reputation, up to and including having your redirection webserver blacklisted by antivirus and antiphishing organizations, meaning it’ll be blocked by many web browsers.
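
As a rough illustration of why this is so easy to abuse, here is a minimal sketch of the simplest approach. The function name simple_redirect_target() is made up for this example, and the real redirect (a 302 with a Location header) is only shown in a comment.

```php
<?php
// Hypothetical helper: maps a request path to the redirect target the
// "simplest approach" would use. Nothing is validated, which is the flaw.
function simple_redirect_target(string $path): string
{
    // "/cnn.com/WORLD/" becomes "http://cnn.com/WORLD/"
    return 'http://' . ltrim($path, '/');
}

// In a real handler this would be followed by something like:
//   header('Location: ' . simple_redirect_target($_SERVER['REQUEST_URI']), true, 302);
$legit = simple_redirect_target('/cnn.com/WORLD/');
$abuse = simple_redirect_target('/onlinepharmacy.ru/order.html');
```

The spammer’s path is handled exactly like the legitimate one, because the server has no way to tell them apart.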
Add some metadata
Some of the things you might want to be able to record about a click would be which customer’s mail it was found in, which mailing campaign and which recipient it was sent to. This would let you do more sensible reporting and click-tracking, and also let you spot when a link is misused in some way (for example, thousands of clicks on a URL that was sent to just one recipient).
That might look like http://click.example.com/123/456/789/cnn.com/WORLD/. Your webserver would strip off the first four parts, recording a click for customer 123, campaign 456 and recipient 789, then redirect to the remainder – cnn.com/WORLD/.
This lets you do better reporting and is still fairly transparent to the recipient, but can still be abused in the same way.
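
Splitting the metadata out of the path might be sketched like this. The function name parse_tracked_path() and the array keys are assumptions for illustration, and the sketch assumes all three IDs are numeric.

```php
<?php
// Hypothetical parser for paths like "/123/456/789/cnn.com/WORLD/".
// Returns null when the path lacks three numeric IDs and a destination.
function parse_tracked_path(string $path): ?array
{
    if (!preg_match('#^/(\d+)/(\d+)/(\d+)/(.+)$#', $path, $m)) {
        return null;
    }
    return [
        'customer'    => (int) $m[1],
        'campaign'    => (int) $m[2],
        'recipient'   => (int) $m[3],
        'destination' => 'http://' . $m[4],
    ];
}

$click = parse_tracked_path('/123/456/789/cnn.com/WORLD/');
```

Note that the destination is still taken straight from the link, so nothing stops someone from substituting their own.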
Use a database
If you stored every link you wanted to redirect to in a database you could simply store a unique key for each link – so you might record that key 2718 means http://cnn.com/WORLD/. Then the redirection URL might look like http://click.example.com/123/456/789/2718.
This lets you do good reporting and is much more difficult for spammers to abuse (but not impossible – if the spammer signs up for a free or demo account on your system, then sends a test email to themselves, they can then reuse the links that they received in that mail).
But it’s fairly opaque to the recipient – they have no idea where the link will go. And it requires maintaining a database of every link you’ve ever used, for as long as it’s valuable (which could easily be several years if a recipient goes back to an old newsletter) and requires a database lookup for every click – which adds a fair bit of infrastructure you need to keep working 24/7 just to make links work.
Use a database and a cosmetic link
You could take the database format and add the final destination link on the end – like this http://click.example.com/123/456/789/2718/cnn.com/WORLD/ – and then just ignore everything after the url key (2718). That’ll work exactly the same way, but the final destination will be fairly transparent to the recipient.
The cosmetic part of the link can’t be abused by spammers: if they try to use http://click.example.com/123/456/789/2718/mypharmacy.ru, it’ll still just redirect to http://cnn.com/WORLD/, as the only meaningful bit of the redirection link is the “2718”.
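
A minimal sketch of the lookup, covering both the plain and the cosmetic form of the link. The $links array stands in for the database, and resolve_link() is a hypothetical name; a real system would run a query instead.

```php
<?php
// Stand-in for the links table; a real system would query a database here.
$links = [2718 => 'http://cnn.com/WORLD/'];

// Hypothetical resolver: only the numeric key after the three IDs matters.
// Anything following the key is cosmetic and is ignored.
function resolve_link(string $path, array $links): ?string
{
    if (!preg_match('#^/\d+/\d+/\d+/(\d+)(?:/.*)?$#', $path, $m)) {
        return null;
    }
    $key = (int) $m[1];
    return $links[$key] ?? null; // unknown keys resolve to nothing
}

$plain    = resolve_link('/123/456/789/2718', $links);
$cosmetic = resolve_link('/123/456/789/2718/cnn.com/WORLD/', $links);
$tampered = resolve_link('/123/456/789/2718/mypharmacy.ru', $links);
```

All three calls resolve to the stored destination, whatever follows the key.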
Cryptographically sign your links
A different approach is to record all the information you need in the link and to also add a cryptographic signature to prevent people from misusing it. This is much simpler than the word “cryptography” suggests: you just need to use a magic word (we’ll use “albatross”) and know about the md5() function.
You start off with the same destination string we used in Add some metadata – “/123/456/789/cnn.com/WORLD/”. Then you add the magic word on the end, to give “/123/456/789/cnn.com/WORLD/albatross”, and take the md5 “hash” of that. That’s some cryptographic black magic that’ll give you a string of letters and numbers that’s a “fingerprint” of that string. It’ll look something like “609a78b941bdf9f045cadcfa2e09d54c”. Then you combine that with the destination string to look like this:
http://click.example.com/609a78b941bdf9f045cadcfa2e09d54c/123/456/789/cnn.com/WORLD/
Then, when your webserver sees this link it splits it into the hash (609a78b941bdf9f045cadcfa2e09d54c) and destination string (/123/456/789/cnn.com/WORLD/). It then does exactly the same thing you did when you created the link – appends the magic word to the destination string to give “/123/456/789/cnn.com/WORLD/albatross” and takes the md5 hash of that string. If the result of that matches the hash in the link, it knows it’s a valid redirection link and it can record the click-tracking data and forward to the destination link. If the result doesn’t match it knows that the link has been tampered with, and can return an error page.
Generating the link in PHP would look something like this:

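// Destination string: the tracking IDs followed by the real link (e.g. cnn.com/WORLD/)
$destination = "/$customerid/$campaignid/$recipientid/$link";
// Prefix it with the md5 of the destination string plus the magic word
$clicktrack = 'http://click.example.com/' . md5($destination . 'albatross') . $destination;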

This is much cheaper to generate and validate than using a database, even a typical in-memory database.
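
The validation side could be sketched along the same lines. The function name validate_clicktrack() and the SECRET constant are assumptions for this example; the hashing scheme is the md5-plus-magic-word one described above.

```php
<?php
// Assumption: the same "magic word" that was used when generating links.
const SECRET = 'albatross';

// Hypothetical validator. Returns the destination string
// ("/123/456/789/cnn.com/WORLD/") when the hash matches, or null when
// the link is malformed or has been tampered with.
function validate_clicktrack(string $path): ?string
{
    // Split "/<32 hex chars>/rest" into the hash and the destination string.
    if (!preg_match('#^/([0-9a-f]{32})(/.+)$#', $path, $m)) {
        return null;
    }
    $hash = $m[1];
    $destination = $m[2];
    $expected = md5($destination . SECRET);
    // hash_equals() compares in constant time, avoiding timing leaks.
    return hash_equals($expected, $hash) ? $destination : null;
}

$destination = '/123/456/789/cnn.com/WORLD/';
$good = '/' . md5($destination . SECRET) . $destination;
$bad  = '/' . md5($destination . SECRET) . '/123/456/789/mypharmacy.ru/';
```

Here validate_clicktrack($good) returns the destination string, while validate_clicktrack($bad) returns null because the hash no longer matches. If you are building this fresh, PHP’s hash_hmac() with SHA-256 is a commonly recommended replacement for md5 with an appended secret; the structure of the code stays the same.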
Which to use?
Don’t use the simple approach – it’ll get abused sooner or later and you’ll regret it. Any of the database or cryptographic approaches work just fine, though the cryptographic approach may be easier to scale up and maintain. The database approaches make it easier to disable a link, or direct it to somewhere else at a later point, in case of abuse or some other need.
What else is it good for?
You can use the same sort of approach to validate unsubscription links and VERP return paths for bounce handling. You can also do “open tracking” by using this sort of link for image URLs, if you find that a useful metric to offer.
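
As one possible illustration, a signed unsubscription link might be built and checked like this. The /unsub/ path layout, the function names and the choice of fields are all assumptions made for the sketch, not a prescribed format.

```php
<?php
// Assumption: the same kind of "magic word" secret as for click tracking.
const SECRET = 'albatross';

// Hypothetical generator: signs "/<customer>/<recipient>" with the secret.
function make_unsubscribe_link(int $customerId, int $recipientId): string
{
    $data = "/$customerId/$recipientId";
    return 'http://click.example.com/unsub/' . md5($data . SECRET) . $data;
}

// Hypothetical checker: returns the IDs if the hash matches, else null.
function check_unsubscribe_path(string $path): ?array
{
    if (!preg_match('#^/unsub/([0-9a-f]{32})(/(\d+)/(\d+))$#', $path, $m)) {
        return null;
    }
    if (!hash_equals(md5($m[2] . SECRET), $m[1])) {
        return null;
    }
    return ['customer' => (int) $m[3], 'recipient' => (int) $m[4]];
}

$link = make_unsubscribe_link(123, 789);
$path = parse_url($link, PHP_URL_PATH);
$ids  = check_unsubscribe_path($path);
```

Someone who changes the recipient ID in the link can’t produce a matching hash without knowing the secret, so they can’t unsubscribe other people.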
