Who leaked my address, and when?

steve
July 7, 2011
Industry , Technical

Providing tagged email addresses to vendors is fascinating, and at the same time disturbing. It lets me track what a particular email address is used for, but also to see where and when they’ve leaked to spammers.
I’d really like to know who leaked an email address, and when.
All my inbound mail is sorted into “spam” and “not-spam” by a combination of SpamAssassin, some static sieve rules and a learning spam filter in my mail client. That makes it fairly easy for me to look at my “recent spam”. That’s a huge amount of data, though, something like 40,000 pieces of spam a month.
Finding the needle of interesting data in that haystack is going to take some automation. As I’ve mentioned before you can do quite a lot of useful work with a mix of some little perl scripts and some commandline tools.
I’m interested in the first time a tagged address started receiving spam, so I start off with a perl script that will take a directory full of emails, one per file, find the ones that were sent to a tagged address and print out that address and the time I received the email. I can’t rely on the Date: header, as that’s under the control of the spammer, and often bogus. But I can rely on the timestamp my server adds when it receives the email – and it records that in the first Received: header in the message.

#!/usr/bin/perl
use strict;
use Date::Parse;
foreach my $file (@ARGV) {
    open IF, $file or die "Failed to open '$file': $!n";
    my @headers;
    while() {
        s/[rn]//g;
        last if /^$/;
	push @headers, '' unless /^s/;
	$headers[$#headers] .= $_;
    }
    my $date;
    my $timestamp;
    foreach my $header (@headers) {
	if($header =~ /^Received:.*;([^(]+)/) {
	    $date = $1;
	    $timestamp = str2time($1);
	    last;
	}
    }
    # Replace this regex with something that
    # matches your tagged addresses
    if(join(' ', @headers) =~
                 /(foo+[a-z0-9]+@[a-z.-]+)/) {
	print "$timestamp $1 $daten";
    }
}

Dates and times are annoying to work with on the command line, so I also use the perl Date::Parse module to convert the timestamp in the received header into epoch time – the number of seconds since January 1st, 1970. I use some unix commandline magic to run this against my two spam mailboxes and dump the results in a file.

find spamassassin/ | xargs stamp-address.pl >>junk.txt
find junk/ | xargs stamp-address.pl >> junk.txt

The end result is one line per email, with the epoch time, the tagged email address and the original format of the date and time. Something like this:

1300731078 cpan-tag@addr  Mon, 21 Mar 2011 11:11:18 -0700
1300731122 vmware-tag@addr Mon, 21 Mar 2011 11:12:02 -0700
1300731122 vmware-tag@addr Mon, 21 Mar 2011 11:12:02 -0700
1300732902 unicorn-tag@addr Mon, 21 Mar 2011 11:41:42 -0700

Next, I want to find the first occurrence of each tagged address.

#!/usr/bin/perl
use strict;
my %seen;
while(<>) {
    chomp;
    my ($stamp, $address) = split / /;
    unless(exists $seen{$address}) {
	print "$_n";
	$seen{$address} = 1;
    }
}

I sort the list of addresses numerically, then use this script to display the first time each email address received spam:

sort -n
That reduces the amount of data enough that I can look at it by hand. What did I find? Several interesting things, but I'm just going to mention one here.
1299111914 casemate-tag@addr Wed, 2 Mar 2011 16:25:14 -0800
1307104954 dell-tag@addr Fri, 3 Jun 2011 05:42:34 -0700
1307104986 codefast-tag@addr Fri, 3 Jun 2011 05:43:06 -0700
Casemate and Codefast have only ever mailed me via iContact, so given iContact's history it seems likely that those leaks were via iContact.
Dell, on the other hand, have mailed me directly and through several ESPs - and I don't recall them using iContact. Looking at the timestamps (and the content of the spams) it's clear that the Dell and Codefast tagged addresses were both sent spam for the first time as part of the same spamrun - so it's almost certain that they leaked at the same time.
Looking for iContacts bounce domain (icpbounce.com) in my mailbox I do find that Dell used them briefly, on May 4th. So that's pretty compelling evidence that iContact leaked all three addresses. (Which means my previous theory about Dell customer addresses leaking, based on misleading statements from Intervision, was wrong.)
There's another thing that's interesting... iContact has had a history of email breaches. The data I have here (and it's matched by a couple of older data points, if I recall correctly) shows spam being sent to newly leaked addresses on the 2nd or 3rd of the month.
I wonder if iContact does a batch export to a subcontractor, or an offsite backup or something similar on the first of each month?

The weak link in security

Gathering data from PACER

Security framework document published

laura
Apr 26, 2011

Industry

The Online Trust Alliance has published a security framework for ESPs.
Overall, I think it’s a useful starting point. I don’t agree with all of their suggestions. Some of them are expensive and provide little increase in security. While others decrease security, like the suggestion to force regular password changes.
I think the most important part of the document is the question section. The key to effective security measures is understanding threats. Answering the self assessment questions and thinking about internal processes will help identify potential threats and their vectors.
The document is not a panacea, and even companies that implement all of their recommendations will still be open to attacks from other avenues. But it certainly is a very good way to open the security discussion.

steve
Apr 28, 2011

Industry

Two factor authentication, or the snappy acronym 2FA, is something that you’re going to be hearing a lot about over the next year or so, both for use by ESP employees (in an attempt to reduce the risks of data theft) and by ESP customers (attempting to reduce the chance of an account being misused to send spam). What is Authentication?
In computer security terms authentication is proving who you are – when you enter a username and a password to access your email account you’re authenticating yourself to the system using a password that only you know.
Authentication (“who you are”) is the most visible part of computer access control, but it’s usually combined with two other A’s – authorization (“what you are allowed to do”) and accounting (“who did what”) to form an access control system.
And what are the two factors?
Two factor authentication means using two independent sources of evidence to demonstrate who you are. The idea behind it is that it means an attacker need to steal two quite different bits of information, with different weaknesses and attack vectors, in order to gain access. This makes the attack scenario much more complex and difficult for an attacker to carry out.
It’s important that the different factors are independent – requiring two passwords doesn’t count as 2FA, as an attack that can get the first password can just as easily get the second password. Generally 2FA requires the user to demonstrate their identity via two out of three broad ways:

steve
Apr 6, 2011

Best Practices

After Epsilon lost a bunch of customer lists last week, I’ve been keeping an eye open to see if any of the vendors I work with had any of my email addresses stolen – not least because it’ll be interesting to see where this data ends up.
Yesterday I got mail from Marriott, telling me that “unauthorized third party gained access to a number of Epsilon’s accounts including Marriott’s email list.”. Great! Lets start looking for spam to my Marriott tagged address, or for phishing targeted at Marriott customers.
I hit what looks like paydirt this morning. Plausible looking mail with Marriott branding, nothing specific to me other than name and (tagged) email address.
It’s time to play Real. Or. Phish?
1. Branding and spelling is all good. It’s using decent stock photos, and what looks like a real Marriott logo.
All very easy to fake, but if it’s a phish it’s pretty well done. Then again, phishes often steal real content and just change out the links.
Conclusion? Real. Maybe.
2. The mail wasn’t sent from marriott.com, or any domain related to it. Instead, it came from “Marriott@marriott-email.com”.
This is classic phish behaviour – using a lookalike domain such as “paypal-billing.com” or “aolsecurity.com” so as to look as though you’re associated with a company, yet to be able to use a domain name you have full control of, so as to be able to host websites, receive email, sign with DKIM, all that sort of thing.
Conclusion? Phish.
3. SPF pass
Given that the mail was sent “from” marriott-email.com, and not from marriott.com, this is pretty meaningless. But it did pass an SPF check.
Conclusion? Neutral.
4. DKIM fail
Authentication-Results: m.wordtothewise.com; dkim=fail (verification failed; insecure key) header.i=@marriott-email.com;
As the mail was sent “from” marriott-email.com it should have been possible for the owner of that domain (presumably the phisher) to sign it with DKIM. That they didn’t isn’t a good sign at all.
Conclusion? Phish.
5. Badly obfuscated headers
From: =?iso-8859-1?B?TWFycmlvdHQgUmV3YXJkcw==?= <Marriott@marriott-email.com> Subject: =?iso-8859-1?B?WW91ciBBY2NvdW50IJYgVXAgdG8gJDEwMCBjb3Vwb24=?=
Base 64 encoding of headers is an old spammer trick used to make them more difficult for naive spam filters to handle. That doesn’t work well with more modern spam filters, but spammers and phishers still tend to do it so as to make it harder for abuse desks to read the content of phishes forwarded to them with complaints. There’s no legitimate reason to encode plain ascii fields in this way. Spamassassin didn’t like the message because of this.
Conclusion? Phish.
6. Well-crafted multipart/alternative mail, with valid, well-encoded (quoted-printable) plain text and html parts
Just like the branding and spelling, this is very well done for a phish. But again, it’s commonly something that’s stolen from legitimate email and modified slightly.
Conclusion? Real, probably.
7. Typical content links in the email
Most of the content links in the email are to things like “http://marriott-email.com/16433acf1layfousiaey2oniaaaaaalfqkc4qmz76deyaaaaa”, which is consistent with the from address, at least. This isn’t the sort of URL a real company website tends to use, but it’s not that unusual for click tracking software to do something like this.
Conclusion? Neutral
8. Atypical content links in the email
We also have other links:

Who leaked my address, and when?

Related Posts

Security framework document published

What is Two Factor Authentication?

Real. Or. Phish?

Who leaked my address, and when?

Share :

Related Posts

Security framework document published

What is Two Factor Authentication?

Real. Or. Phish?