Filter complexity

During the Q&A last week, I mentioned an example of a type of filter to demonstrate how complex filters are. There was some confusion about what I was saying, so I thought I’d write a blog post explaining it.

Background

This story came from another deliverability person; let’s call her ESPer. One of her ESP’s customers (Customers) is using a third-party service that provides tracking links (Tracker). Tracker sent email to their customers saying that mails with more than 3 links were getting blocked.

It has come to our attention that Google has recently started flagging emails with multiple tracked links as suspicious or malicious. For example, if you have an email with more than 3 links (including any in your signature) and have Tracker link tracking turned on, recipients who use Gmail may see your message flagged with a warning. If your email contains 3 or fewer tracked links then you will be unaffected by this issue.

This prompted some Customers to call the ESP and ask whether Google was blocking mail with 3 or more links.

The Investigation

Multiple ESP folks checked their systems and found no correlation between multiple links in an email and bulk foldering at Gmail. I checked my Gmail account and a number of emails in my inbox have 4 or 5 or 6 links in them. None with the Tracker tracking cookie, though.
In an effort to test this a little more, I tried to sign up for a free account with Tracker. Tracker is used through a Firefox add-on, but it’s unsigned, so I decided not to install it. It’s probably not malware, but if they can’t be bothered to sign their add-on, I’m not going to risk installing it on my machine, even for my readers.

What we know

  1. Gmail is blocking mail with 3 or more links plus a Tracker link.
  2. Remove the Tracker link and the mail goes to the inbox.
  3. Send with fewer than 3 links plus a Tracker link and the mail goes to the inbox.

What we speculate

One of Tracker’s customers is sending spam with 3 or more links plus the tracking link. Google has identified this mail as a problem and is blocking mail that has the same characteristics.
Removing the Tracker link should get the mail into the inbox.
Removing links so there are fewer than 3 should get the mail into the inbox.

What this tells us

Filtering is complex. Like Really Really Complex. It’s not the presence of the tracking URL; it’s the presence of the tracking URL and 3 other URLs. Generally when we here at Word to the Wise try to test “what’s wrong,” we start removing URLs to see if one particular URL is causing a problem. In this case, that testing would have led us to an erroneous conclusion. We might find one URL “responsible,” but only because removing it had dropped the number of other URLs below 3.
I’ve been telling people and clients that filters are complex. More than 3 URLs plus a specific URL is not something people would normally identify as a filter criterion. But the neural net / machine learning / AI filters in use at Gmail noticed that mail with a particular number of links plus the Tracker link isn’t wanted by the recipients. The filters then started blocking mail selectively based on those criteria.
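To make the testing trap concrete, here is a toy sketch in Python. It is not how Gmail’s filters work; nobody outside Google knows that. It only assumes a rule of the shape we speculated about: flag a message when it has a Tracker link and at least 3 other links. The domain `tracker.example`, the function names and the sample URLs are all made up for illustration.

```python
from urllib.parse import urlparse

# Hypothetical stand-in for the tracking service's link domain.
TRACKER_DOMAIN = "tracker.example"


def toy_filter_flags(urls):
    """Toy rule (an assumption, not Gmail's real logic):
    flag only when there is a Tracker link AND at least 3 other links."""
    tracker_links = [u for u in urls if urlparse(u).hostname == TRACKER_DOMAIN]
    other_links = [u for u in urls if urlparse(u).hostname != TRACKER_DOMAIN]
    return len(tracker_links) >= 1 and len(other_links) >= 3


def remove_one_at_a_time(urls):
    """Drop each URL in turn and record whether the rest is still flagged."""
    results = {}
    for i, removed in enumerate(urls):
        remaining = urls[:i] + urls[i + 1:]
        results[removed] = toy_filter_flags(remaining)
    return results


# One Tracker link plus 3 ordinary links: the combination the toy rule flags.
message_urls = [
    "https://tracker.example/c/abc",
    "https://shop.example/sale",
    "https://shop.example/unsubscribe",
    "https://social.example/profile",
]

print("original flagged:", toy_filter_flags(message_urls))
for url, still_flagged in remove_one_at_a_time(message_urls).items():
    print(url, "-> still flagged:", still_flagged)
```

Under this toy rule the original message is flagged, and removing any single URL clears the flag. Whichever URL you happen to pull first looks like the culprit, even though the rule is really about the combination.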
Filters aren’t magic, but sometimes the complexity makes them seem like it.
 
 
 
