Filter complexity

During the Q&A last week, I mentioned an example of a type of filter to demonstrate how complex filters are. There was some confusion about what I was saying, so I thought I’d write a blog post explaining it.

Background

This story came from another deliverability person, let’s call her ESPer. Some of her customers (Customers) use a third-party service that provides tracking links (Tracker). Tracker sent email to its customers saying that mail with more than 3 links was getting blocked:

It has come to our attention that Google has recently started flagging emails with multiple tracked links as suspicious or malicious. For example, if you have an email with more than 3 links (including any in your signature) and have Tracker link tracking turned on, recipients who use Gmail may see your message flagged with a warning. If your email contains 3 or fewer tracked links then you will be unaffected by this issue.

This prompted some Customers to call the ESP and ask whether Google was blocking mail with 3 or more links.

The Investigation

Multiple ESP folks checked their systems and found no correlation between multiple links in an email and bulk foldering at Gmail. I checked my Gmail account and a number of emails in my inbox have 4 or 5 or 6 links in them. None with the Tracker tracking cookie, though.
To test this a little more, I tried to sign up for a free account with Tracker. Tracker is used through a Firefox add-on, but it’s unsigned, so I decided not to install it. It’s probably not malware, but if they can’t be bothered to sign their add-on, I’m not going to risk installing it on my machine, even for my readers.

What we know

  1. Gmail is blocking mail that has 3 or more links when one of them is a Tracker link.
  2. Remove the Tracker link and the mail goes to the inbox.
  3. Send with fewer than 3 links and a Tracker link and the mail goes to the inbox.

What we speculate

One of Tracker’s customers is sending spam with 3 or more links plus the tracking link. Google has identified this mail as a problem and is blocking mail that has the same characteristics.
Removing the Tracker link should get the mail into the inbox.
Removing links so there are fewer than 3 links should get the mail into the inbox.

What this tells us

Filtering is complex. Like Really Really Complex. It’s not the presence of the tracking URL, it’s the presence of the tracking URL and 3 other URLs. Generally when we here at Word to the Wise try and test “what’s wrong” we’ll start removing URLs to see if one particular URL is causing a problem. In this case, that testing would have led us to an erroneous conclusion. We might find one URL “responsible” but only because we’d lowered the total number of URLs under 3.
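To make the testing trap concrete, here is a minimal sketch in Python. It is a toy model, not Gmail’s actual filter, which isn’t public: the rule (a Tracker link plus more than 3 links in total), the `tracker.example` domain and all function names are assumptions made up for illustration. It shows that removing URLs one at a time makes every URL look “responsible”, while testing link count and Tracker presence together exposes the combination.

```python
# Toy model only: the real Gmail filter is not public. The rule below is an
# assumption based on the behaviour described in this post.
TRACKER_DOMAIN = "tracker.example"  # hypothetical stand-in for the Tracker service
LINK_LIMIT = 3  # assumed: blocked when total links exceed this AND a Tracker link is present


def is_blocked(urls):
    """Return True when the toy filter would block a message with these URLs."""
    has_tracker = any(TRACKER_DOMAIN in url for url in urls)
    return has_tracker and len(urls) > LINK_LIMIT


def leave_one_out(urls):
    """Remove each URL in turn and list the URLs whose removal unblocks the mail."""
    return [
        removed
        for i, removed in enumerate(urls)
        if not is_blocked(urls[:i] + urls[i + 1:])
    ]


def two_factor_grid(tracker_url, other_urls):
    """Test every count of ordinary links, with and without the Tracker link.

    Keys are (number_of_other_links, has_tracker_link).
    """
    results = {}
    for n_other in range(len(other_urls) + 1):
        base = other_urls[:n_other]
        results[(n_other, False)] = is_blocked(base)
        results[(n_other, True)] = is_blocked(base + [tracker_url])
    return results


if __name__ == "__main__":
    message = [
        "https://tracker.example/c/1",
        "https://a.example/",
        "https://b.example/",
        "https://c.example/",
    ]
    # Every URL shows up as a "culprit", not just the Tracker link.
    print(leave_one_out(message))
    # The grid shows blocking only when both conditions hold together.
    for key, blocked in sorted(two_factor_grid(message[0], message[1:]).items()):
        print(key, blocked)
```

In this toy model the one-at-a-time test flags all four URLs, because dropping any of them brings the message back under the link limit. Only the grid, which varies both factors, shows that neither the Tracker link nor the link count is enough on its own.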
I’ve been telling people and clients that filters are complex. More than 3 URLs + a specific URL is something that people wouldn’t normally identify as a filter criterion. But the neural net / machine learning / AI filters in use at Gmail noticed that mail with a particular number of links plus the Tracker link isn’t wanted by the recipients. The filters then started blocking mail selectively based on those criteria.
Filters aren’t magic, but sometimes the complexity makes them seem like it.

Related Posts

Do system administrators have too much power?

Yesterday, Laura brought a thread from last week to my attention, and the old-school ISP admin and mail geek in me felt the need to jump up and say something in response to Paul’s comment. My text here is all my own, and is based upon personal experience as well as those of my friends. That said, I’m not speaking on their behalf, either. 🙂
I found Paul’s use of the word ‘SysAdmin’ to be a mighty wide (and — in my experience — probably incorrect) brush to be painting with, particularly when referring to operations at ISPs with any significant number of mailboxes. My fundamental opposition to use of the term comes down to this: It’s no longer 1998.
The sort of rogue (or perhaps ‘maverick’) behavior to which you refer absolutely used to be a thing, back when a clean 56k dial-up connection was the stuff of dreams and any ISP that had gone through the trouble to figure out how to get past the 64k user limit in the UNIX password file was considered both large and technically competent. Outside of a few edge cases, I don’t know many system administrators these days who are able to (whether by policy or by access controls) — much less want to — make such unilateral deliverability decisions.
While specialization may be for insects, it’s also inevitable whenever a system grows past a certain point. When I started in the field, there were entire ISPs that were one-man shows (at least on the technical side). This simply doesn’t scale. Eventually, you start breaking things up into departments, then into services, then teams assigned to services, then parts of services assigned to teams, and back up the other side of the mountain, until you end up with a whole department whose job it is to run one component of one service.
For instance, let’s take inbound (just inbound) email. It’s not uncommon for a large ISP to have several technical teams responsible for the processing of mail being sent to their users:


Delivering to Gmail

Gmail is a challenge for even the best senders these days.
With the recent Gmail changes there isn’t any clear fix to getting open rates or inbox delivery back up. Some of it depends on what is causing Gmail to filter the mail. Changing subject lines, from name, from address may get mail back to the inbox in the short term, but it only works until the filters catch up.
What I am seeing, across a number of clients, is that Gmail is doing a lot of content reputation and that content reputation gets spread across senders of that content.  That means you want to look at who is sending any mail on your behalf (mentioning your domain or pointing at your website) and their practices. If they have poor practices, then it can reflect badly on you and result in filtering.
From what I’ve seen, these are very deliberate filtering decisions by Google. And it’s making mail a lot harder for many, many senders. But I think it is, unfortunately, the new reality.


Thoughts on Gmail filtering

Gmail has some extremely complex filters. They’re machine learning based and measure hundreds of things about incoming mail. The filters are continually adjusting to changes and updating how they treat specific mail.
One consequence of continually adjusting machine learning filters is that filtering is not static. What passes to the inbox now may not pass in a couple of hours.
One of the other challenges with Gmail filters is that they look at all the mail mentioning a particular domain and so affiliate mail and 3rd party mail can affect delivery of corporate mail.
The good news is that continually adjusting filters adapt to positive changes as well as negative ones. In fact, I recently made a segmentation suggestion to a client and they saw a significant increase in inbox delivery at Gmail the next day.
Gmail can be a challenge for delivery, but send mail users want and mail does go to the inbox.
