Same MX, different filters

One of the things I do for clients is look at who is really handling mail for their subscribers. Steve’s written a nifty tool that does an MX lookup for a list of domains. Then I have a SQL script that takes the raw MX lookup results and categorizes each domain not by the domain or even the MX, but by the underlying mail filter.
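To give a feel for what that kind of categorization looks like, here is a minimal sketch in Python. It is not Steve’s tool or my SQL script. The suffix table, the `GMAIL_DOMAINS` set and the `classify` function are hypothetical names made up for illustration, and the table holds only a couple of example entries. It assumes the MX lookup has already been done and you have a list of MX hostnames per domain.

```python
# Hypothetical suffix-to-filter table. These entries are illustrative
# assumptions, not the rules from the real script.
MX_SUFFIX_TO_FILTER = {
    "google.com": "Google",
    "googlemail.com": "Google",
    "protection.outlook.com": "Microsoft 365",
}

# Consumer Gmail domains (assumed list). Any other domain whose MX
# points at Google is treated as hosted Google Apps.
GMAIL_DOMAINS = {"gmail.com", "googlemail.com"}


def classify(domain, mx_hosts):
    """Return a filter label for a domain, given its MX hostnames."""
    domain = domain.lower().rstrip(".")
    for host in mx_hosts:
        host = host.lower().rstrip(".")
        for suffix, label in MX_SUFFIX_TO_FILTER.items():
            if host == suffix or host.endswith("." + suffix):
                if label == "Google":
                    # Same MX family, but reported as two separate filters.
                    return "Gmail" if domain in GMAIL_DOMAINS else "Google Apps"
                return label
    return "Unknown"
```

Calling `classify("example.com", ["aspmx.l.google.com"])` gives "Google Apps", while `classify("gmail.com", ["gmail-smtp-in.l.google.com"])` gives "Gmail". The point is that the split between the two happens on the recipient domain, not on the MX.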

Part of that script classifies domains hosted by Google Apps as a separate filter from Gmail, even though they’re all handled by the same underlying system. I never had any real, definitive evidence that the filters were different, just a lot of indirect evidence from seeing how mail was delivered.

That changed today as I was checking delivery for a client. One of their mailstreams is getting 100% inboxing at Gmail, but 100% spam at Google Apps. That’s pretty clear evidence that Google Apps and Gmail are different filters.

image of inbox monitoring showing the same message going 100% inbox at Gmail and 100% spam at Google Apps

I started looking at that mail in particular. Initially I noticed a feature of the subject line that looked like it may be something a business filter would trigger on. But, on looking deeper, there are other features that make it clear this is a different mail stream. What isn’t different is the From domain, the SPF domain or the DKIM signature.
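If you want to check those three identifiers yourself, a rough sketch like the one below will pull them out of a raw message using only the Python standard library. The function name `auth_identity` is made up for this example. It assumes the `Return-Path` header reflects the envelope sender (the domain SPF checks), and it only reads the `d=` tag of each `DKIM-Signature` header. It does not verify SPF or DKIM.

```python
import re
from email import message_from_string
from email.utils import parseaddr


def _domain_of(header_value):
    """Extract the lowercased domain from an address header value."""
    address = parseaddr(header_value)[1]
    return address.rpartition("@")[2].lower()


def auth_identity(raw_message):
    """Return (From domain, SPF domain, set of DKIM d= domains)."""
    msg = message_from_string(raw_message)
    from_domain = _domain_of(msg.get("From", ""))
    # Assumption: Return-Path holds the envelope sender used for SPF.
    spf_domain = _domain_of(msg.get("Return-Path", ""))
    dkim_domains = set()
    for signature in msg.get_all("DKIM-Signature", []):
        # Find the d= tag; tags are separated by semicolons.
        match = re.search(r"(?:^|;)\s*d\s*=\s*([^;\s]+)", signature)
        if match:
            dkim_domains.add(match.group(1).lower())
    return from_domain, spf_domain, frozenset(dkim_domains)
```

Run it on one message from each stream and compare the results. If the tuples are equal, the streams share the same From domain, SPF domain and DKIM signing domain, and whatever the filter is keying on lives somewhere else in the message.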

In any case, this particular pattern makes it pretty clear that Google is specifically depositing this mail stream in the bulk folder of Google Apps users. Meanwhile the messages are going to the inbox at Gmail and all the other messages from this sender are going to the inbox at both places.

Google filters are specific and sensitive. They can identify different mail streams and target messages separately between Gmail and Google Apps.

Related Posts

Gmail survey rough analysis

I closed the Google Postmaster Tools (GPT) survey earlier today. I received 160 responses, mostly from the link published here on the blog and in the M3AAWG Senders group.
I’ll be putting a full analysis together over the next couple of weeks, but thought I’d give everyone a quick preview / data dump based on the graphs and summaries SurveyMonkey makes available.
Of 160 respondents, 154 are currently using GPT. Some of the folks who said they didn’t have a GPT account also said they logged into it at least once a day, so clearly I have some data cleanup to do.
57% of respondents monitored customer domains. 79% monitored their own domains.
45% of respondents logged in at least once a day to check. Around 40% of respondents check IP and/or domain reputation daily. Around 25% of respondents use the authentication, encryption and delivery errors pages for troubleshooting.
10% said the pages were very easy to understand. 46% said they’re “somewhat easy” to understand.
The improvement suggestions are text based, but SurveyMonkey helpfully puts them together into a word cloud. It’s about what I expected. But I’ll dig into that data.
10% of respondents said they had built tools to scrape the page. 50% said they hadn’t but would like to.
In terms of the problems they have, 82% of people said they want to be able to create alerts, and 60% said they want to add the data to dashboards or reporting tools.

97% of respondents who currently have a Google Postmaster Tools account said they are interested in an API for the data. I’m sure the 4 who aren’t interested won’t care if there is one.
47% of respondents said if there was an API they’d have tools using it by the end of 2017. 73% said they’d have tools built by end of Q1 2018.
33% of respondents send more than 10 million emails per day.
75% of respondents work for private companies.
70% of respondents work for ESPs. 10% work for retailers or brands sending through their own infrastructure.
That’s my initial pass through the data. I’ll put together something a bit more coherent, with some more useful analysis, in the coming week and publish it. I am already seeing some interesting correlations I can run to get useful info out.
Thank you to everyone who participated! This is interesting data that I will be passing along to Google. Rough mental calculation indicates that respondents are responsible for multiple billions of emails a day.
Thanks!


Gmail, machine learning, filters

I’m sure by now readers have seen the article from Gmail “Spam does not bring us joy — ridding Gmail of 100 million more spam messages with TensorFlow.” If you haven’t seen it, go read it. It’s not often companies write about their filtering philosophy and what tools they’re using to manage incoming bad mail.


What kind of mail do filters target?

All too often we think of filters as a linear scale. There’s blocking on one end, and there’s an inbox on the other. Every email falls somewhere on that line.
Makes sense, right? Bad mail is blocked, good mail goes to the inbox. The bulk folder exists for mail that’s not bad enough to block, but isn’t good enough to go to the inbox.
Once we get to that model, we can think of filters as just different tolerances for what is bad and good. Using the same model, we can see aggressive filters block more mail and send more mail to bulk, while letting less into the inbox. There are also permissive filters that block very little mail and send most mail to the inbox.
That’s a somewhat useful model, but it doesn’t really capture the full complexity of filters. There isn’t just good mail and bad mail. Mail isn’t simply solicited or unsolicited. Filters take into account any number of factors before deciding what to do with mail.
