Dodgy PDF handling at Gmail

We sent out some W-9s this week. For non-Americans and those lucky enough not to have to deal with IRS paperwork those are tax forms.
They’re simple single page forms with the company name, address and tax ID numbers on them. Because this is the 21st Century we don’t fill them in with typewriters and snail mail them out, we fill in a form online at the IRS website which gives us PDFs to download that we then send out via email.

We started to get replies from people we’d sent them to that we hadn’t included the tax ID number. Which was odd, because it was definitely there in the PDFs we’d sent.
The reports of missing numbers came from Google Apps users, so we sent a copy to one of our Gmail addresses to see. Sure enough, when you click on the attachment it’s mostly there, but some of the digits of the tax ID number are missing.

And all the spaces have been stripped from our address.

The rest of the form looked fine, but the information we’d entered was scrambled. Downloading the PDF from Gmail and displaying it – everything is there, and in the right place.
Weird. After a brief “Are gmail hiding things that look like social security numbers?” detour I realized that the IRS website was probably generating the customized forms using PDF annotations.
PDF is a very powerful, but very complex, file format. It’s not just an image, it’s a combination of different elements – images, lines, vector artwork, text, interactive forms, all sorts of things – bundled together into a single file. And you can add elements to an existing PDF file to, for example, overlay text on to it. These “annotations” are a common way to fill in a PDF form, by adding text in the right place over the top of an existing template PDF.
I cracked the PDF open with some forensics tools and sure enough, the IRS had generated the PDF form using annotations.
 

<< /Type /Annot /DV (Palo Alto, CA) /T (topmostSubform[0].Page1[0].Address[0].f1_8[0])
/Rect [ 57.6 539.968 388.8 553.969 ] /AP 81 0 R /FT /Tx /DA (/Helvetica-Bold 9 Tf 0 g)

And the Gmail PDF viewer isn’t rendering that annotated text correctly.
I’ve filed a bug sent feedback to Google, so hopefully it’ll be fixed. Meanwhile, if you’re sending customized content to recipients using PDF you should probably check that it renders correctly when previewed in Gmail.
 

Related Posts

Tell us about how you use Gmail Postmaster Tools

One of the things I hear frequently is that folks really want access to Google Postmaster Tools through an API. I’ve also heard some suggestions that we should start a petition. I thought a better idea was to put together a survey showing how people are using GPT and how high the demand is for an API.
They’re a data company, let’s give them data.

I’ve put together a survey looking at how people are using GPT. It’s 4 pages and average time to take the survey is around 7 minutes. Please give us your feedback on GPT usage.
I’m planning on leaving the survey open through the first week in November. Then I’ll pull data together and share here and with Google.

Read More

Gmail survey rough analysis

I closed the Google Postmaster Tools (GPT) survey earlier today. I received 160 responses, mostly from the link published here on the blog and in the M3AAWG Senders group.
I’ll be putting a full analysis together over the next couple weeks, but thought I’d give everyone a quick preview / data dump based on the analysis and graphs SurveyMonkey makes available in their analysis.
Of 160 respondents, 154 are currently using GPT. Some of the folks who said they didn’t have a GPT account also said they logged into it at least once a day, so clearly I have some data cleanup to do.
57% of respondents monitored customer domains. 79% monitored their own domains.
45% of respondents logged in at least once a day to check. Around 40% of respondents check IP and/or domain reputation daily. Around 25% of respondents use the authentication, encryption and delivery errors pages for troubleshooting.
10% said the pages were very easy to understand. 46% said they’re “somewhat easy” to understand.
The improvements suggestions are text based, but SurveyMonkey helpfully puts them together into a word cloud. It’s about what I expected. But I’ll dig into that data. 
10% of respondents said they had built tools to scrape the page. 50% said they hadn’t but would like to.
In terms of the problems they have with the 82% of people said they want to be able to create alerts, 60% said they want to add the data to dashboards or reporting tools.

97% of respondents who currently have a Google Postmater Tools account said they are interested in an API for the data. I’m sure the 4 who aren’t interested won’t care if there is one.
47% of respondents said if there was an API they’d have tools using it by the end of 2017. 73% said they’d have tools built by end of Q1 2018.
33% of respondents send more than 10 million emails per day.
75% of respondents work for private companies.
70% of respondents work for ESPs. 10% work for retailers or brands sending through their own infrastructure.
That’s my initial pass through the data. I’ll put together something a bit more coherent and some more useful analysis in the coming week and publish it. I am already seeing some interesting correlations I can do to get useful info out.
Thank you to everyone who participated! This is interesting data that I will be passing along to Google. Rough mental calculation indicates that respondents are responsible for multiple billions of emails a day.
Thanks!

Read More

How long does it take to change reputation at Gmail?

Today I was chatting with a potential client who is in the middle of a frustrating warmup at Gmail. They’re doing absolutely the right things, it’s just taking longer than anyone wants. That’s kinda how it is with Gmail, while their algorithm can adapt quickly to changes. Sometimes, like when you’re warming up or trying to change a bad reputation, it can take 3 – 4 weeks to see any direct progress.This is a screenshot of IP reputation on Google Postmaster Tools. The sender made some significant changes in mail sending on some of their IP addresses starting in mid to late December. You can see, that the tools noticed and the reputation of those IPs bad to good fairly rapidly. It took a few more weeks of consistent sending for those two IPs to switch to yellow. And it took around another month for the reputation to flip to high.
Because this company is doing all the right things, and they’re seeing (as they describe it) some small amounts of improvement, I told them to give it another couple weeks. If they weren’t happy with their progress I could help them. But, frankly, until we can tell if this is something other than a normal warmup there isn’t much else to do.
When I got off the phone I felt very much like a doctor telling a patient to take two aspirin and call me in the morning. But, honestly, sometimes that is the right answer. Give it time.

Read More