Dueling data

One of the things I miss about being in science is the regular discussions (sometimes heated) about data and experimental results. To be fair, I get some of that when talking about email stuff with Steve. We each have some strong viewpoints and aren’t afraid to share them with each other and with other people. In fact, one of the things we hear most when meeting folks for the first time is, “I love it when you two disagree with each other on that mailing list!” Both of us have engineering and science backgrounds, so we can argue in that vein.
One of the challenges of seemingly contradictory data is figuring out why it seems to disagree. Of course, in science the first step is always to look at your experimental design and data collection. Did I do the experiment right? (Do it again. Always do it again.) Did I record the data correctly? Is the design right? And what did I do differently from what you did? For instance, at one of my labs we discovered that mixing a reagent in plastic tubes created a different outcome from mixing the reagent in glass vials. There are so many things you don’t even think of as variables that still affect the outcome of an experiment.

What’s that got to do with email?

In the email space we have lots of people sharing data. Some of it is data we like – that is data that confirms our perceptions. And some of it is data we don’t like – data that contradicts our perceptions.
Recently two different ESPs have published contradictory data about purging subscribers and removing recipients from your lists. Mailchimp published “Inactive subscribers are still valuable customers” and Hubspot published “What Happened to Our Metrics After We Stopped Sending So Much Email.”
These two publications seem to be a bit contradictory. One is saying that inactive subscribers, subscribers who haven’t opened or clicked on emails in a while, are still valuable sources of revenue. The other is saying removing inactive subscribers increases email metrics like opens and clicks. So what’s really going on?

Different methods measure different things

Mailchimp looked at the revenue generated by inactive subscribers as compared to revenue from non-subscribers. They specifically looked at e-commerce senders mailing to previous purchasers.
Hubspot looked at various email metrics and how they changed when removing subscribers. They specifically looked at recipients to their own mailing list.
In many ways that’s the end of the story. The two studies look at different things. They looked at different populations. They measured different things. They are not comparable. They’re not even really contradictory due to the significant differences in the study population.

Well, that’s not very useful

Sorry. Research tells us answers, but doesn’t always give us clear and actionable answers.
The reality is, neither of these was a designed experiment. Rather, they both describe observed behavior in “the wild” as it were. The research is much closer to epidemiology than any other branch of science. Epidemiology tells us what happens, but doesn’t necessarily tell us how to either make something happen or stop something from happening. Back when I was taking poultry pathology in grad school, we did quite a bit of epidemiology and it’s HARD. For instance, one example we studied was an avian disease outbreak that seemed totally random. After months and months of work, research, interviews and study, the investigators finally figured out the infection was being carried on the car tires of a particular salesperson. That’s how hard epidemiology is.
A lot of deliverability and email marketing is like epidemiology. We know what worked in the past, but sometimes we’re chasing a guy with a contagious disease on his tires.

No, really, what do you think about the data?

I think the data is right. And I do think we can take some lessons from it.

  • Hubspot did see increased email engagement with their subscribers when they stopped mailing quite so much.
  • Mailchimp customers did see actual revenue from their inactive subscribers.

Let’s rephrase what Mailchimp said they discovered: Inactive subscribers buy more than non-subscribers and don’t buy as much as active subscribers. That’s one of those things where my only response is, “Well, yes, we all kinda knew that but it’s nice someone did the work.”
Let’s rephrase what Hubspot said they discovered: If you send too much mail you wear out your receivers and they pay less attention. Again, we knew that.
But what didn’t they say?

  • Mailchimp didn’t mention delivery changes.
  • Hubspot didn’t mention revenue.

We don’t know whether Mailchimp saw deliverability differences. In the face of more revenue, it’s not really an issue but their delivery stats may have been worse.
We don’t know if Hubspot saw increased revenue (although we do have their 10-K that shows some revenue increase). But they’re not a commerce shop, they’re not directly selling through email. Their emails drive potential readers. Eventually the hope is (I’m assuming) the readers will convert, but the Hubspot emails are not the same as e-commerce email.

You didn’t answer the question.

I did, though. Both things are true.
If you are in e-commerce you’ll make revenue from your inactive subscribers; so you should prune them carefully.
If you are driving site engagement you’ll increase readership by removing inactive subscribers; so you can probably be more aggressive in pruning.
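As a rough sketch of how those two conclusions could turn into different pruning rules, here is one way to express them in Python. Everything specific in it is an assumption for illustration: the program names, the 365-day and 90-day windows, and the `Subscriber` fields are hypothetical and do not come from either the Mailchimp or the Hubspot data.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Hypothetical inactivity windows, in days. Neither study published
# thresholds like these; tune them against your own program's data.
INACTIVITY_WINDOW_DAYS = {
    "ecommerce": 365,   # prune carefully: inactive subscribers may still buy
    "engagement": 90,   # prune more aggressively: the goal is readership
}


@dataclass
class Subscriber:
    email: str
    last_open_or_click: Optional[date]    # None means never engaged
    last_purchase: Optional[date] = None  # None means never purchased


def should_prune(sub: Subscriber, program: str, today: date) -> bool:
    """Return True if this subscriber is a candidate for removal."""
    cutoff = today - timedelta(days=INACTIVITY_WINDOW_DAYS[program])

    engaged_recently = (
        sub.last_open_or_click is not None and sub.last_open_or_click >= cutoff
    )
    if engaged_recently:
        return False

    if program == "ecommerce":
        # No opens or clicks, but a recent purchase still counts as value.
        bought_recently = (
            sub.last_purchase is not None and sub.last_purchase >= cutoff
        )
        return not bought_recently

    return True
```

The same subscriber can get a different answer depending on the program: someone who last opened seven months ago is kept under the e-commerce window and pruned under the engagement window. That is the "both things are true" point in code form.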

What’s right for my program?

It depends.
Is your program closer to the mail studied by Mailchimp? Or is your program closer to the mail studied by Hubspot?
We work with a lot of different kinds of senders to find the answer that is right for their business, their subscribers and their marketing program. Sometimes it means pruning, sometimes it doesn’t. Contact us for more information on how we can help your program make sense of seemingly conflicting data.
 

Related Posts

Data Cleansing

According to Ken, Outward Media has productized a database of 300,000,000 email addresses that should never be mailed.


Data, data, elections and data

One of the interesting stories coming out of the recent US Presidential election is how much data the Obama Campaign collected about voters, volunteers and donors. Today Politico talks about how valuable that data is, and how many Democrats want to get their hands on it.


Data is the key to deliverability

Last week I had the pleasure of speaking to the Sendgrid Customer Advisory Board about email and deliverability. As usually happens when I give talks, I learned a bunch of new things that I’m now integrating into my mental model of email.
One thing that bubbled up to take over a lot of my thought processes is how important data collection and data maintenance are to deliverability. In fact, I’m reaching the conclusion that the vast majority of deliverability problems stem from data issues. How data is collected, how data is managed, how data is maintained all impact how well email is delivered.
Collecting Data
There are many pathways used to collect data for email: online purchases, in-store purchases, signups on websites, registration cards, trade shows, fishbowl drops, co-reg… the list goes on and on. In today’s world there is a big push to make data collection as frictionless as possible. Making collection processes frictionless (or low friction) often means limiting data checking and correction. In email this can result in mail going to people who never signed up. Filters are actually really good at identifying mail streams going to the wrong people.
The end result of poor data collection processes is poor delivery.
There are lots of ways to collect data that incorporate some level of data checking and verifying the customer’s identity. There are ways to do this without adding any friction, even. About 8 years ago I was working with a major retailer that was dealing with an SBL listing due to bad addresses in their store signup program. What they ended up implementing was tagged coupons emailed to the user. When the user went to the store to redeem the coupons, the email address was confirmed as associated with the account. We took what the customers were doing anyway, and turned it into a way to do closed loop confirmation of their email address.
Managing Data
Data management is a major challenge for lots of senders. Data gets pulled out of the database of record and then put into silos for different marketing efforts. If the data flow isn’t managed well, the different streams can have different bounce or activity data. In a worst case scenario, bad addresses, like spamtraps, can be reactivated and lead to blocking.
This isn’t theoretical. Last year I worked with a major political group that was dealing with an SBL issue directly related to poor data management. Multiple databases were used to store data and there was no central database. Because of this, unsubscribed and inactivated addresses were reactivated. This included a set of data that was inactivated to deal with a previous SBL listing. Eventually, spamtraps were mailed again and they were blocked. Working with the client data team, we clarified and improved the data flow so that inactive addresses could not get accidentally or unknowingly reactivated.
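One way to picture that kind of fix is a single suppression list that every silo has to check before sending. The sketch below is only an illustration of the idea, not the client's actual system: the `SuppressionList` class, its method names and the example addresses are all hypothetical.

```python
from typing import Dict, Iterable, List


class SuppressionList:
    """Central record of addresses that must never be reactivated."""

    def __init__(self) -> None:
        self._reasons: Dict[str, str] = {}

    @staticmethod
    def _normalize(email: str) -> str:
        # Simplifying assumption: treat addresses as case-insensitive and
        # ignore surrounding whitespace so silo exports match reliably.
        return email.strip().lower()

    def suppress(self, email: str, reason: str) -> None:
        # Keep the first recorded reason; later imports cannot overwrite it.
        self._reasons.setdefault(self._normalize(email), reason)

    def is_suppressed(self, email: str) -> bool:
        return self._normalize(email) in self._reasons

    def filter_sendable(self, addresses: Iterable[str]) -> List[str]:
        # Every marketing silo runs its export through this before mailing.
        return [a for a in addresses if not self.is_suppressed(a)]


central = SuppressionList()
central.suppress("Trap@example.com", "spamtrap")
central.suppress("left@example.com", "unsubscribed")

silo_export = ["trap@example.com", "reader@example.com", " LEFT@example.com "]
sendable = central.filter_sendable(silo_export)
```

The design choice that matters is that suppression lives in one place and only ever grows. A silo can hold a stale copy of the list data, but it cannot mail an address the central record has retired.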
Maintaining Data
A dozen years ago few companies needed to think about any data maintenance processes other than “it bounces and we remove it.” Most mailbox accounts were tied into dialup or broadband accounts. Accounts lasted until the user stopped paying and then mail started bouncing. Additionally, mailbox accounts often had small limits on how much data they could hold. My first ISP account was limited to 10MB, and that included anything I published on my website. I would archive mail monthly to keep mail from bouncing due to a full mailbox.
But that’s not how email works today. Many people have migrated to free webmail providers for email. This means they can create (and abandon) addresses at any time. Free webmail providers have their own rules for bouncing mail, but generally accounts last for months or even years after the user has stopped logging into them. With the advent of multi gigabyte storage limits, accounts almost never fill up.
These days, companies need to address what they’re going to do with data if there’s no interaction with the recipient in a certain time period. Otherwise, bad data just keeps accumulating and lowering deliverability.
Deliverability is all about the data. Good data collection, good data management and good data maintenance result in good email delivery. Doing the wrong thing with data leads to delivery problems.
 
 
