DNS Flag Day

There are quite a lot of broken DNS servers out there. I’m sure that’s no surprise to you, but some of them might be yours. And you might not notice that until your domains stop working early next year.

DNS is quite an old protocol, and when it was originally specified there wasn’t really a good way to extend the protocol to add new features. That was fixed about 19 years ago when Extension Mechanisms for DNS (EDNS0) was specified, and solidly standardized in RFC 6891 in 2013. It added a backwards compatible way for a DNS client to ask “Hey! Do you support new features?” and for servers to include as part of their response “Yes! Yes I do!”.

That’s incredibly useful, and critical for extending the DNS to support new features (such as DNSSEC, or support for larger replies). And yet some authoritative DNS servers not only don’t support it, they misbehave when they’re asked if they support it. It’s been the case forever that DNS servers should just ignore (some sorts of) fields in requests if they don’t understand them. So when you send a request that includes an EDNS0 “Do you support new features?” field to a DNS server that doesn’t understand EDNS0 it should return a regular DNS response. Some (broken) nameservers don’t do that – instead they drop the request on the floor and don’t respond (or, even worse, crash). Eventually the recursive resolver will give up on the request.

(DNS servers broken in this way aren’t that rare in 2018 – just last week I had to add code to a DNS library I use so that it didn’t crash when it saw EDNS0 requests.)

Right now most recursive resolvers will see a timeout for a request that included EDNS0 and decide “Maybe it only failed because the remote server has buggy EDNS0 handling”. They’ll retry the request without EDNS0 and get an answer. This workaround means that the DNS will resolve eventually, after five or ten seconds of delay. Not good, but the web page will open or the mail will be delivered eventually.

But it’s a horrible workaround, and the developers of the most widely used recursive resolvers are done with this silliness. As of February 1st next year they’re not going to do it any more. If your DNS server is broken with respect to EDNS0 your hostnames won’t resolve via a large fraction of recursive resolvers. Your webpages won’t load, mail you send won’t have any SPF, DKIM or DMARC information or even any reverse DNS. Lots of things will break in a very visible way.

You can check whether your DNS server is broken or not, and get a bunch more technical details at dnsflagday.net.

Related Posts

Are they using DKIM?

It’s easy to tell if a domain is using SPF – look up the TXT record for the domain and see if any of them begin with “v=spf1”. If one does, they’re using SPF. If none do, they’re not. (If more than one does? They’re publishing invalid SPF.)
AOL are publishing SPF. Geocities aren’t.
For DKIM it’s harder, as a DKIM key isn’t published at a well-known place in DNS. Instead, each signed email includes a “selector” and you look up a record by combining that selector with the fixed string “._domainkey.” and the domain.
If you have DKIM-signed mail from them then you can find the selector (s=) in the DKIM-Signature header and look up the key. For example, Amazon are using a selector of “taugkdi5ljtmsua4uibbmo5mda3r2q3v”, so I can look up TXT records for “taugkdi5ljtmsua4uibbmo5mda3r2q3v._domainkey.amazon.com“, see that there’s a TXT record returned and know there’s a DKIM key.
That’s a particularly obscure selector, probably one they’re using to track DKIM lookups to the user the mail was sent to, but even if a company is using a selector like “jun2016” you’re unlikely to be able to guess it.
But there’s a detail in the DNS spec that says that if a hostname exists, meaning it’s in DNS, then all the hostnames “above” it in the DNS tree also exist (even if there are no DNS records for them). So if anything,_domainkey.example.com exists in DNS, so does _domainkey.example.com. And, conversely, if _domainkey.example.com doesn’t exist, no subdomain of it exists either.
What does it mean for a hostname to exist in DNS? That’s defined by the two most common responses you get to a DNS query.
One is “NOERROR” – it means that the hostname you asked about exists, even if there are no resource records returned for the particular record type you asked about.
The other is “NXDOMAIN” – it means that the hostname you asked about doesn’t exist, for any record type.
So if you look up _domainkey.aol.com you’ll see a “NOERROR” response, and know that AOL have published DKIM public keys and so are probably using DKIM.
(This is where Steve tries to find a domain that isn’t publishing DKIM keys … Ah! Al’s blog!)
If you look up _domainkey.spamresource.com you’ll see an “NXDOMAIN” response, so you know Al isn’t publishing any DKIM public keys, so isn’t sending any DKIM signed mail using that domain.
This isn’t 100% reliable, unfortunately. Some nameservers will (wrongly) return an NXDOMAIN even if there are subdomains, so you might sometimes get an NXDOMAIN even for a domain that is publishing DKIM. shrug
Sometimes you’ll see an actual TXT record in response – e.g. Yahoo or EBay – that’s detritus left over from the days of DomainKeys, a DomainKeys policy record, and it means nothing today.

Read More

Relaying Denied

I’ve got multiple clients right now looking for insights about bounce handling. This means I’m doing a lot of thought work about bounces and what they mean and how they match up and how different ISPs manage delivery and how different ESPs manage delivery and how it all fits together. One thing I’ve been trying to do is contextualize bounces based on what the reason is.
Despite what people may thing, spam filtering isn’t the only reason an email fails to deliver. There are lots of other reasons, too. There is a whole category of network problems like routing issues, TCP failures, DNS failures and such. There are address issues where a recipient simply doesn’t exist, or is blocking a particular sender. There are spam and authentication issues. The discussion of all these issues is way longer than a blog post, and I’m working on that.
One of the interesting bounces that is so rare most people, including me, never talk about is “Relaying Denied.” This is, however, one of the easier bounces to explain.
Relaying Denied means the mail server you’re talking to does not handle mail for the domain you’re sending to. 
Well, OK, but how does that happen?
There are a couple reasons you might get a “Relaying Denied” message, most of them having to do with a misconfiguration somewhere. For whatever reasons, the receiving server doesn’t handle mail for a domain.
DNS records are incorrect. These can be due to a number of things

Read More

The Internet is hard.

There are so many things that need to happen to make the Internet work. DNS entries need to be right. MXs need to be set up. Web servers need to be configured. And, let’s be honest, anyone who has ever run their own services on the Internet has flubbed a configuration.
We don’t think about it, because most of the time the configurations are handled by scripts and they do things right. But at some point someone needs to type in something and there’s a risk it will go horribly wrong. I’ve been digging into domain data for a client of mine today. I think I’m going cross-eyed over it. But I have found so many weird things that just mean someone isn’t paying attention to what they’re doing.
Like the domain that has a MX record that says:
nullmx
 
I’m pretty sure the intention of the domain owner is to publish a null MX. But they added an extraneous “0” in there and ended up publishing something really weird. Even worse, the MTA that this client is using is listing this as a “delivered” email. I’m pretty sure that mail to that domain never left the MTA.
I’ve found horribly typoed MX domains for popular spam filters. I’ve found domains that have invalid characters in them. I’ve found domains that are totally a mess.
The vast majority of us have some story or other of the time we really broke things by accident. Like the time a very large ISP deleted their MX records. Or when a different ISP changed their internal forwarding and broke SPF authentication for everyone mailing that domain. Or when another ISP accidentally blocked every IP beginning with 6.
Sometimes I’m amazed that the Internet ever works. No matter how big it gets, there are actual people writing actual code and configurations. The number of things that have to happen to get packets from A to B is pretty impressive. We rarely ever notice the breakages, the people who run things are really good at their jobs. But sometimes poking in the grotty corners reminds me how easy it is to break things. It’s sometimes a wonder things actually work.
 

Read More