DNS Flag Day

There are quite a lot of broken DNS servers out there. I’m sure that’s no surprise to you, but some of them might be yours. And you might not notice that until your domains stop working early next year.

DNS is quite an old protocol, and when it was originally specified there wasn’t really a good way to extend the protocol to add new features. That was fixed about 19 years ago when Extension Mechanisms for DNS (EDNS0) was specified, and solidly standardized in RFC 6891 in 2013. It added a backwards compatible way for a DNS client to ask “Hey! Do you support new features?” and for servers to include as part of their response “Yes! Yes I do!”.

That’s incredibly useful, and critical for extending the DNS to support new features (such as DNSSEC, or support for larger replies). And yet some authoritative DNS servers not only don’t support it, they misbehave when they’re asked if they support it. It’s been the case forever that DNS servers should just ignore (some sorts of) fields in requests if they don’t understand them. So when you send a request that includes an EDNS0 “Do you support new features?” field to a DNS server that doesn’t understand EDNS0 it should return a regular DNS response. Some (broken) nameservers don’t do that – instead they drop the request on the floor and don’t respond (or, even worse, crash). Eventually the recursive resolver will give up on the request.
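To make the "Do you support new features?" field concrete, here is a rough sketch of what it looks like on the wire. It assumes the RFC 6891 layout, where the EDNS0 marker is an OPT pseudo-record (type 41) appended to the additional section of an otherwise ordinary query. The helper names `encode_name` and `build_query`, the query ID and the 1232-byte UDP size are illustrative choices, not anything from a particular resolver.

```python
import struct

def encode_name(name: str) -> bytes:
    """Encode a hostname as length-prefixed DNS labels ending in the root label."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"

def build_query(name: str, qtype: int = 1, edns: bool = True,
                query_id: int = 0x1234, udp_size: int = 1232) -> bytes:
    """Build a DNS query packet, optionally with an EDNS0 OPT record."""
    arcount = 1 if edns else 0
    # Header: ID, flags (recursion desired), QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, arcount)
    # Question: name, type (1 = A), class (1 = IN)
    question = encode_name(name) + struct.pack("!HH", qtype, 1)
    if not edns:
        return header + question
    # OPT pseudo-record: root name, TYPE=41, CLASS carries the UDP payload
    # size, TTL carries extended RCODE/version/flags, RDLENGTH=0 (no options).
    opt = b"\x00" + struct.pack("!HHIH", 41, udp_size, 0, 0)
    return header + question + opt

plain = build_query("example.com", edns=False)
with_edns = build_query("example.com")
print(len(plain), len(with_edns))
```

The two packets differ only by the additional-record count in the header and the eleven extra bytes of the OPT record, which is why a server that doesn't understand EDNS0 has no good excuse for dropping the query.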

(DNS servers broken in this way aren’t that rare in 2018 – just last week I had to add code to a DNS library I use so that it didn’t crash when it saw EDNS0 requests.)

Right now most recursive resolvers will see a timeout for a request that included EDNS0 and decide “Maybe it only failed because the remote server has buggy EDNS0 handling”. They’ll retry the request without EDNS0 and get an answer. This workaround means that the DNS will resolve eventually, after five or ten seconds of delay. Not good, but the web page will open or the mail will be delivered eventually.

But it’s a horrible workaround, and the developers of the most widely used recursive resolvers are done with this silliness. As of February 1st next year they’re not going to do it any more. If your DNS server is broken with respect to EDNS0, your hostnames won’t resolve via a large fraction of recursive resolvers. Your webpages won’t load, and mail you send won’t have any SPF, DKIM or DMARC information, or even any reverse DNS. Lots of things will break in a very visible way.
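The difference between the two behaviours can be sketched in a few lines. This is a toy model, not code from any real resolver: `send` stands in for whatever actually puts the query on the wire, `broken_server` is a hypothetical nameserver that silently drops EDNS0 queries, and `allow_fallback` is a made-up switch for the retry workaround.

```python
def resolve_with_fallback(send, name, allow_fallback=True):
    """Try an EDNS0 query first; optionally retry without EDNS0 on timeout."""
    try:
        return send(name, edns=True)
    except TimeoutError:
        if not allow_fallback:
            # Post flag day behaviour: a timeout is just a failure.
            raise
        # Workaround: assume the timeout was caused by buggy EDNS0 handling.
        return send(name, edns=False)

def broken_server(name, edns):
    """Hypothetical nameserver that never answers queries carrying EDNS0."""
    if edns:
        raise TimeoutError("no reply")
    return "192.0.2.1"

# With the workaround the name resolves, after the wait for the timeout.
print(resolve_with_fallback(broken_server, "example.com"))

# Without it, the same server makes the name unresolvable.
try:
    resolve_with_fallback(broken_server, "example.com", allow_fallback=False)
except TimeoutError as exc:
    print("resolution failed:", exc)
```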

You can check whether your DNS server is broken or not, and get a bunch more technical details at dnsflagday.net.

Related Posts

DNSBLs, wildcards and domain expiration

Last week the megarbl.net domain name expired. Normally this would have no effect on anyone, but their domain registrar put in a wildcard DNS entry. Because of how DNSBLs work, this had the effect of causing every IP to be listed on the blocklist. The domain is now active and the listings due to the DNS wildcard are removed.
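For anyone unfamiliar with the mechanics, here is a small sketch of how an IPv4 DNSBL lookup name is conventionally built: the octets of the address are reversed and the blocklist zone is appended, and any answer for that name counts as a listing. The helper name `dnsbl_query_name` is just for illustration.

```python
import ipaddress

def dnsbl_query_name(ip: str, zone: str) -> str:
    """Return the hostname a client would look up to check `ip` against `zone`."""
    # IPv4Address() rejects anything that isn't a valid dotted quad.
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone.strip(".")

print(dnsbl_query_name("192.0.2.1", "megarbl.net"))
```

A wildcard record under the zone answers for every such name, so every address looks listed.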

SPF: The rule of ten

Some mechanisms and modifiers (collectively, “terms”) cause DNS queries at the time of evaluation, and some do not. The following terms cause DNS queries: the “include”, “a”, “mx”, “ptr”, and “exists” mechanisms, and the “redirect” modifier. SPF implementations MUST limit the total number of those terms to 10 during SPF evaluation, to avoid unreasonable load on the DNS. If this limit is exceeded, the implementation MUST return “permerror”.
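As a rough illustration, the terms that count toward the limit can be tallied for a single record like this. It is a simplified sketch: `count_lookup_terms` is a hypothetical helper, it only looks at one record's text, and a real evaluation also has to count the terms inside every record reached through "include" and "redirect".

```python
LOOKUP_MECHANISMS = {"include", "a", "mx", "ptr", "exists"}

def count_lookup_terms(record: str) -> int:
    """Count the terms in one SPF record that trigger DNS queries."""
    count = 0
    for term in record.split()[1:]:        # skip the leading "v=spf1"
        term = term.lower()
        if term.startswith("redirect="):   # the one modifier that counts
            count += 1
            continue
        name = term.lstrip("+-~?")         # drop any qualifier
        for sep in (":", "/"):             # drop domain-spec / CIDR length
            name = name.split(sep, 1)[0]
        if name in LOOKUP_MECHANISMS:
            count += 1
    return count

record = "v=spf1 a mx include:_spf.example.com ip4:192.0.2.0/24 ~all"
print(count_lookup_terms(record))
```

Here "a", "mx" and "include" count, while "ip4" and "all" do not, since they need no DNS query.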

The Internet is hard.

There are so many things that need to happen to make the Internet work. DNS entries need to be right. MXs need to be set up. Web servers need to be configured. And, let’s be honest, anyone who has ever run their own services on the Internet has flubbed a configuration.
We don’t think about it, because most of the time the configurations are handled by scripts and they do things right. But at some point someone needs to type in something and there’s a risk it will go horribly wrong. I’ve been digging into domain data for a client of mine today. I think I’m going cross-eyed over it. But I have found so many weird things that just mean someone isn’t paying attention to what they’re doing.
Like the domain that has an MX record that says:
nullmx
 
I’m pretty sure the intention of the domain owner is to publish a null MX. But they added an extraneous “0” in there and ended up publishing something really weird. Even worse, the MTA that this client is using is listing this as a “delivered” email. I’m pretty sure that mail to that domain never left the MTA.
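For reference, a null MX as defined in RFC 7505 is a single MX record with preference 0 and an exchange of "." (the root). A check for it might look like the sketch below, where `is_null_mx` is a hypothetical helper and the "0." exchange in the second example is an invented typo, not necessarily what this domain published.

```python
def is_null_mx(mx_records):
    """mx_records: list of (preference, exchange) tuples for one domain."""
    # A null MX must be the only MX record the domain publishes.
    if len(mx_records) != 1:
        return False
    preference, exchange = mx_records[0]
    return preference == 0 and exchange == "."

print(is_null_mx([(0, ".")]))    # a correctly published null MX
print(is_null_mx([(0, "0.")]))   # invented near miss: points at a host named "0"
```

An MTA that applies a check like this can refuse the mail immediately instead of trying to deliver to a host that doesn't exist.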
I’ve found horribly typoed MX domains for popular spam filters. I’ve found domains that have invalid characters in them. I’ve found domains that are totally a mess.
The vast majority of us have some story or other of the time we really broke things by accident. Like the time a very large ISP deleted their MX records. Or when a different ISP changed their internal forwarding and broke SPF authentication for everyone mailing that domain. Or when another ISP accidentally blocked every IP beginning with 6.
Sometimes I’m amazed that the Internet ever works. No matter how big it gets, there are actual people writing actual code and configurations. The number of things that have to happen to get packets from A to B is pretty impressive. We rarely notice the breakages, because the people who run things are really good at their jobs. But sometimes poking in the grotty corners reminds me how easy it is to break things. It’s sometimes a wonder things actually work.