Clarification of why DNS zone files require NS records

Let's break it down a little.

The NS records in the TLD zone (for example, example.com NS ... in com) are delegation records.

The A and AAAA records in the TLD zone (for example, ns1.example.com A ... in com) are glue records.

The NS records in the zone itself (that is, example.com NS ... in example.com) are authority records.

The A and AAAA records in the zone itself (ns1.example.com A ... in example.com) are address records, plain and simple.

When a (recursive) resolver starts out with no cache of your zone's data and only the root zone cache (which is used to bootstrap the name resolution process), it will first go to ., then com.. The com servers will respond with an authority section response which basically says "I don't know, but look here for someone who does know", same as the servers for . do about com. This query response is not authoritative and does not include a populated answer section. It may also include a so-called additional section which gives the address mappings for any host names the particular server knows about (either from glue records or, in the case of recursive resolvers, from previously cached data). The resolver will take this delegation response, resolve the host name of a NS record if necessary, and proceed to query the DNS server to which authority has been delegated. This process may repeat a number of times if you have a deep delegation hierarchy, but eventually results in a query response with the "authoritative answer" flag set.

It's important to note that the resolver (generally, hopefully) won't try to break down the host name being resolved to ask about it piece by piece, but will simply send it in its entirety to the "best" server it knows about. Since the average authoritative name server on the Internet is non-authoritative for the vast majority of valid DNS names, the response will be a non-authoritative delegation response pointing toward some other DNS server.

Now, a server doesn't have to be named in the delegation or authority records anywhere to be authoritative for a zone. Consider for example the case of a private master server; in that case there exists an authoritative DNS server that only the administrator(s) of the slave DNS servers for the zone are aware of. A DNS server is authoritative for a zone if, through some mechanism, in its opinion it has full and accurate knowledge of the zone in question. A normally authoritative DNS server can, for example, become non-authoritative if the configured master server(s) cannot be reached within the time limit defined as the expiry time in the SOA record.

Only authoritative answers should be considered proper query responses; everything else is either a delegation, or an error of some kind. A delegation to a non-authoritative server is called a "lame" delegation, and means the resolver has to backtrack one step and try some other named DNS server. If no authoritative reachable name servers exist in the delegation, then name resolution fails (otherwise, it'll just be slower than normal).

This is all important because non-authoritative data mustn't be cached. How could it be, since the non-authoritative server doesn't have the full picture? So the authoritative server must, on its own accord, be able to answer the question "who is supposed to be authoritative, and for what?". That's the information provided by the in-zone NS records.

There's a number of edge cases where this can actually make a serious difference, primarily centered around multiple host name labels inside a single zone (probably fairly common e.g. with reverse DNS zones particularly for large dynamic IP ranges) or when the name servers list differs between the parent zone and the zone in question (which most likely is an error, but can be done intentionally as well).


You can see how this works in a little more detail using dig and its +norec (don't request recursion) and @ server specifier features. What follows is an illustration of how an actual resolving DNS server works. Query for the A record(s) for unix.stackexchange.com starting at e.g. a.root-servers.net:

$ dig unix.stackexchange.com. A @a.root-servers.net. +norec

Look closely at the flags as well as the per-section counts. qr is Query Response and aa is Authoritative Answer. Notice that you only get delegated to the com servers. Manually follow that delegation (in real life a recursive resolver would use the IP address from the additional section if provided, or initiate a separate name resolution of one of the named name servers if no IPs are provided in the delegation response, but we'll skip that part and just fall back to the operating system's normal resolver for brevity of the example):

$ dig unix.stackexchange.com. A @a.gtld-servers.net. +norec

Now you see that stackexchange.com is delegated to (among others) ns1.serverfault.com, and you are still not getting an authoritative answer. Again follow the delegation:

$ dig unix.stackexchange.com. A @ns1.serverfault.com. +norec
...
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 35713
;; flags: qr aa; QUERY: 1, ANSWER: 1, AUTHORITY: 3, ADDITIONAL: 3

;; QUESTION SECTION:
;unix.stackexchange.com. IN A

;; ANSWER SECTION:
unix.stackexchange.com. 300 IN A 198.252.206.16

Bingo! We got an answer, because the aa flag is set, and it happens to contain an IP address just as we'd hoped to find. As an aside, it's worth noting that at least at the time of my writing this post, the delegated-to and the listed-authority name servers lists differ, showing that the two do not need to be identical. What I have exemplified above is basically the work done by any resolver, except any practical resolver will also cache responses along the way so it doesn't have to hit the root servers every time.

As you can see from the above example, the delegation and glue records serve a purpose distinct from the authority and address records in the zone itself.

A caching, resolving name server will also generally do some sanity checks on the returned data to protect against cache poisoning. For example, it may refuse to cache an answer naming the authoritative servers for com from a source other than one that has already been named by a parent zone as deleged-to for com. The details are server-dependent but the intent is to cache as much as possible while not opening up the barn door of allowing any random name server on the Internet to override delegation records for anything not officially under its "jurisdiction".