What is the role of NS records at the apex of a DNS domain?

Solution 1:

Subordinate identification

Apex level NS records are used by a master server to identify its subordinates. When data on an authoritative nameserver changes, it will advertise this via DNS NOTIFY messages (RFC 1996) to all of its peers on that list. Those servers will in turn call back with a request for the SOA record (which contains the serial number), and make a decision on whether to pull down a more recent copy of that zone.

  • It's possible to send these messages to servers not listed in the NS section, but this requires server specific configuration directives (such as ISC BIND's also-notify directive). The apex NS records comprise the basic list of servers to notify under a default configuration.
  • It's worth noting that the secondary servers will also send NOTIFY messages to each other based on these NS records, usually resulting in logged refusals. This can be disabled by instructing servers to only send notifies for zones they are masters for (BIND: notify master-only;), or to skip NS based notifies entirely in favor of notifies explicitly defined in the configuration. (BIND: notify explicit;)
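As a rough illustration of the default behavior described above, the sketch below derives a notify list from the apex NS records and shows how a secondary might decide whether the primary's serial is newer. RFC 1996 defines the default notify set as the servers in the NS RRset minus the one named in the SOA MNAME field; the serial check follows RFC 1982 style wraparound arithmetic. The function names, the host names, and the also_notify parameter are hypothetical, and this is not a description of how BIND implements it internally.

```python
SERIAL_MODULUS = 2 ** 32       # SOA serials are 32-bit counters
HALF_RANGE = 2 ** 31


def normalize(name: str) -> str:
    """Lowercase a host name and make sure it ends with the root dot."""
    return name.lower().rstrip(".") + "."


def notify_set(apex_ns, soa_mname, self_name, also_notify=()):
    """Default NOTIFY targets: apex NS names minus the SOA MNAME and this server.

    also_notify is a hypothetical stand-in for explicitly configured extras.
    """
    skip = {normalize(soa_mname), normalize(self_name)}
    targets = {normalize(n) for n in apex_ns} - skip
    targets |= {normalize(n) for n in also_notify}
    return sorted(targets)


def serial_is_newer(remote: int, local: int) -> bool:
    """True if remote is ahead of local, allowing for 32-bit wraparound."""
    diff = (remote - local) % SERIAL_MODULUS
    return 0 < diff < HALF_RANGE


if __name__ == "__main__":
    ns = ["ns1.example.com.", "NS2.example.com", "ns3.example.com."]
    print(notify_set(ns, soa_mname="ns1.example.com.", self_name="ns1.example.com."))
    print(serial_is_newer(2024010102, 2024010101))
```

A secondary that receives a NOTIFY would fetch the SOA record and only start a transfer when serial_is_newer returns True for the serial it sees on the primary.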

Authoritative definition

The question above contained a fallacy:

They are not used by caching DNS servers in order to determine the authoritative servers for the domain. This is handled by nameserver glue, which is defined at the registrar level. The registrar never uses this information to generate the glue records.

This is an easy conclusion to arrive at, but not accurate. The NS records and glue data on the parent side of a delegation (such as those defined within your registrar account) are not authoritative. It stands to reason that they cannot be considered "more authoritative" than the data residing on the servers that authority is being delegated to. This is emphasized by the fact that referrals do not have the aa (Authoritative Answer) flag set.

To illustrate:

$ dig @a.gtld-servers.net +norecurse +nocmd example.com. NS
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 14021
;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 2, ADDITIONAL: 5

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;example.com.                   IN      NS

;; AUTHORITY SECTION:
example.com.            172800  IN      NS      a.iana-servers.net.
example.com.            172800  IN      NS      b.iana-servers.net.

;; ADDITIONAL SECTION:
a.iana-servers.net.     172800  IN      A       199.43.135.53
a.iana-servers.net.     172800  IN      AAAA    2001:500:8f::53
b.iana-servers.net.     172800  IN      A       199.43.133.53
b.iana-servers.net.     172800  IN      AAAA    2001:500:8d::53

Note the lack of aa in the flags for the above reply. The referral itself is not authoritative. On the other hand, the data on the server being referred to is authoritative.

$ dig @a.iana-servers.net +norecurse +nocmd example.com. NS
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2349
;; flags: qr aa; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;example.com.                   IN      NS

;; ANSWER SECTION:
example.com.            86400   IN      NS      a.iana-servers.net.
example.com.            86400   IN      NS      b.iana-servers.net.
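For anyone scripting this comparison, the flag check can be done on the header line that dig prints. The sketch below is a minimal example that assumes dig's usual ";; flags:" header format as shown in the two outputs above; the function names are hypothetical.

```python
import re


def header_flags(dig_output: str) -> set:
    """Return the set of header flags from the ';; flags:' line of dig output."""
    # Anchoring on ';; flags:' skips the '; EDNS: ... flags:;' line in the OPT section.
    match = re.search(r"^;; flags:([^;]*);", dig_output, re.MULTILINE)
    if match is None:
        raise ValueError("no ';; flags:' line found")
    return set(match.group(1).split())


def is_authoritative(dig_output: str) -> bool:
    """True if the aa (Authoritative Answer) flag is present in the header."""
    return "aa" in header_flags(dig_output)


if __name__ == "__main__":
    referral = ";; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 2, ADDITIONAL: 5"
    answer = ";; flags: qr aa; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1"
    print(is_authoritative(referral))  # False: the referral from the parent
    print(is_authoritative(answer))    # True: the reply from the delegated server
```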

That said, this relationship can get very confusing as it is not possible to learn about the authoritative versions of these NS records without the non-authoritative NS records defined on the parent side of the referral. What happens if they disagree?

  • The short answer is "inconsistent behavior".
  • The long answer is that nameservers will initially stub everything off of the referral (and glue) on an empty cache, but those NS, A, and AAAA records may eventually be replaced when they are refreshed. The refreshes occur as the TTLs on those temporary records expire, or when someone explicitly requests the answer for those records.
    • A and AAAA records for out of zone data (i.e. the com nameservers defining glue for data outside of the com zone, like example.net) will definitely end up being refreshed, as it is a well-understood concept that a nameserver should not be considered an authoritative source of such information. (RFC 2181)
    • When values of the NS records differ between the parent and child sides of the referral (such as the nameservers entered into the registrar control panel differing from the NS records that live on those same servers), the behaviors experienced will be inconsistent, up to and including child NS records being ignored completely. This is because the behavior is not well defined by the standards, and the implementation varies between different recursive server implementations. In other words, consistent behavior across the internet can only be expected if the nameserver definitions for a domain are consistent between the parent and child sides of a referral.

The long and short of it is that recursive DNS servers throughout the internet will bounce back and forth between destinations if the records defined on the parent side of the referral do not agree with the authoritative versions of those records. Initially the data present in the referral will be preferred, only to be replaced by the authoritative definitions. Since caches are constantly being rebuilt from scratch across the internet, it is impossible for the internet to settle on a single version of reality with this configuration. If the authoritative records are doing something illegal per the standards, such as pointing NS records at aliases defined by a CNAME, this gets even more difficult to troubleshoot; the domain will alternate between working and broken for software that rejects the violation (e.g. ISC BIND / named).

RFC 2181 §5.4.1 provides a ranking table for the trustworthiness of this data, and makes it explicit that cache data associated with referrals and glue cannot be returned as the answer to an explicit request for the records they refer to.

5.4.1. Ranking data

   When considering whether to accept an RRSet in a reply, or retain an
   RRSet already in its cache instead, a server should consider the
   relative likely trustworthiness of the various data.  An
   authoritative answer from a reply should replace cached data that had
   been obtained from additional information in an earlier reply.
   However additional information from a reply will be ignored if the
   cache contains data from an authoritative answer or a zone file.

   The accuracy of data available is assumed from its source.
   Trustworthiness shall be, in order from most to least:

     + Data from a primary zone file, other than glue data,
     + Data from a zone transfer, other than glue,
     + The authoritative data included in the answer section of an
       authoritative reply.
     + Data from the authority section of an authoritative answer,
     + Glue from a primary zone, or glue from a zone transfer,
     + Data from the answer section of a non-authoritative answer, and
       non-authoritative data from the answer section of authoritative
       answers,
     + Additional information from an authoritative answer,
       Data from the authority section of a non-authoritative answer,
       Additional information from non-authoritative answers.

   <snip>

   Unauthenticated RRs received and cached from the least trustworthy of
   those groupings, that is data from the additional data section, and
   data from the authority section of a non-authoritative answer, should
   not be cached in such a way that they would ever be returned as
   answers to a received query.  They may be returned as additional
   information where appropriate.  Ignoring this would allow the
   trustworthiness of relatively untrustworthy data to be increased
   without cause or excuse.
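To make the ranking concrete, here is a small sketch that encodes the list above as ordered values and applies it to the referral case discussed earlier. The enum member names are hypothetical labels for the RFC's groupings, and the choice to replace cached data only when the incoming data ranks strictly higher is an assumption of this sketch; real resolvers differ in how they handle ties and TTL expiry.

```python
from enum import IntEnum


class Rank(IntEnum):
    """Higher value means more trustworthy, following the RFC 2181 list order."""
    ADDITIONAL_OR_NONAUTH_AUTHORITY = 1  # least trustworthy group: referral NS, glue in replies
    NONAUTH_ANSWER = 2
    ZONE_GLUE = 3
    AUTH_AUTHORITY = 4
    AUTH_ANSWER = 5
    ZONE_TRANSFER = 6
    PRIMARY_ZONE_FILE = 7


def should_replace(cached: Rank, incoming: Rank) -> bool:
    """Replace cached data only when the incoming data is more trustworthy.

    Strict comparison is an assumption here; the RFC does not dictate tie handling.
    """
    return incoming > cached


def may_answer_query(rank: Rank) -> bool:
    """Data from the least trustworthy group must not be returned as an answer."""
    return rank > Rank.ADDITIONAL_OR_NONAUTH_AUTHORITY


if __name__ == "__main__":
    referral_ns = Rank.ADDITIONAL_OR_NONAUTH_AUTHORITY  # NS set learned from the parent
    child_ns = Rank.AUTH_ANSWER                         # NS set from the zone's own servers
    print(should_replace(cached=referral_ns, incoming=child_ns))  # True
    print(should_replace(cached=child_ns, incoming=referral_ns))  # False
    print(may_answer_query(referral_ns))                          # False
```

Under this model the NS set learned from a referral is enough to find the child servers, but it gives way to the child's own apex NS records as soon as an authoritative answer for them is cached.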

Solution 2:

The NS records in the delegated zone provide completeness of the domain definition. The name servers themselves will rely on the zone file. They are not expected to try to find themselves by doing a recursive query from the root servers. The NS records in the zone file provide a number of other functions.

Caching servers may refresh the name server list by querying a name server from its cache. As long as a caching server knows the address of a name server it will use that rather than recursively looking up an appropriate NS record.

When moving name servers, it is important to update the old name servers as well as the new name servers. This will prevent outages or inconsistencies that will result when the two zone definitions get out of sync. The updated records will eventually be refreshed by any servers that have cached the NS records. This will replace the cached list of name servers.

The NS records also assist in confirming correctness of the DNS configuration. Validation software will often verify that the delegating zone's name server definitions match those provided by the zone. This check may be performed on all name servers. Any mismatches may indicate a misconfiguration.
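The comparison such a validator performs can be sketched in a few lines. This example assumes the two NS lists have already been collected (for instance from the referral and from the zone's own servers) and only shows the set comparison; the function names and the ns3.example.org. host are hypothetical.

```python
def normalize(name: str) -> str:
    """Lowercase a host name and make sure it ends with the root dot."""
    return name.lower().rstrip(".") + "."


def compare_delegation(parent_ns, child_ns):
    """Compare the parent-side NS set with the zone's own apex NS set."""
    parent = {normalize(n) for n in parent_ns}
    child = {normalize(n) for n in child_ns}
    return {
        "consistent": parent == child,
        "only_in_parent": sorted(parent - child),
        "only_in_child": sorted(child - parent),
    }


if __name__ == "__main__":
    parent = ["a.iana-servers.net.", "b.iana-servers.net."]
    child = ["A.IANA-SERVERS.NET", "b.iana-servers.net.", "ns3.example.org."]
    print(compare_delegation(parent, child))
```

Names are compared case-insensitively and with the trailing dot normalized, since DNS names that differ only in those respects refer to the same host.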

Having the NS records allows for disconnected (local) zones. These may be sub-domains of a registered domain, or an entirely new domain (not recommended due to TLD changes). Hosts that use these name servers as their resolvers will be able to find zones which are not reachable by recursing down from the root servers. Other name servers may be configured to look to these name servers for the local zones.

In the case of split DNS (internal/external), it may be desired to have a different set of DNS servers. In this case the NS list (and likely other data) will be different, and the NS records in the zone files will list the appropriate name server list.