Do intermediate subdomains need to exist?

Solution 1:

TL;DR: yes, intermediate subdomains need to exist, at least when queried for, by the very definition of the DNS; they do not have to appear in the zonefile, though.

A possible confusion to eliminate first: definition of "Empty Non-Terminal"

You may be confusing two things, as other answers also seem to do: what happens when a name is queried, versus how you configure your nameserver and what the zonefile contains.

The DNS is hierarchical. For any leaf node to exist, all components leading to it MUST exist, in the sense that if they are queried for, the responsible authoritative nameserver should reply for them without an error.

As explained in RFC 8020 (which just repeats what was always the rule, but some DNS providers needed a reminder), if an authoritative nameserver replies NXDOMAIN to any query (that is: this resource record does not exist), then any label "below" this resource does not exist either.

In your example, if a query for returns NXDOMAIN, then any proper recursive nameserver will immediately reply NXDOMAIN for because this record can not exist if all labels in it do not exist as records.

This was already stated in the past in the RFC 4592 about wildcards (which are unrelated here):

The domain name space is a tree structure. Nodes in the tree either
own at least one RRSet and/or have descendants that collectively own
at least one RRSet. A node may exist with no RRSets only if it has
descendants that do; this node is an empty non-terminal.

A node with no descendants is a leaf node. Empty leaf nodes do not exist.
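To make this definition concrete, here is a minimal Python sketch (my own illustration, not taken from the RFC). It treats a zone as a plain set of owner names and applies the rule quoted above; the names and the `existence` helper are hypothetical and chosen only for the example.

```python
def labels(name):
    """Split a domain name into lowercase labels, ignoring the trailing dot."""
    return [l for l in name.lower().rstrip(".").split(".") if l]


def is_descendant(name, ancestor):
    """True if `name` lies strictly below `ancestor` in the tree."""
    n, a = labels(name), labels(ancestor)
    # Compare whole labels, not raw string suffixes.
    return len(n) > len(a) and n[len(n) - len(a):] == a


def existence(name, owners):
    """Classify `name` against the set of names that own at least one RRSet."""
    canon = {".".join(labels(o)) for o in owners}
    if ".".join(labels(name)) in canon:
        return "owns data"
    if any(is_descendant(o, name) for o in canon):
        return "empty non-terminal"
    return "does not exist"


# Hypothetical owner names, for illustration only.
owners = ["example.com.", "leaf.intermediate.example.com."]
for n in ["leaf.intermediate.example.com.",
          "intermediate.example.com.",
          "other.example.com."]:
    print(n, "->", existence(n, owners))
```

The middle name owns nothing but has a descendant that does, so it is an empty non-terminal; the last one has neither data nor descendants, so it does not exist.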

A practical example with .US domain names

Let us take a working example from a TLD with a lot of labels historically, that is .US. Picking any example online, let us use

Of course if you query for this name, or even you can get back A records. Nothing conclusive here for our purpose (there is even a CNAME in the middle of it, but we do not care about that) :

$ dig A +short
$ dig A +short

Let us query now for (I am not querying the authoritative nameserver of it, but that does not change the result in fact):

$ dig A

; <<>> DiG 9.11.5-P1-1ubuntu2.5-Ubuntu <<>> A
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 59101
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

; EDNS: version: 0, flags:; udp: 1480
;         IN  A

us.         3587    IN  SOA 2024847624 900 900 604800 86400

;; Query time: 115 msec
;; WHEN: mer. juil. 03 01:13:20 EST 2019
;; MSG SIZE  rcvd: 104

What do we learn from this answer?

First, it is a success because the status is NOERROR. If it had been anything else and specifically NXDOMAIN then, nor could exist.

Second, the ANSWER section is empty. There are no A records for the queried name. This is not an error: this type (A) does not exist for this name, but either other record types exist for it, or it is an ENT, aka "Empty Non-Terminal": it is empty, but it is not a leaf, because there are things "below" it (see the definition in RFC 7719), as we already know. (Normally resolution is top down, so we would reach this step before going one level below, and not the opposite as we are doing here for demonstration purposes.)

This is why, as a shortcut, we say the status is NODATA. This is not a real status code; it just means NOERROR plus an empty ANSWER section, that is, there is no data for this specific record type but there may be for others.
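As a rough illustration of that shortcut, the following Python sketch derives the informal outcome from the two fields visible in the dig header above (status and ANSWER count). The `Response` class and the `outcome` function are hypothetical names of mine, and the sketch deliberately ignores referrals, which also arrive as NOERROR with an empty ANSWER section when you query an authoritative server directly.

```python
from dataclasses import dataclass

# RCODE values as assigned in RFC 1035.
NOERROR = 0
NXDOMAIN = 3


@dataclass
class Response:
    rcode: int         # the "status" field in dig's header line
    answer_count: int  # the "ANSWER: n" counter in dig's flags line


def outcome(resp):
    """Map a response to the informal outcome names used in this answer."""
    if resp.rcode == NXDOMAIN:
        return "NXDOMAIN"  # the name, and everything below it, does not exist
    if resp.rcode == NOERROR and resp.answer_count == 0:
        return "NODATA"    # the name exists, but not with this record type
    if resp.rcode == NOERROR:
        return "ANSWER"
    return "OTHER"         # SERVFAIL, REFUSED, etc.


# Values taken from the dig output above: status NOERROR, ANSWER: 0.
print(outcome(Response(rcode=NOERROR, answer_count=0)))  # NODATA
```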

You can repeat the same experiment for the same result if you query with the next "up" label, that is the name

Queries' results vs zonefile content

Now, where can the confusion come from? I believe it may come from the false idea that any dot in a DNS name means there is a delegation. This is false. Said differently, your zonefile can look like this, and it is totally valid and working: IN SOA .... IN NS .... IN NS .... IN A

With such a zonefile, querying this nameserver you will get exactly the behavior observed above: a query for will return NOERROR with an empty answer. You do not need to create it specifically in the zonefile (if you do not need it for other reasons), the authoritative nameserver will take care of synthesizing the "intermediate" replies, because it sees it needs this empty non-terminal (and any others "in-between" if there had been other labels) as it sees the leaf name

Note that this is a widespread case in fact in some areas, but you might not see it because it targets more "infrastructure" records that people are not exposed to:

  • in reverse zones like in-addr.arpa or, and specifically the last one. You will have records like 1h IN PTR and there is obviously not a delegation at each dot, nor resource records attached at each label
  • in SRV records, like 12h IN SRV 0 0 43, a domain can have many and SRV records because by design they must have this form, but at the same time and will remain Empty Non-Terminals because never used as records
  • you have in fact many other cases of specific construction of names based on "underscore labels" for various protocols such as DKIM. DKIM mandates you to have DNS records like, but obviously by itself will never be used, so it will remain an empty non-terminal. This is the same for TLSA records in DANE (ex: TLSA 3 1 1 BASE64==), or URI records (ex: _ftp._tcp IN URI 10 1 "")
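To get a feeling for how many "in-between" names such records imply, here is a small Python sketch. It lists every name strictly between an owner name and a chosen suffix. The address 2001:db8::1, the name `_sip._tcp.example.com` and the helper `names_between` are illustrative assumptions of mine; which of the listed names are really empty non-terminals, rather than zone cuts, depends on where delegations fall in a real deployment.

```python
import ipaddress


def names_between(name, suffix):
    """List the names strictly between `name` and `suffix` (no trailing dots)."""
    labels = name.split(".")
    suffix_len = len(suffix.split("."))
    # Drop one leading label at a time, stopping before the suffix itself.
    return [".".join(labels[i:]) for i in range(1, len(labels) - suffix_len)]


# An SRV-style owner name: only one intermediate name, the "_tcp" one.
print(names_between("_sip._tcp.example.com", "example.com"))

# A reverse IPv6 name: one label per hexadecimal digit of the address.
ptr_owner = ipaddress.ip_address("2001:db8::1").reverse_pointer
print(ptr_owner)
print(len(names_between(ptr_owner, "ip6.arpa")), "names between it and ip6.arpa")
```

Nobody creates records at each of those intermediate names by hand; most of them simply exist because something exists below them.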

Nameserver behavior and generation of intermediate replies

Why does the nameserver automatically synthesize such intermediate answers? The reason is the core resolution algorithm of the DNS, detailed in RFC 1034 section 4.3.2. Let us walk through it for our case, querying the above authoritative nameserver for the intermediate name (this is the QNAME in the algorithm below):

  2. Search the available zones for the zone which is the nearest ancestor to QNAME. If such a zone is found, go to step 3, otherwise step 4.

The nameserver finds zone as nearest ancestor of QNAME, so we can go to step 3.

We have now this:

  3. Start matching down, label by label, in the zone. [..]

a. If the whole of QNAME is matched, we have found the node. [..]

b. If a match would take us out of the authoritative data, we have a referral. This happens when we encounter a node with NS RRs marking cuts along the bottom of a zone. [..]

c. If at some label, a match is impossible (i.e., the corresponding label does not exist), look to see if a the "*" label exists. [..]

We can eliminate cases b and c, because our zonefile has no delegation (hence there will be never a referral to other nameservers, no case b), nor wildcards (so no case c).

We only have to deal here with case a.

We start matching down, label by label, in the zone. So even if we had a long name, at some point, we arrive at case a: we did not find a referral, nor a wildcard, but we ended up at the final name we wanted a result for.

Then we apply the rest of the content of case a:

If the data at the node is a CNAME

Not our case, we skip that.

Otherwise, copy all RRs which match QTYPE into the answer section and go to step 6.

Whatever QTYPE we choose (A, AAAA, NS, etc.) we have no RRs for as it does not appear in the zonefile. So the copy here is empty. Now we finish at step 6:

Using local data only, attempt to add other RRs which may be useful to the additional section of the query. Exit.

Not relevant for us here, hence we finish with success.

This exactly explains the behavior observed: such queries will return NOERROR but no data either.

Now, you may ask yourself: "but then if I use any name, like then by the above algorithm I should get the same reply (no error)", but observations would instead report NXDOMAIN in that case.


Why? Because the whole algorithm, as specified, starts from this assumption:

The following algorithm assumes that the RRs are organized in several tree structures, one for each zone, and another for the cache

This means that the above zonefile is transformed into this tree:

| com |  (just to show the delegation, does not exist in this nameserver)
| example | SOA, NS records
| intermediate | no records
| leaf | A record

So when following the algorithm from the top, you can indeed find a path: com > example > intermediate (because the path com > example > intermediate > leaf exists). But for the other name, after com > example you do not find its next label in the tree as a child node of example. Hence we fall into part of choice c from above:

If the "*" label does not exist, check whether the name we are looking for is the original QNAME in the query or a name we have followed due to a CNAME. If the name is original, set an authoritative name error in the response and exit. Otherwise just exit.

Label * does not exist, and we did not follow a CNAME, hence we are in the case "set an authoritative name error in the response and exit", aka NXDOMAIN.
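The walk through step 3 can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how any real nameserver is implemented: it handles no delegations (case b) and no CNAMEs, its wildcard handling is far cruder than what RFC 4592 requires, and it starts the tree at the root rather than at the zone apex, like the tree drawn above. The zone content, the address 192.0.2.1 and the function names are hypothetical.

```python
def build_tree(owner_records):
    """Build a label tree from {owner_name: {rtype: [rdata, ...]}}."""
    root = {"children": {}, "rrsets": {}}
    for owner, rrsets in owner_records.items():
        node = root
        # Walk from the rightmost label (closest to the root) to the leftmost.
        # Intermediate nodes are created on the way, with no records attached.
        for label in reversed(owner.lower().rstrip(".").split(".")):
            node = node["children"].setdefault(
                label, {"children": {}, "rrsets": {}})
        node["rrsets"].update(rrsets)
    return root


def lookup(root, qname, qtype):
    """Return (status, records) by matching down label by label."""
    node = root
    for label in reversed(qname.lower().rstrip(".").split(".")):
        child = node["children"].get(label)
        if child is None:
            # Case c: no match at this label, so look for a "*" label.
            if "*" in node["children"]:
                return "NOERROR", node["children"]["*"]["rrsets"].get(qtype, [])
            return "NXDOMAIN", []
        node = child
    # Case a: the whole QNAME matched; copy the RRs of type QTYPE (maybe none).
    return "NOERROR", node["rrsets"].get(qtype, [])


# Hypothetical zone content mirroring the tree above.
zone = {
    "example.com.": {"SOA": ["..."], "NS": ["...", "..."]},
    "leaf.intermediate.example.com.": {"A": ["192.0.2.1"]},
}
tree = build_tree(zone)
for name in ["leaf.intermediate.example.com.",
             "intermediate.example.com.",
             "other.example.com."]:
    print(name, lookup(tree, name, "A"))
```

The intermediate name is found in the tree only because building the path to the leaf created it, which is why it yields NOERROR with no records, while a name that was never on any path yields NXDOMAIN.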

Note that all the above did create confusion in the past. This is collected in some RFCs. See for example this unexpected place (the joy of DNS specifications being so impenetrable) defining wildcards: RFC 4592 "The Role of Wildcards in the Domain Name System" and notably its section 2.2 "Existence Rules", also cited in part at the beginning of my answer but here it is more complete:

Empty non-terminals [RFC2136, section 7.16] are domain names that own no resource records but have subdomains that do. In section 2.2.1,
"_tcp.host1.example." is an example of an empty non-terminal name.
Empty non-terminals are introduced by this text in section 3.1 of RFC 1034:

# The domain name space is a tree structure.  Each node and leaf on
# the tree corresponds to a resource set (which may be empty).  The
# domain system makes no distinctions between the uses of the
# interior nodes and leaves, and this memo uses the term "node" to
# refer to both.

The parenthesized "which may be empty" specifies that empty non-terminals
are explicitly recognized and that empty non-terminals "exist."

Pedantically reading the above paragraph can lead to an
interpretation that all possible domains exist--up to the suggested
limit of 255 octets for a domain name [RFC1035]. For example,
www.example. may have an A RR, and as far as is practically
concerned, is a leaf of the domain tree. But the definition can be
taken to mean that sub.www.example. also exists, albeit with no data. By extension, all possible domains exist, from the root on down.

As RFC 1034 also defines "an authoritative name error indicating that the name does not exist" in section 4.3.1, so this apparently is not the intent of the original definition, justifying the need for an updated definition in the next section.

And then the definition in next section is the paragraph I quoted at the beginning.

Note that RFC 8020 (on NXDOMAIN really meaning NXDOMAIN, that is if you reply NXDOMAIN for, then can not exist) was mandated in part because various DNS providers did not follow this interpretation and that created havoc, or they were just bugs, see for example this one fixed in 2013 in one opensource authoritative nameserver code:

People then needed to put specific countermeasures in place just for them, namely not aggressively caching NXDOMAIN, because with those providers an NXDOMAIN at some node may still be followed by something other than NXDOMAIN at a node below it.

And this was making QNAME minimization (RFC 7816) impossible to obtain (see for longer details), while it was wanted to increase privacy. Existence of empty non-terminals in case of DNSSEC also created problems in the past, around handling of non-existence (see if interested, but you really need a good understanding of DNSSEC before).
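To make the RFC 8020 rule concrete, here is a minimal Python sketch of what a resolver that trusts NXDOMAIN is allowed to do: once a name is known not to exist, any query at or below that name can be answered NXDOMAIN from cache. The cache content and the function name are hypothetical, and a real resolver would also track TTLs and DNSSEC state, which this sketch ignores.

```python
def labels(name):
    """Split a domain name into lowercase labels, ignoring the trailing dot."""
    return name.lower().rstrip(".").split(".")


def covered_by_nxdomain(qname, nxdomain_names):
    """True if qname equals, or lies below, a name known not to exist."""
    q = labels(qname)
    for denied in nxdomain_names:
        d = labels(denied)
        # Compare whole labels, so "notintermediate" is not mistaken
        # for something below "intermediate".
        if len(q) >= len(d) and q[len(q) - len(d):] == d:
            return True
    return False


# Hypothetical cache: one earlier query was answered NXDOMAIN.
cache = {"intermediate.example.com."}
print(covered_by_nxdomain("leaf.intermediate.example.com.", cache))  # True
print(covered_by_nxdomain("www.example.com.", cache))                # False
```

With a provider that wrongly answers NXDOMAIN for an empty non-terminal, this shortcut hides every name below it, which is exactly why such servers forced resolvers to be less aggressive.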

The following two messages give an example of the problems one provider had in properly enforcing this rule on Empty Non-Terminals; they give some perspective on the issues and why we were there:


Solution 2:

It's possible that I misunderstand Khaled's answer, but the lack of intermediate records should in no wise be a problem with the resolution of the subzoned name. Note that this dig output is not from, nor directed to, an authoritative DNS server for or any subzone thereof:

[me@nand ~]$ dig

Indeed, you should be able to do that dig yourself, and get that answer - is a real domain, under my control, and really does contain that A record. You can verify that there are no records for any of those zones between very and, and that it has no impact on your resolution of the above hostname.

Solution 3:

If you are directly querying the authoritative DNS server, you will get answers without problems.

However, you will not get a valid answer if you are querying via another DNS server which does not have a valid cache. Querying for will result in NXDOMAIN error.