Hi Dave @dcrocker,
Thanks so much for your thoughtful response, and sorry for the slow reply. I had a busy end of term and then a trip to Germany and so I’m only now finally catching up on this.
Everything you say seems reasonable. A few small things:
– The reference to CSNET phone numbers was in RFC 882.
– I see your point about the word “database” but I’m not sure I want to reserve it for systems that have elaborate query languages, since I can’t think of a better term for something that provides (eg) key/value lookups. Note also that RFC 882 uses that term: “What is needed is a distributed database.”
– You’re absolutely right about CNAME and my misunderstanding of that; I clarified in my note that it’s about introducing an extra level of indirection for delegation (eg to allow easier key rotation).
About my use of the term “overloading”:
– In concept design, overloading means using one concept for multiple purposes (often purposes that are associated with other distinct concepts); it doesn’t have the programming-language sense of the semantics of a term being uncertain until it’s resolved in some context.
About the general approach and what I’m trying to do here:
– The idea of concept design is to clarify what’s going on in a design, and what tradeoffs are involved. I think perhaps you’re reacting to the criticism implicit in my original post that underscore labels were an unprincipled hack. I didn’t mean to give that impression! On the contrary, I think my analysis shows that there is an interesting tradeoff here. As I’ve learned more from you and from deeper reading, I’ve come to appreciate that tradeoff better, and now understand that it’s in large part about avoiding changes to DNS and its resource types. Whether that’s the right way to go (and you’re a better judge of that than me), the analysis at the very least points out the nature of this tradeoff and some of the costs.
I’ve revised (again!) my original post with a more careful analysis of the concepts at play. For completeness in this forum, I’m appending the update below.
A few updates and corrections, following some further investigation and input from DNS experts:
DNS as a general database. The earliest RFCs mention DNS holding information beyond host addresses—including phone numbers for CSNET, for example—and make it clear that the resource records were not to be limited to the initial types. It wasn’t until later, though, that the idea of DNS as a general key/value store seems to have emerged explicitly. Jerry Saltzer, who developed a name service for Athena at MIT called Hesiod, told me that Paul Mockapetris added the TXT resource type to support more general lookups, as required by applications such as Hesiod.
Domain names as intentional names. Domain names that included property labels go back at least to Hesiod, which used an @-symbol to separate the property-specifying part from the rest, eg. firstname.lastname@example.org. A project at MIT in 1999 explored this general idea, in which a name does not designate a service directly, but rather specifies the properties for a desired service, and called it intentional naming. In 2000, RFC 2782 described the addition of the SRV resource type, which mapped domain names of the form _service._protocol.name to server/port names, allowing intentional names such as _ldap._tcp.foo.com.
Domain names that include property keys. A domain name like _domainkey.foo.com is not an intentional name that specifies a service. The DKIM protocol does not require a service; all that’s needed is for the DKIM key to be provided for the domain. Instead, this domain name is a combination of a domain name (foo.com) and a key to be looked up in the DNS records of that domain name.
Underscores in domain names. The use of underscores in these extended forms of name prevented conflicts with hostnames, but introduced a new risk: that the underscored labels themselves might conflict with one another. In 2019, RFC 8552 described the convention of naming with underscored labels, and introduced a registry to avoid collisions.
Underscore confusions. The early RFCs said that domain names should be as general as possible, but confusing wording misled many people. A much-quoted statement from RFC 882 seems to say that underscores are not permitted in the labels that comprise domain names: “The labels must follow the rules for ARPANET host names. They must start with a letter, end with a letter or digit, and have as interior characters only letters, digits, and hyphen.” This statement, however, seems on closer reading to be an informal explanation for a grammar that is not intended to be mandatory: “The preferred syntax of domain names is given by the following BNF rules. Adherence to this syntax will result in fewer problems with many applications that use domain names (e.g., mail, TELNET).” This complicated position is elaborated in RFC 1035 which states:
“The DNS specifications attempt to be as general as possible in the rules for constructing domain names. The idea is that the name of any existing object can be expressed as a domain name with minimal changes. However, when assigning a domain name for an object, the prudent user will select a name which satisfies both the rules of the domain system and any existing rules for the object, whether these rules are published or implied by existing programs.”
Not surprisingly, this has confused even experts: a consortium of companies voted in a ballot to sunset the use of underscores in DNS names appearing in certificates, citing the statement in RFC 1035 that “labels must follow the rules for ARPANET host names”, which it took, incorrectly, to specify “the characters which may be used in DNS domain names.”
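To make the distinction concrete, here is a minimal Python sketch of the host-name rule quoted above from RFC 882. The function names are my own, chosen only for illustration, and the check covers just the quoted character rule (not length limits or any later relaxations). It shows which labels of a name fall outside the host-name rule while still being perfectly usable as domain-name labels.

```python
import re

# Host-name label rule as quoted from RFC 882: start with a letter, end with
# a letter or digit, interior characters only letters, digits, and hyphen.
HOSTNAME_LABEL = re.compile(r"[A-Za-z]([A-Za-z0-9-]*[A-Za-z0-9])?")


def is_hostname_label(label: str) -> bool:
    """Return True if the label satisfies the quoted host-name rule."""
    return HOSTNAME_LABEL.fullmatch(label) is not None


def non_hostname_labels(name: str) -> list[str]:
    """Return the labels of a dotted name that fall outside the host-name rule."""
    return [label for label in name.split(".") if not is_hostname_label(label)]
```

For example, `non_hostname_labels("_ldap._tcp.foo.com")` returns `["_ldap", "_tcp"]`, whereas `non_hostname_labels("mail.foo.com")` returns an empty list. That is exactly why underscored labels cannot collide with hostnames.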
CNAME records for DKIM. Use of CNAME resource records for DKIM is not (I believe), as I originally suggested above, to avoid the use of TXT records, but rather to provide an extra level of indirection so that a domain can delegate to a hosting service the job of assigning (and rotating) DKIM keys. The use of the _domainkey prefix alone would limit the number of TXT records returned, since the extended domain name has a different set of resource records associated with it than the base domain name.
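Both points can be illustrated with a toy in-memory lookup. This is only a sketch, not a real DNS client: the record table, the hosting-service name, the key strings, and the `sel1` selector label (real DKIM names carry a selector label in front of `_domainkey`) are all made up for illustration.

```python
# Toy record store mapping (name, type) to values. Every name and value here
# is hypothetical; nothing is fetched from real DNS.
RECORDS = {
    ("foo.com", "TXT"): ["v=spf1 -all", "some-site-verification=abc"],
    ("sel1._domainkey.foo.com", "CNAME"): ["sel1.foo.dkim.hosting.example"],
    ("sel1.foo.dkim.hosting.example", "TXT"): ["v=DKIM1; k=rsa; p=KEY_A"],
}


def lookup_txt(name: str, records: dict, max_hops: int = 8) -> list[str]:
    """Return the TXT values for a name, following CNAME records first."""
    for _ in range(max_hops):
        cname = records.get((name, "CNAME"))
        if cname is None:
            # No further indirection: answer from this name's own TXT records.
            return records.get((name, "TXT"), [])
        name = cname[0]
    raise RuntimeError("too many CNAME hops")
```

Here `lookup_txt("sel1._domainkey.foo.com", RECORDS)` returns only the DKIM record, not the two unrelated TXT records stored at `foo.com`. And if the hosting service replaces the value stored under its own name, the next lookup sees the new key without any change to the `foo.com` records, which is the indirection that makes delegated key rotation easy.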
Refining the concept design analysis. In summary, there are (at least) three distinct concepts in play here. First, there is the concept of a HierarchicalName, which allows a name space to be divided into separately managed zones. This concept is familiar from file systems, and from the structure of many web APIs (which use so-called “RESTful” names for resources). Second is the concept of IntentionalName, in which a name becomes a kind of specification. This also existed prior to DNS, although the concept has been less widely adopted. Third is the concept of Metadata, in which an object has a collection of properties associated with it; a photo, for example, has its capture time and exposure; a file has its creation time; a DNS domain has its domain key.
The lens of concept design helps us recognize that much of the richness of DNS that we have explored comes from the fact that three distinct concepts are being offered. The familiarity of these existing concepts should make DNS easier to understand.
What is unusual is that all three concepts are implemented by the same mechanism. In concept lingo, the second and third concepts are “piggybacked” onto the first concept, with the properties and specifiers of Metadata and IntentionalName respectively both represented as labels in a prefix of a domain name. Like many piggybacking designs, this is ingenious and solves some problems. In particular, it allowed DNS itself to remain unchanged, without the need for new resource types or mechanisms (although it did require the creation of a new registry to avoid name clashes).
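One way to picture the piggybacking is as a split of a single dotted name into two parts: a prefix carrying the specifiers or property keys, and the hierarchical name proper. The sketch below is my own illustration, not anything DNS software is required to do. It assumes that everything up to and including the rightmost underscored label belongs to the prefix, and the function name is hypothetical.

```python
def split_piggybacked(name: str) -> tuple[list[str], str]:
    """Split a dotted name into (prefix labels, base hierarchical name).

    Assumption for illustration: the prefix runs from the leftmost label up
    to and including the rightmost label that starts with an underscore.
    """
    labels = name.split(".")
    # Index of the rightmost underscored label, or -1 if there is none.
    last = max(
        (i for i, label in enumerate(labels) if label.startswith("_")),
        default=-1,
    )
    return labels[: last + 1], ".".join(labels[last + 1 :])
```

With this, `split_piggybacked("_ldap._tcp.foo.com")` gives `(["_ldap", "_tcp"], "foo.com")`, the IntentionalName case, and `split_piggybacked("_domainkey.foo.com")` gives `(["_domainkey"], "foo.com")`, the Metadata case. A plain hostname such as `www.foo.com` comes back with an empty prefix.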
The downside is that piggybacked concepts generally cannot be fully supported by a mechanism that was not designed for them. A full implementation of IntentionalName, for example, would allow wildcard specifiers, so that one could request not only a color printer (_printer._color.local.foo.com) or a monochrome printer (_printer._mono.local.foo.com) but also any printer (_printer.*.local.foo.com), whether color or monochrome. As noted in RFC 8552, DNS cannot support such wildcards. Another price paid for the piggybacking is some additional complexity involved in squeezing the new functionality into a Procrustean bed: here, the underscore, and all the confusion created about whether it is permitted.
Whether the DNS design is bad or good is not the main issue here, and now that I have a better appreciation of the tradeoffs, I am less inclined to insist that this piggybacking is a mistake. What my analysis shows, I hope, is that concept design can reveal the underlying issues and make clearer whatever tradeoffs are being made.