Obtain WHOIS data field(s) without parsing?

The problem appears to be at least two-fold:

  • WHOIS responses do not share a common schema, and
  • there is a dearth of WHOIS clients able to parse WHOIS responses and to map their fields (e.g. using a suitable ontology) onto a single schema. The Ruby Whois project is the most extensive effort I have found. It aims to provide a parser for each of the 500+ different WHOIS servers, and its developers deserve immense credit, but it remains a work in progress.

This is a sorry state of affairs.

The IETF's proposed solution for this and other WHOIS woes is called the Registration Data Access Protocol (RDAP).

Quoting RFC 7485, which explains the rationale for RDAP:

In the domain name space, there were over 200 country code
Top-Level Domains (ccTLDs) and over 400 generic Top-Level Domains
(gTLDs) when this document was published. Different Domain Name
Registries may have different WHOIS response objects and formats.
A
common understanding of all these data formats was critical to
construct a single data model for each object.

(Emphasis mine.)

Unfortunately, whereas most (all?) TLD registries provide WHOIS servers for their subdomains, only one two TLD registries have so far formally fielded RDAP servers for their subdomains: CZNIC for .cz domains, and NIC Argentina for .ar domains. So, this is not (yet) a generally applicable solution across a wide range of TLDs. We can only hope that all the other registries will hurry up and field RDAP servers.

As for software, the only RDAP command line client for POSIX systems that I have found so far is nicinfo.


You may use python

 pip install whois

For instance,

#!/bin/python

import whois
print whois.whois('www.facebook.com')['city']