Monday, November 26, 2012

OpenShift Back End Services: The DNS - Concepts

Most of the time, DNS just works. I have heard some people (hardware lab techs mostly) talk about how they use IP addresses for their work because DNS is so unreliable.  Then I explain to them that (at least outside the lab) DNS is probably the most reliable and fundamental service on the modern internet.  Next to TCP/IP and the routing protocols, DNS is the most critical service.  Without DNS the rest of the net doesn't matter because no one can find anything.  But very seldom does DNS actually fail on a large scale.

For a system like OpenShift, DNS is life's blood.  The whole purpose of a PaaS is to make applications available to users.  OpenShift does that by adding a new DNS record each time an application is created.  The name portion of the record is crafted from the developer's namespace and application name.  The value portion directs a user to the node host which offers the application service.  But before the browser can find the application, the DNS resolver must find the DNS record.
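To make the naming idea concrete, here is a minimal Python sketch.  The "application-namespace.domain" layout, the function name and the sample domain are assumptions made for illustration; the text above only says that the name is crafted from the namespace and the application name.

```python
def application_fqdn(app_name: str, namespace: str, cloud_domain: str) -> str:
    """Build a DNS name for an application.

    Assumed layout (hypothetical): <app>-<namespace>.<cloud domain>
    """
    for label in (app_name, namespace):
        # Keep the labels simple so the joined name is a valid hostname.
        if not label.isalnum():
            raise ValueError(f"not a simple alphanumeric label: {label!r}")
    return f"{app_name}-{namespace}.{cloud_domain}".lower()


# Hypothetical application "blog" owned by namespace "llama".
print(application_fqdn("blog", "llama", "apps.example.com"))
```

Whatever the exact layout, the point is that the name is computed, so the record can be created mechanically at the moment the application is created.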

If you're going to run your own DNS it's important to understand how DNS services interact.

What's so hard about DNS?


Given the ubiquity of DNS and its critical function, I've been surprised at how much difficulty configuring it for OpenShift Origin has caused.  I think part of the issue is that ordinarily it works so well that few sysadmins and developers have to work with it in any depth.  Most people's experience with DNS consists of checking and setting the /etc/resolv.conf nameserver and search lists, and the occasional dig command to check if a zone is responding.

In most companies there are one or two of the IT folks who are "The DNS guys" (or girls?).  They manage the external DNS (which shouldn't change fast) and the internal DNS (which uses lots of DHCP for desktops, laptops, wireless).  They own the DNS domains, and getting new IP name/address assignments from them has a well defined process.  Getting a delegated sub-domain is generally a more involved process.  The DNS Guys don't like to do it (because they get the calls when your DNS breaks) so people who need DNS (like lab spaces) will make do with their own or go without.

A number of geeks like me have set up split DNS in their houses.  This requires DNS forwarding features, but not delegation.  There are even tools now like Dnsmasq which implement simple split DNS and combine DNS, DHCP and Dynamic DNS all in one nice relatively simple service. These are meant for small labs or home networks where they can control the entire DNS namespace.  They provide the hostname to IP address mapping of DNS and combine that with the host resolver configuration offered by DHCP. They only work at the bottom of the namespace hierarchy and they do not require delegation, as nothing in the local database is ever published outside that bottom layer zone. Again, this removes the need for the average system administrator to think much about what's happening behind the scenes.

And there's sometimes rather a lot going on behind the scenes.

The Domain Name Service Behind the Scenes


If you're familiar with DNS operations, you can skip this part.

If you've never managed a DNS hierarchy, you might think that to install a DNS service you just install the bind package, edit the configuration file, add some zone data and start the daemon. Done.  Right?  Not quite.  Unlike nearly all other typical database services, DNS requires the participation of other servers to work properly.

The DNS is a specialized distributed database with a hierarchical namespace.  Note something really important here.  I didn't say "Bind is..".  I said "The DNS is.." The DNS is ONE DATABASE. The data is distributed across the entire internet and the authority for portions of it is delegated to hundreds of thousands (or more) organizations and individuals, but if it weren't a single unified entity it wouldn't work.  The magic of the DNS is the way in which the namespace and data are "glued" together. To see how that works I'm going to walk through an example.

DNS Queries and the /etc/resolv.conf file


When I try to access a web site from my laptop, the first thing that happens is a DNS query to resolve the URL host name to the IP address of the destination host. My computer is going to send a DNS request to some server. A computer that answers DNS requests is called a nameserver.  I have to know the address of the nameserver.  If I only knew the name of the name server, I'd have to do a look-up query for that, and since I don't have the address of someone to ask I'd be stuck in a loop.

Fortunately, the pump is primed by the /etc/resolv.conf file.

The /etc/resolv.conf file contains a list of nameserver entries. This is a list of IP addresses.  Each of the addresses corresponds to a DNS nameserver host.

; generated by /usr/sbin/dhclient-script
search westford.example.com example.com
nameserver 192.168.4.2
nameserver 192.168.4.3
nameserver 172.30.41.12

When I make a request with a hostname, the resolver library on my laptop issues a query to the first nameserver in the list.  Say I wanted to visit openshift.example.com.  The resolver would send a query which asks essentially "tell me what you can about anything named 'openshift.example.com'".  You can simulate this with either the dig or host commands.
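As a rough illustration of what the resolver pulls out of that file, here is a small Python sketch that reads the sample shown above.  It is a simplification: it only handles the nameserver and search keywords and ignores everything else, and the function name is made up for this example.

```python
SAMPLE = """\
; generated by /usr/sbin/dhclient-script
search westford.example.com example.com
nameserver 192.168.4.2
nameserver 192.168.4.3
nameserver 172.30.41.12
"""


def parse_resolv_conf(text):
    """Return (nameservers, search_domains) from resolv.conf-style text."""
    nameservers, search = [], []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments (';' or '#').
        if not line or line[0] in ";#":
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) > 1:
            nameservers.append(fields[1])
        elif fields[0] == "search":
            search = fields[1:]
    return nameservers, search


nameservers, search = parse_resolv_conf(SAMPLE)
print("first nameserver to ask:", nameservers[0])
print("search list:", search)
```

The order of the nameserver lines matters: the first entry is the one that gets the query, and the others are only fallbacks.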

The dig and host commands


The dig and host commands are programs whose only purpose is to issue DNS queries and report the responses.  I tend to use host when all I need is the answer.  host has much simpler output and looks to me like it is designed for use in command-line scripts.  It responds with one line per record and each field in the output is space separated.

host www.example.com
www.example.com has address 192.0.43.10
www.example.com has IPv6 address 2001:500:88:200::10
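Because the fields are space separated, the output is easy to pick apart in a script.  The Python sketch below parses the two line shapes shown above and nothing else; the function name is invented for this example, and other kinds of host output (mail handlers, aliases, errors) are simply skipped.

```python
HOST_OUTPUT = """\
www.example.com has address 192.0.43.10
www.example.com has IPv6 address 2001:500:88:200::10
"""


def parse_host_output(text):
    """Return (name, record_type, address) tuples from `host` address lines."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        # Expected shapes:
        #   <name> has address <ipv4>
        #   <name> has IPv6 address <ipv6>
        if len(fields) >= 4 and fields[1] == "has" and fields[-2] == "address":
            record_type = "AAAA" if fields[2] == "IPv6" else "A"
            records.append((fields[0], record_type, fields[-1]))
    return records


for record in parse_host_output(HOST_OUTPUT):
    print(record)
```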

I use dig when I am verifying or diagnosing DNS operation.  By default dig prints a (mostly) human readable report of the entire DNS response from the nameserver.  This includes not only the requested records but the authority records which indicate where the answer ultimately came from.  The format is not especially human friendly or even string-parser-friendly, but once you learn to read it, it is very concise and informative.

dig www.example.com

; <<>> DiG 9.9.2-RedHat-9.9.2-2.fc17 <<>> www.example.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 30626
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 5

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;www.example.com.  IN A

;; ANSWER SECTION:
www.example.com. 172791 IN A 192.0.43.10

;; AUTHORITY SECTION:
example.com.  163951 IN NS a.iana-servers.net.
example.com.  163951 IN NS b.iana-servers.net.

;; ADDITIONAL SECTION:
b.iana-servers.net. 1044 IN A 199.43.133.53
b.iana-servers.net. 1044 IN AAAA 2001:500:8d::53
a.iana-servers.net. 1044 IN A 199.43.132.53
a.iana-servers.net. 1044 IN AAAA 2001:500:8c::53

;; Query time: 43 msec
;; SERVER: 10.11.255.156#53(10.11.255.156)
;; WHEN: Mon Nov 26 18:14:47 2012
;; MSG SIZE  rcvd: 196


In the examples that follow I'm mostly going to use dig though I may also trim them some to highlight the important parts.


A Name Lookup Example


This example is not at all contrived.  This kind of thing happens to me every day.. Really..  Sure it does.

Say I'm sitting at my desk, working on my laptop.  I'm a mid level sysadmin at Example.Com in the office in Boston, MA.   My desktop has a DNS name something like "llamadesk.boston.example.com".  My co-worker (the lucky SOB) is working from the beach outside the company office on Maui.  He posts a document file on a web server in the office there.  The web server hostname is "www.maui.example.com". I want to see the document so he sends me a URL for it and I dutifully paste it into my web browser address bar and hit the enter key.

My web browser is linked with the local resolver library (usually called libresolv). It has a function called getaddrinfo (used to be gethostbyname) which takes a hostname string as input and returns (among other things) an IP address associated with that name.  What happens between the call and the return is the interesting part.
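Python exposes the same call as socket.getaddrinfo, so a short sketch can show what the browser is asking for.  The helper name resolve and the default port are choices made for this example only.

```python
import socket


def resolve(hostname, port=80):
    """Return the sorted, de-duplicated addresses getaddrinfo reports for hostname."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # the address string is the first element of sockaddr.
    return sorted({sockaddr[0] for _family, _type, _proto, _canon, sockaddr in infos})


# A literal address needs no DNS query at all; it comes straight back.
print(resolve("127.0.0.1"))
```

Calling resolve("www.maui.example.com") instead would trigger the whole lookup sequence described in the rest of this section, and what comes back depends entirely on the nameservers your machine is configured to ask.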

The first thing the resolver library does is read /etc/resolv.conf. Then it crafts a query packet and sends it off to the first IP address in the nameserver list and waits for a response.
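For the curious, the query packet itself is small and simple.  The sketch below builds one by hand following the message layout in RFC 1035: a 12 byte header, then the name as length-prefixed labels, then the query type and class.  The function name and the fixed query id are arbitrary choices for illustration; a real resolver picks a random id and also handles retries and timeouts.

```python
import struct


def build_query(hostname, query_id=0x1234, qtype=1):
    """Build a minimal DNS query packet (qtype 1 is an A record query)."""
    # Header: id, flags (0x0100 = recursion desired),
    # then the question, answer, authority and additional counts.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # Question: name, query type, query class (1 = IN).
    question = qname + struct.pack("!HH", qtype, 1)
    return header + question


packet = build_query("www.maui.example.com")
print(len(packet), "bytes:", packet.hex())
```

Sending those bytes in a UDP datagram to port 53 of the first nameserver is all that "sends it off" means here.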

The nameserver is listening for query packets. It receives the packet and tries to find the best answer.

A nameserver, when looking at a query, can know one of three things:

  • I know the answer.
  • I don't know the answer, but I know the nameserver for a domain that contains the answer.
  • I don't know the answer or the domain, but I know where the root domain is.

A server which knows the answer is called the authoritative nameserver for the domain.  It will answer all queries for the contents of the domains it serves.

If the nameserver is not authoritative it then has a choice.  It can merely return a response which means "I don't know" or it can perform a recursive query.  Most of the nameservers which are at the edge of the DNS where desktops will be making queries will be configured to recurse.

So, my nearest (recursive) nameserver is in boston.example.com. It doesn't know about servers in maui.example.com. However, it does know that it's in example.com and it knows how to find the example.com nameservers.  These are servers in the NS records for example.com. So the nearby nameserver issues a query to one of the NS servers for example.com and asks for the NS records from the maui.example.com domain.  The reply will contain the names of the authoritative nameservers.  The nearby nameserver then requests the A records for the maui.example.com nameservers.  Now it knows someone who does know the answer.  It sends one final query to the maui nameserver for www.maui.example.com.  The maui nameserver returns the answer (or an error response) and the local nameserver returns the answer to my browser which can finally make a connection to the actual target host.

Did you follow all that?  See if this helps:

  1. llamadesk -> ns1.boston.example.com
    "tell me the IP address for www.maui.example.com"
  2. ns1.boston.example.com -> ns1.example.com
    "tell me who serves maui.example.com"
  3. ns1.boston.example.com -> ns1.example.com
    "tell me the IP address for ns1.maui.example.com"
  4. ns1.boston.example.com -> ns1.maui.example.com
    "tell me the IP address for www.maui.example.com"
  5. llamadesk <- ns1.boston.example.com
    "here's the IP address you asked for"

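The same walk can be sketched as a toy simulation in Python.  Nothing here touches the network: the "servers" are dictionaries, the addresses come from a documentation-only range, and all of the data is hypothetical.  It only shows the shape of the logic, where a server either answers or points at the server for a sub-domain.

```python
# Hypothetical zone data. Each "server" knows some addresses and some delegations.
SERVERS = {
    "ns1.example.com": {
        "delegations": {"maui.example.com": "ns1.maui.example.com"},
        "addresses": {"ns1.maui.example.com": "192.0.2.53"},
    },
    "ns1.maui.example.com": {
        "delegations": {},
        "addresses": {"www.maui.example.com": "192.0.2.80"},
    },
}


def lookup(name, start="ns1.example.com"):
    """Follow delegations from `start` until some server answers for `name`."""
    trace = []
    server = start
    while True:
        zone = SERVERS[server]
        if name in zone["addresses"]:
            trace.append(f"{server}: {name} is {zone['addresses'][name]}")
            return zone["addresses"][name], trace
        for child, child_ns in zone["delegations"].items():
            if name.endswith("." + child):
                trace.append(f"{server}: ask {child_ns} about {child}")
                server = child_ns
                break
        else:
            # No answer and no delegation that covers the name.
            trace.append(f"{server}: no answer for {name}")
            return None, trace


address, trace = lookup("www.maui.example.com")
print("\n".join(trace))
```

The recursive nameserver in Boston plays the part of the lookup function: it does the legwork and hands only the final answer back to the client.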
Glue Records: Binding the Internet Together

The links that make the DNS work are known as glue records. The process of establishing a link between one layer in the hierarchy and the next layer down is called delegation.

The nameserver at the example.com level has to know about all of the sub-domains below example.com.  It must have two types of records for each sub-domain.  It must have a set of NS records which contain the DNS name of the authoritative servers for the sub-domain.  Since NS records return the hostname of the nameservers, the parent must also provide an A record for each nameserver.
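A quick way to see the rule is a check for delegations that are missing their address records.  This Python sketch applies the rule exactly as stated above (every nameserver named in an NS record needs an A record in the parent); the function name and the sample data are hypothetical.

```python
def missing_glue(ns_records, a_records):
    """Return nameserver hostnames named in NS records that have no A record.

    ns_records: {sub-domain: [nameserver hostnames]}
    a_records:  {hostname: address}
    """
    needed = {ns for servers in ns_records.values() for ns in servers}
    return sorted(needed - set(a_records))


# Hypothetical parent-zone data for example.com.
ns_records = {
    "maui.example.com": ["ns1.maui.example.com", "ns2.maui.example.com"],
    "boston.example.com": ["ns1.boston.example.com"],
}
a_records = {
    "ns1.maui.example.com": "192.0.2.53",
    "ns1.boston.example.com": "192.0.2.54",
}

print(missing_glue(ns_records, a_records))
```

A delegation with a missing address record looks fine from inside the sub-domain and broken from everywhere else, which is part of what makes it hard to spot.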

As noted in other places, the technical aspects of delegation are much less significant than the political or organizational aspects.  Delegation requires the establishment of a relationship of communication and of trust between groups that may normally be somewhat territorial. Once a sub-domain is delegated it is the responsibility of the receiving administrator to ensure that the domain is always available, so that the names it contains remain accessible, and to be able to accept and respond to problem reports.

Development and Test Environments: Rogue DNS

"Rogue DNS" is the term I use for an undelegated DNS zone.  Some people try to soften the term but I think "rogue" carries just the right connotations.  A rogue is an independent, slightly unsavory character who none the less is capable and possibly even attractive in a "bad boy" sort of way.  A rogue is not always a bad guy and sometimes it takes a rogue to save the day.

Every NAT network which includes split DNS would be considered "rogue" under this definition.  That's pretty much every home network and most commercial business networks today.  Rogues aren't all bad.

Rogue zones are also common in testing and small or personal development environments.  They don't require any negotiation. They're pretty much required for demos or livecd try-it-out implementations.

The establishment of a rogue zone is really easy: Just create the servers and start adding resource records to the zone. The problem is that without delegation, the rogue zone is invisible to everyone else. 

The real problem with rogue DNS is that every client that wants to participate in the zone must be manually re-configured to see the rogue.  The first nameserver in the client /etc/resolv.conf must be one of the rogue name servers.  In a typical NAT environment the owner of the DNS also owns the DHCP services. Since the DHCP server also provides dynamic nameserver information, all of the DHCP clients automatically participate in the DNS as well.  In a lab setting, the lab administrators may control the DHCP as well.

Rogues and DNS Forwarding


The other problem with a Rogue DNS service is that the rogue, because it is not delegated, does not know about anything outside itself.  Bind does have a facility for "forwarding" requests.  This is commonly used in NAT environments.

When a forwarding server gets a query for which it is not authoritative, rather than trying to recurse, it will forward the request directly to one of a list of "upstream" servers.  These are usually the servers that would normally have been in the nameserver host's /etc/resolv.conf.
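The decision a forwarding server makes can be sketched in a few lines of Python.  This is a simplified model and not how Bind is implemented: the function name is made up, and ask_upstream stands in for whatever actually sends the query to an upstream server.

```python
def answer_or_forward(name, local_zone, local_records, forwarders, ask_upstream):
    """Answer from local data when authoritative, otherwise try each forwarder."""
    if name == local_zone or name.endswith("." + local_zone):
        # Authoritative: answer from our own data (None means no such name).
        return local_records.get(name)
    for upstream in forwarders:
        reply = ask_upstream(upstream, name)
        if reply is not None:
            return reply
    return None


# Hypothetical lab zone with one record and two upstream servers.
def fake_upstream(upstream, name):
    # Pretend only the second upstream server responds.
    return "198.51.100.7" if upstream == "192.0.2.2" else None


records = {"www.lab.example.com": "10.0.0.5"}
forwarders = ["192.0.2.1", "192.0.2.2"]
print(answer_or_forward("www.lab.example.com", "lab.example.com", records, forwarders, fake_upstream))
print(answer_or_forward("www.example.org", "lab.example.com", records, forwarders, fake_upstream))
```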

Dynamic DNS


The final concept to cover is Dynamic DNS.  As noted in the previous post, OpenShift depends on the ability to add and remove resource records from a DNS zone.

In most corporate DNS services the zone files are fairly static.  They are often mechanically generated from some other database on a regular basis.  It is common for DNS updates to require from 1 or 2 hours to as much as 24 hours.   OpenShift requires the updates to be applied instantly and propagation times of more than a few seconds are considered unacceptable performance.

The one exception is DNS assigned from DHCP.  Microsoft Active Directory is especially good at this.  Dnsmasq is a combined DNS/DHCP/TFTP service designed for home and small business NAT networks.  When it assigns an IP address it can also bind the address to a hostname requested by the client.  It is also possible to connect ISC Bind and ISC DHCP to do Dynamic DNS.

OpenShift does Dynamic DNS through the DNS plugin.  I want to say "plugins", but right now there is only one DNS plugin.  I've written a few posts on writing a new DNS plugin, but that series still needs its last few posts, and the sample I picked will only be useful for labs.  Personally I think we need plugins for the greatest possible variety of external DNS services, from Microsoft Active Directory DNS to commercial services.

DNS Update services will all have similar communications requirements: server and access information, as well as the zone to update and the new resource record content.
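Those requirements fit in a small data structure.  The Python sketch below is not the OpenShift plugin interface; the class, its field names and the sample values are all hypothetical.  It renders the update as input for the nsupdate tool that ships with Bind, and it leaves the access information out: with nsupdate that would normally be supplied separately, for example as a key file given with the -k option.

```python
from dataclasses import dataclass


@dataclass
class DnsUpdate:
    """What an update needs: where to send it, which zone, and the new record."""
    server: str
    zone: str
    name: str
    ttl: int
    rtype: str
    value: str

    def nsupdate_script(self):
        """Render the update as commands for nsupdate's standard input."""
        return "\n".join([
            f"server {self.server}",
            f"zone {self.zone}",
            f"update add {self.name} {self.ttl} {self.rtype} {self.value}",
            "send",
        ]) + "\n"


# Hypothetical application record pointing at a hypothetical node host.
update = DnsUpdate(
    server="ns1.example.com",
    zone="apps.example.com",
    name="blog-llama.apps.example.com",
    ttl=60,
    rtype="CNAME",
    value="node1.example.com.",
)
print(update.nsupdate_script())
```

A plugin for a different DNS service would carry the same pieces of information and differ only in how it delivers them.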

Closing


I'm sure you'll agree I've lectured enough on the relevant capabilities and behaviors of DNS.  I mostly went through this exercise to be sure I hadn't missed anything myself.

You'll notice that I refer a lot to RFCs (Request for Comments).  These are the official specifications for the behavior of parts of the internet.  A lot of people find the idea of the RFCs intimidating. They're dense and bland.  They're also your friend. Don't be scared to go looking for information you need.  You don't have to read them like a good novel, but it's good to scan the relevant documents and at least know where to find answers.  I think a lot of people also skip the RFCs because people are looking for "how do I do it".  The RFCs only tell you "How does it work".  I think often the latter helps illuminate the former.

When I go looking for an RFC, I usually don't know the right one to look for.  Use the search engines: Google for your topic and add "RFC" to the beginning of the query and you'll very likely get a good reference.

Scan the RFCs. You'll be glad you did.

The next post will describe the creation of an authoritative DNS server using ISC Bind 9.  As I go along I mean to include not only the configuration steps but to demonstrate some tools and resources for checking the status of the service and for diagnosing any problems that might arise.

References

RFCs specifically significant to OpenShift:
  • RFC 1033 - Domain Administrators Operations Guide
  • RFC 1034 - Domain Names - Concepts and Facilities
  • RFC 1035 - Domain Names - Implementation and Specification
  • RFC 2136 - DNS Update
  • RFC 2845 - Secret Key Transaction Authentication for DNS (TSIG)



3 comments:

  1. Thanks, I think now I can talk about delegation without that nagging voice of uncertainty in my head.

  2. Nice article . DNS means Domain name Server or Domain name system.DNS helps to convert the ip address into domain name.Domain name into ip address.You cannot remember many website ip address at same time.So domain name is used for remembering the website.
    For example 74.125.239.17 is one ip address of Google.com.It is very difficult to remember always.So you can use domain name to view the particular website.The primary work of DNS is if you give ip address in browser ,it search for domain name of particular ip address.If you give domain name in the address bar of the browser it convert the domain name to ip address(Search for the particular ip address).You can check the domain owner (who is already registered with the domain name that you want) information via Whoisxy.com . It also displays domain registrant phone number , address ,mail etc..
