
On the surface, DNS seems pretty straightforward: it simply converts names to numbers, or retrieves other information about a domain. It is actually a huge and complicated subject, and books on it tend to be quite thick. Thankfully, for our purposes we don't need to understand very much at all - just how to create a query, send it to a server, and interpret the response. The most common query that a DNS server deals with is the A (address) query, which maps domain names to IP addresses (codeproject.com to 209.171.52.99, for example). System.Net.Dns.GetHostByName performs A lookups. Probably the next most common type of query is the MX (mail exchange) query, which returns the mail servers that accept mail for a domain.

Unlike many internet protocols, which are text based, DNS is a binary protocol. DNS servers are some of the busiest computers on the internet, and the overhead of string parsing would make a text-based protocol prohibitively expensive. To keep things fast and lean, UDP is the transport of choice, being lightweight and connectionless in comparison to TCP. To communicate with a DNS server, you simply throw a single UDP packet at it and it throws one back. Under the original DNS specification (RFC 1035), these packets cannot exceed 512 bytes in length. (Incidentally, many firewalls block UDP DNS packets larger than 512 bytes.)
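The single-packet exchange described above can be sketched in Python roughly as follows. This is only an illustration: the function name send_query, the timeout value, and the choice to reject oversized queries up front are assumptions of this sketch, not part of the original article.

```python
import socket

DNS_PORT = 53          # standard DNS port
MAX_UDP_PAYLOAD = 512  # classic DNS-over-UDP size limit


def send_query(query, server, port=DNS_PORT, timeout=3.0):
    """Send one UDP packet to a DNS server and return the single reply."""
    if len(query) > MAX_UDP_PAYLOAD:
        raise ValueError("DNS query exceeds 512 bytes")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)  # UDP gives no delivery guarantee, so never wait forever
        sock.sendto(query, (server, port))
        response, _sender = sock.recvfrom(MAX_UDP_PAYLOAD)
    return response
```

Because UDP is connectionless, a lost packet simply shows up as a timeout (socket.timeout), and the caller decides whether to retry.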

The diagram below shows the binary request I sent to my DNS server to look up the MX records for the domain microsoft.com, and the corresponding response I received. To do this, I sent a 31 byte UDP packet to port 53 of my DNS server. It replied with a 97 byte response, again from UDP port 53.

Both request and response share the same format, which starts with a 12 byte header block. The header begins with a two byte message identifier. This can be any 16 bit value and is echoed in the first two bytes of the response. It is useful because it lets us match up requests and responses, as UDP makes no guarantees about the order in which things arrive. After that follows a two byte status (flags) field, which in our request has just a single bit set, the recursion desired bit. Next comes a two byte value denoting how many questions there are in the request, in this case just 1. There then follow three more two byte values denoting the number of answers, name server records and additional records. As this is a request, all of these are zero.
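As a rough illustration of the header layout just described, the 12 bytes can be packed with Python's struct module. The function name build_header and the example identifier 0x1234 are made up for this sketch; the recursion desired bit is the value 0x0100 within the 16 bit status field.

```python
import struct

RECURSION_DESIRED = 0x0100  # the only bit set in our request's status field


def build_header(message_id, question_count=1):
    """Pack the 12 byte header of a DNS query (all fields big-endian)."""
    return struct.pack(
        ">HHHHHH",
        message_id & 0xFFFF,  # any 16 bit value; echoed back in the response
        RECURSION_DESIRED,    # status field
        question_count,       # number of questions
        0,                    # number of answers (zero in a request)
        0,                    # number of name server records
        0,                    # number of additional records
    )
```

For example, build_header(0x1234) yields the bytes 12 34 01 00 00 01 00 00 00 00 00 00.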

The rest of the request is our single question. A question consists of a variable length domain name, a two byte QTYPE and a two byte QCLASS, in that order. Domain names are treated as a series of labels, labels being the words between dots. In our example, microsoft.com consists of two labels, microsoft and com. Each label is preceded by a single byte specifying its length, and the name is terminated by a zero byte (a zero-length label), which is how microsoft.com comes to occupy 15 bytes. The QTYPE denotes the type of record to retrieve, in this example MX (value 15). The QCLASS is Internet (value 1).
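The question encoding can be sketched along these lines. The helper names encode_name and build_question are illustrative only; the constants 15 (MX) and 1 (Internet) are the standard QTYPE and QCLASS values.

```python
import struct

QTYPE_MX = 15   # mail exchange records
QCLASS_IN = 1   # Internet class


def encode_name(domain):
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    encoded = bytearray()
    for label in domain.strip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError("each label must be 1 to 63 bytes long")
        encoded.append(len(raw))  # single length byte
        encoded += raw            # followed by the label itself
    encoded.append(0)             # zero-length label terminates the name
    return bytes(encoded)


def build_question(domain, qtype=QTYPE_MX, qclass=QCLASS_IN):
    """Domain name, then two byte QTYPE, then two byte QCLASS."""
    return encode_name(domain) + struct.pack(">HH", qtype, qclass)
```

For microsoft.com the encoded name is 15 bytes and the whole question 19 bytes, which together with the 12 byte header gives the 31 byte request mentioned earlier.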

[Figure: Test Application]

The response we get back tells us that there are three inbound mail servers for the domain microsoft.com: maila.microsoft.com, mailb.microsoft.com and mailc.microsoft.com. All three have the same preference of 10. When sending mail to a domain, the mail server with the lowest preference value should be tried first, then the next lowest, and so on. In this case there is no difference in preference, so any of the three may be used. Let's look a bit more closely at the response.
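The ordering rule is easy to express in code. The sketch below assumes the MX records have already been parsed into (preference, hostname) pairs; that representation and the function name order_mail_servers are assumptions made for illustration.

```python
def order_mail_servers(records):
    """Return (preference, hostname) pairs, lowest preference value first.

    sorted() is stable, so servers with equal preference keep the order
    in which the DNS server listed them.
    """
    return sorted(records, key=lambda record: record[0])
```

With the three microsoft.com records, all at preference 10, the list comes back unchanged and any of them may be tried first.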

You may have noticed that the first 31 bytes of the response are very similar to the request, the only differences being in the status field (bytes 2 and 3) and the answer count (bytes 6 and 7). The answer count tells us that three answers follow in the response. I refer those who are interested in the makeup of the status field to RFC 1035, section 4.1.1, as I will not cover that here. You'll also notice that the question is echoed in the response, something which seems rather inefficient to me, but that's the standard. The first answer starts at byte 31 (0x1F).
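A response header can be unpacked the same way it was packed. In this sketch the names Header, parse_header, is_response and response_code are illustrative; the two masks used are the QR (response) bit, 0x8000, and the four bit response code, 0x000F, as laid out in RFC 1035 section 4.1.1.

```python
import struct
from collections import namedtuple

Header = namedtuple(
    "Header", "message_id flags questions answers name_servers additional"
)


def parse_header(message):
    """Unpack the 12 byte header found at the start of every DNS message."""
    if len(message) < 12:
        raise ValueError("a DNS message is at least 12 bytes long")
    return Header(*struct.unpack(">HHHHHH", message[:12]))


def is_response(header):
    """The top bit of the status field is set in responses."""
    return bool(header.flags & 0x8000)


def response_code(header):
    """The low four bits of the status field; 0 means no error."""
    return header.flags & 0x000F
```

A caller would typically check that message_id matches the request, that is_response is true and response_code is 0, and then read answers to learn how many records follow the echoed question.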

The first part of any answer embeds the question in it, so that if you ask more than one question you know to which question the answer refers. A shortened form is used: rather than repeating the domain microsoft.com explicitly here, which is wasteful when we've only got 512 bytes to play with, we reference the existing domain definition at byte 12 (0x0C). This requires just two bytes instead of 15 in our example. When examining the length byte which precedes a label, if the two most significant bits are set, this denotes a reference to a previously defined domain name, and no label follows here. The remaining six bits of that byte, together with the whole of the next byte, form a 14 bit offset giving the position in the message of the existing domain name. The record type and class follow, just as QTYPE and QCLASS do in the question, and then we start to see the part which is the answer.
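Reading a possibly compressed name might look like the sketch below. The function name read_name and the hop limit used to guard against malformed messages with looping pointers are choices made for this illustration.

```python
def read_name(message, offset):
    """Decode the domain name at offset, following compression pointers.

    Returns (name, next_offset), where next_offset is the position just
    after the name as it appears at the original offset.
    """
    labels = []
    resume = None  # where parsing continues after the first pointer
    hops = 0
    while True:
        length = message[offset]
        if (length & 0xC0) == 0xC0:  # two most significant bits set: a pointer
            if resume is None:
                resume = offset + 2  # a pointer always occupies two bytes
            # low six bits of this byte plus the next byte form the offset
            offset = ((length & 0x3F) << 8) | message[offset + 1]
            hops += 1
            if hops > len(message):
                raise ValueError("compression pointer loop")
        elif length == 0:            # zero-length label ends the name
            offset += 1
            break
        else:                        # ordinary length-prefixed label
            labels.append(message[offset + 1:offset + 1 + length].decode("ascii"))
            offset += 1 + length
    return ".".join(labels), (resume if resume is not None else offset)
```

Applied to the example response, reading at byte 31 follows the pointer C0 0C back to byte 12 and returns microsoft.com, with parsing resuming at byte 33.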

The next four bytes represent the TTL (time to live) of the record, in seconds. When a DNS server can't answer a question itself, it knows (or can find out) another server which can, and asks that. It will then cache this answer for a certain period to improve efficiency. Every record in the cache has a TTL, after which it will be discarded and re-fetched from elsewhere if needed.

The next two bytes give the size of the record data that follows, the next two the MX preference, and then comes the variable length domain name of the mail server. Here we only specify the mailc part of the domain, and then again reference the rest of the domain name at byte 12 (to produce mailc.microsoft.com). Two almost identical records follow for maila.microsoft.com and mailb.microsoft.com.
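Putting the pieces of an answer together, one MX record could be parsed roughly as follows. The names read_name and parse_mx_answer and the dictionary returned are assumptions of this sketch, and it handles only MX records.

```python
import struct

TYPE_MX = 15


def read_name(message, offset):
    """Decode a (possibly compressed) name; return (name, next_offset)."""
    labels, resume, hops = [], None, 0
    while True:
        length = message[offset]
        if (length & 0xC0) == 0xC0:  # pointer to an earlier name
            if resume is None:
                resume = offset + 2
            offset = ((length & 0x3F) << 8) | message[offset + 1]
            hops += 1
            if hops > len(message):
                raise ValueError("compression pointer loop")
        elif length == 0:
            offset += 1
            break
        else:
            labels.append(message[offset + 1:offset + 1 + length].decode("ascii"))
            offset += 1 + length
    return ".".join(labels), (resume if resume is not None else offset)


def parse_mx_answer(message, offset):
    """Parse one MX answer at offset; return (record, offset_of_next_answer)."""
    name, offset = read_name(message, offset)
    # type (2 bytes), class (2), TTL (4), size of record data (2)
    rtype, rclass, ttl, size = struct.unpack_from(">HHIH", message, offset)
    offset += 10
    if rtype != TYPE_MX:
        raise ValueError("not an MX record")
    (preference,) = struct.unpack_from(">H", message, offset)
    exchange, _ = read_name(message, offset + 2)
    record = {"name": name, "ttl": ttl, "preference": preference, "exchange": exchange}
    return record, offset + size  # the size field tells us where the next answer starts
```

Each answer in the example occupies 22 bytes, so calling parse_mx_answer three times, starting at byte 31 and feeding each returned offset into the next call, consumes the rest of the 97 byte response.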

Posted on 2008-08-07 23:26 by smildlzj. Category: Web Development
