A Really Stupid DNS Server

2025-06-03

I manage my certificates with Let's Encrypt, and for a while now I've been using the DNS challenge for its ability to get wildcard certificates. This makes it easy for me to set up a new subdomain, without having to reconfigure my certificate.

I used to manage the TXT records through Cloudflare's API, but some time ago I switched to an NS record for _acme-challenge.kgugeler.ca that points to my VPS. When I want to renew my certificate, I spawn a public BIND instance. This method is mentioned by Let's Encrypt.

Recently I had a bad idea. BIND is a complex DNS server, right? What if I could replace it with a small server? After all, it's not doing much. What could go wrong?

DNS

Ok, so I know that DNS uses UDP over port 53, and I know it has a recursive structure, and I know about the different record types but... how do I make a DNS query? What does it look like on the wire?

Thankfully Wikipedia has me covered. We've got a short header, and then a list of questions, answers, authority records, and additional records. The questions are sent in queries and repeated in responses. The answers are the records that answer the questions. Authority records are what we get when the server redirects us to other name servers, and additional records "relate to the query but are not strictly answers for the question". Let's look at a simple DNS query to the root name servers:

> dig @a.root-servers.net kgugeler.ca

; <<>> DiG 9.20.9 <<>> @a.root-servers.net kgugeler.ca
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 65458
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 4, ADDITIONAL: 9
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;kgugeler.ca.			IN	A

;; AUTHORITY SECTION:
ca.			172800	IN	NS	d.ca-servers.ca.
ca.			172800	IN	NS	any.ca-servers.ca.
ca.			172800	IN	NS	c.ca-servers.ca.
ca.			172800	IN	NS	j.ca-servers.ca.

;; ADDITIONAL SECTION:
d.ca-servers.ca.	172800	IN	A	45.142.220.101
d.ca-servers.ca.	172800	IN	AAAA	2a0e:dbc0::101
any.ca-servers.ca.	172800	IN	A	199.4.144.2
any.ca-servers.ca.	172800	IN	AAAA	2001:500:a7::2
c.ca-servers.ca.	172800	IN	A	185.159.196.2
c.ca-servers.ca.	172800	IN	AAAA	2620:10a:8053::2
j.ca-servers.ca.	172800	IN	A	198.182.167.1
j.ca-servers.ca.	172800	IN	AAAA	2001:500:83::1

We see that there are 4 authority records pointing us to the nameservers for the CA TLD - any of the four will do. Those nameservers live under ca. themselves, so we'd have no way of looking up their IPs, which is why those IPs appear in the additional section¹.

It's also worth noting that dig requested recursion, meaning it requested the root nameserver to walk the DNS hierarchy for it, and the root nameserver (naturally) declined - doing recursive DNS processing safely is hard.
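To make the "short header" concrete, here's a minimal sketch that packs the header of the response dig printed above. I'm assuming the standard 12-byte layout (ID, a 16-bit flags word, then the four section counts) and treating the flags as an opaque value; 0x8100 is what "qr rd" works out to. The names pack_header and unpack_header are just for illustration.

```python
import struct

# ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT: six big-endian 16-bit fields
HEADER = struct.Struct(">HHHHHH")


def pack_header(txid, flags, qdcount, ancount, nscount, arcount) -> bytes:
    return HEADER.pack(txid, flags, qdcount, ancount, nscount, arcount)


def unpack_header(message: bytes) -> dict:
    if len(message) < HEADER.size:
        raise ValueError("message shorter than a DNS header")
    names = ("id", "flags", "questions", "answers", "authority", "additional")
    return dict(zip(names, HEADER.unpack_from(message)))


# Values taken from the dig output: id 65458, flags "qr rd", counts 1/0/4/9
header = pack_header(65458, 0x8100, 1, 0, 4, 9)
print(header.hex(" "))  # ff b2 81 00 00 01 00 00 00 04 00 09
print(unpack_header(header))
```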

Question

A question in the question section is simple enough: we need a name, a type of record, and a class. For our purposes the class will always be IN, the internet class.

Err, hold on, how is that name encoded?

The domain name is broken into discrete labels which are concatenated; each label is prefixed by the length of that label.

That's not very specific... thankfully the RFC has me covered. Each component of the domain is encoded as a single length byte followed by the content. Don't forget about the root domain, "."! So kgugeler.ca becomes 08 6b 67 75 67 65 6c 65 72 02 63 61 00, where the last zero byte indicates a component of length zero, the root².
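As a quick sanity check, the encoding fits in a few lines of Python. This is a sketch rather than the code from my server; encode_name is a name I made up here, and it only handles plain ASCII labels.

```python
def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending with the root."""
    stripped = name.rstrip(".")
    out = bytearray()
    # An empty string here means the caller asked for the root itself
    for label in stripped.split(".") if stripped else []:
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError(f"bad label length: {label!r}")
        out.append(len(raw))
        out += raw
    out.append(0)  # zero-length label: the root
    return bytes(out)


print(encode_name("kgugeler.ca").hex(" "))
# 08 6b 67 75 67 65 6c 65 72 02 63 61 00
```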

Resource Records

These are pretty similar - we need a name, type, and class, just like questions. Then we need a 4-byte TTL and the actual data, which is encoded as a 2-byte length followed by the value. The actual value format is type-specific.

For the purposes of this post, only TXT records matter, and they are encoded as length + value. This means that a TXT record for kgugeler.ca looks like the following:

// kgugeler.ca.
08 6b 67 75 67 65 6c 65 72 02 63 61 00
// TXT is type 16
00 10
// IN is class 1
00 01
// TTL of one minute = 60 seconds
00 00 00 3c
// Our data has length 4
00 04
// Our data is encoded as one length byte + value
// This says "DNS"
03 44 4e 53

Note that the TXT record ends up having two length fields: the outer one is the two-byte data length, and the inner one is the single length byte of the string itself.
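The byte dump above can be reproduced with struct. Again a sketch: encode_txt_record is a hypothetical helper, it takes the owner name already in wire format, and it only supports a single string of at most 255 bytes.

```python
import struct

TYPE_TXT = 16
CLASS_IN = 1


def encode_txt_record(owner: bytes, text: bytes, ttl: int = 60) -> bytes:
    """Build a TXT resource record; owner is an already-encoded name."""
    if len(text) > 255:
        raise ValueError("a single TXT string holds at most 255 bytes")
    rdata = bytes([len(text)]) + text  # inner length + value
    # type, class, TTL (4 bytes), then the outer data length
    fixed = struct.pack(">HHIH", TYPE_TXT, CLASS_IN, ttl, len(rdata))
    return owner + fixed + rdata


owner = bytes.fromhex("086b677567656c657202636100")  # kgugeler.ca.
record = encode_txt_record(owner, b"DNS")
print(record[len(owner):].hex(" "))
# 00 10 00 01 00 00 00 3c 00 04 03 44 4e 53
```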

Name Compression

So far, the protocol seems simple to parse and generate, right? Sure, there are a lot of resource record types, but other than that, super easy!

In order to reduce the size of messages, the domain system utilizes a compression scheme which eliminates the repetition of domain names in a message. In this scheme, an entire domain name or a list of labels at the end of a domain name is replaced with a pointer to a prior occurance of the same name.

Uh, what?

So it turns out that by enforcing that the lengths of domain name components are at most 63 bytes, we can use the upper two bits to signal some other thing. In this case, if the upper two bits are both one, then the next 14 bits (two bytes total) are used as a "pointer" into the message. Any other upper bit pattern is reserved.

The pointer is just an offset from message start. The idea is that if we have a question kgugeler.ca IN TXT, we have to spell out kgugeler.ca. But in the answer, when we need to put the name down again, we can use a pointer:

// Question (the header is 12 bytes, so this name starts at offset 12)
// kgugeler.ca.
08 6b 67 75 67 65 6c 65 72 02 63 61 00
// ...
// Answer: this is a pointer to offset 12 from message start
c0 0c

We can also have only some suffix be a pointer:

// hi + pointer to offset 12
02 68 69 c0 0c

This is a useful feature! Unfortunately, the RFC does not make any suggestions on how to restrict this to ease processing. For instance, if pointers always had to point to smaller offsets, then termination is guaranteed. Or if pointers couldn't point at other pointers, then termination is guaranteed since each pointer will always point to something that increases the length of the domain name, and the domain name has a limit of 255.

I decided to assume that pointers won't point at other pointers, which seems reasonable.
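For comparison, here's a sketch of the other restriction mentioned above, where every pointer has to land strictly before the place we started reading from. This is not what my parser does; decode_name is a hypothetical function, and the example message assumes a 12-byte header so that the question name sits at offset 12.

```python
def decode_name(message: bytes, offset: int) -> tuple[list[bytes], int]:
    """Return (labels, offset just past the name where it appeared)."""
    labels = []
    total = 0
    end = None      # where parsing resumes once the name is done
    limit = offset  # pointers must target an offset below this
    while True:
        if offset >= len(message):
            raise ValueError("name runs past end of message")
        length = message[offset]
        if length == 0:
            offset += 1
            break
        if length & 0xC0 == 0xC0:
            if offset + 1 >= len(message):
                raise ValueError("truncated pointer")
            target = (length & 0x3F) << 8 | message[offset + 1]
            if target >= limit:
                raise ValueError("pointer does not point backwards")
            if end is None:
                end = offset + 2  # only the first pointer decides this
            offset = limit = target  # limit shrinks, so this terminates
            continue
        if length > 0x3F:
            raise ValueError("reserved label type")
        if offset + 1 + length > len(message):
            raise ValueError("truncated label")
        total += length + 1
        if total > 255:
            raise ValueError("name too long")
        labels.append(message[offset + 1 : offset + 1 + length])
        offset += 1 + length
    return labels, (end if end is not None else offset)


name = bytes.fromhex("086b677567656c657202636100")
# 12-byte header, question (name + type + class), then "hi" + pointer to 12
message = bytes(12) + name + b"\x00\x10\x00\x01" + b"\x02hi\xc0\x0c"
print(decode_name(message, 12))
print(decode_name(message, 29))
```

Since limit strictly decreases on every pointer and the labels in between are capped at 255 bytes in total, a packet whose pointer points at itself gets rejected instead of looping forever.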

Parsing

Okay, now we can finally parse messages! Here's the segment of the parsing code that actually handles names. The rest is pretty boring, but you can see it here.

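import struct
from dataclasses import dataclass


# Minimal stand-in for the Name type, which is defined in the full source
@dataclass
class Name:
    components: list[bytes]


class Consumer: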
    def __init__(self, packet: bytes):
        self.packet = packet
        self.offset = 0

    def consume_bytes(self, size: int) -> bytes:
        assert self.offset + size <= len(self.packet)
        data = self.packet[self.offset : self.offset + size]
        self.offset += size
        return data

    def consume_ubyte(self) -> int:
        return struct.unpack(">B", self.consume_bytes(1))[0]

    def consume_name(self) -> Name:
        components = []
        total_length = 0

        label_length = self.consume_ubyte()
        while 0 < label_length <= 0x3F:
            total_length += label_length
            assert total_length <= 255
            components.append(self.consume_bytes(label_length))
            label_length = self.consume_ubyte()

        if label_length != 0:
            assert label_length & 0xC0 == 0xC0
            offset = (label_length & 0x3F) << 8 | self.consume_ubyte()
            components.extend(self.resolve_label_pointer(offset, total_length))

        return Name(components=components)

    def resolve_label_pointer(self, offset, total_length) -> list[bytes]:
        # Store global offset for restoring later
        self.offset, offset = offset, self.offset

        components = []

        # Disallow double pointer to avoid DoS
        allow_pointer = False

        while True:
            first_byte = self.consume_ubyte()
            if first_byte == 0:
                break
            elif first_byte <= 0x3F:
                total_length += first_byte
                assert total_length <= 255
                components.append(self.consume_bytes(first_byte))
                allow_pointer = True
            else:
                assert allow_pointer, "Pointer to pointer in message"
                assert first_byte & 0xC0 == 0xC0
                # Set the global offset and continue
                self.offset = (first_byte & 0x3F) << 8 | self.consume_ubyte()

        # Restore global offset
        self.offset, offset = offset, self.offset

        return components

Debugging this code wasn't too bad, since we can send a request to the server with dig and look through it with either Python or Wireshark.

Serialization is even easier, since we aren't obligated to output pointers.

Flags

One thing I neglected to mention earlier is the message flags. This is a 16-bit field in the header that contains several flags, including whether the message is a query or reply, the OPCODE, whether recursion is desired by the client, whether recursion is available on the server, and the response code. This isn't too important for parsing, but we need to know this format for generating our responses. I mostly didn't bother checking the flags that clients send.
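Here's a sketch of how that 16-bit field breaks down, following the bit layout in RFC 1035: QR, a 4-bit OPCODE, AA, TC, RD, RA, three reserved bits, and a 4-bit RCODE, from most to least significant. The function names are mine, for illustration only.

```python
def pack_flags(qr=0, opcode=0, aa=0, tc=0, rd=0, ra=0, rcode=0) -> int:
    """Assemble the 16-bit flags word; the three reserved bits stay zero."""
    return (
        (qr << 15) | (opcode << 11) | (aa << 10)
        | (tc << 9) | (rd << 8) | (ra << 7) | rcode
    )


def unpack_flags(flags: int) -> dict:
    return {
        "qr": (flags >> 15) & 1,        # 0 = query, 1 = response
        "opcode": (flags >> 11) & 0xF,  # 0 = standard query
        "aa": (flags >> 10) & 1,        # authoritative answer
        "tc": (flags >> 9) & 1,         # truncated
        "rd": (flags >> 8) & 1,         # recursion desired
        "ra": (flags >> 7) & 1,         # recursion available
        "rcode": flags & 0xF,           # 0 = NOERROR, 3 = NXDOMAIN
    }


# The root server's reply earlier: "qr rd", NOERROR
print(hex(pack_flags(qr=1, rd=1)))  # 0x8100
# A plain authoritative reply
print(hex(pack_flags(qr=1, aa=1)))  # 0x8400
```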

Rewriting it in Rust

For some insane reason, I thought that it would be easier to write this parser in Rust³, since Rust has helpers for dealing with conversions to/from binary data. It was about the same. I don't know what I expected... but I did make a fun realization while writing it in Rust.

Throwing the Parser out the Window

Wait, why are we writing a DNS server again?

ACME

The Automated Certificate Management Environment (ACME) protocol is used to request a certificate. Let's Encrypt then issues a challenge, in our case a DNS challenge: we have to prove that we control the domain by placing a TXT record in the domain's DNS records.
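Certbot works out the TXT value for me, but for reference, RFC 8555 defines it as the unpadded base64url encoding of the SHA-256 digest of the "key authorization", served at _acme-challenge.<domain>. A small sketch, where the key authorization string is a made-up placeholder and dns01_txt_value is a hypothetical name:

```python
import base64
import hashlib


def dns01_txt_value(key_authorization: str) -> str:
    """TXT value for a DNS-01 challenge: base64url(SHA-256(key auth))."""
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    # base64url without the trailing "=" padding
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


# Placeholder input; a real one is "<token>.<account key thumbprint>"
value = dns01_txt_value("example-token.example-thumbprint")
print(len(value), value)
```

A SHA-256 digest always encodes to 43 characters this way, so the value fits comfortably in a single TXT string.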

Who needs a Parser

Hold on, if all we need to do is serve a record from a single domain name, why do we need a parser? We can just ignore what clients ask entirely!

Well, okay, not entirely. The first two bytes of the query are a transaction ID, and if that's different in our reply, DNS clients won't recognize our response⁴.
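The whole trick fits in one function. This is a sketch of the idea rather than my actual server: reply_to and serve are hypothetical names, port 5353 is just an unprivileged stand-in for 53, and the canned body here is a placeholder header with no records in it.

```python
import socket
import struct
from typing import Optional


def reply_to(query: bytes, canned_body: bytes) -> Optional[bytes]:
    """Copy the 2-byte transaction ID and append a pregenerated response."""
    if len(query) < 2:
        return None  # not even an ID to echo back, so drop it
    return query[:2] + canned_body


def serve(canned_body: bytes, host: str = "127.0.0.1", port: int = 5353):
    """Answer every datagram with the same body (runs until killed)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while True:
            query, addr = sock.recvfrom(512)
            response = reply_to(query, canned_body)
            if response is not None:
                sock.sendto(response, addr)


# Placeholder body: flags 0x8400 (response, authoritative), all counts zero
body = struct.pack(">HHHHH", 0x8400, 0, 0, 0, 0)
print(reply_to(b"\xab\xcd" + b"whatever the client asked", body).hex(" "))
# ab cd 84 00 00 00 00 00 00 00 00 00
```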

What happens if a client asks a different question?

> dig @127.0.0.1 sourcehut.org
;; ;; Question section mismatch: got kgugeler.ca/TXT/IN

Dig is not happy⁵.

Does it work?

Yes, it works! I can actually get a certificate provisioned this way. It's not pretty, since there's no easy way to dynamically add a new record yet, but it works!

One interesting thing I found out while doing this is that Let's Encrypt queries the DNS server from multiple locations, in order to make BGP hijacking attacks harder to pull off.

Automating it

Naturally, my first thought here is to use Certbot hooks to spawn the server. There are two main issues here:

I need to spawn the server in the background and wait for the cleanup hook to stop it. A natural choice is to use OpenRC since I'm on Alpine, but then the service needs to know the challenge token to serve.
I want a certificate for kgugeler.ca and *.kgugeler.ca... but that means I need to put two records on _acme-challenge.kgugeler.ca, so I need to support that.
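Supporting two records mostly means bumping the answer count and repeating the answer. Here's a sketch of a pregenerated response body (everything after the transaction ID) with any number of TXT answers. The helper names are hypothetical; each answer uses a c0 0c pointer back to the question name, which sits at offset 12 once the ID is prepended.

```python
import struct

TYPE_TXT = 16
CLASS_IN = 1


def encode_name(name: str) -> bytes:
    """Length-prefixed labels plus the terminating root label."""
    labels = name.rstrip(".").split(".")
    return b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"


def build_response_body(name: str, texts: list[bytes], ttl: int = 1) -> bytes:
    """Everything after the transaction ID: rest of header, question, answers."""
    flags = 0x8400  # response + authoritative answer
    # flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT
    header_rest = struct.pack(">HHHHH", flags, 1, len(texts), 0, 0)
    question = encode_name(name) + struct.pack(">HH", TYPE_TXT, CLASS_IN)
    answers = b""
    for text in texts:
        rdata = bytes([len(text)]) + text
        answers += (
            b"\xc0\x0c"  # pointer to the question name at offset 12
            + struct.pack(">HHIH", TYPE_TXT, CLASS_IN, ttl, len(rdata))
            + rdata
        )
    return header_rest + question + answers


# Two placeholder 43-byte challenge values, one per requested name
body = build_response_body("_acme-challenge.kgugeler.ca", [b"a" * 43, b"b" * 43])
print(len(body), struct.unpack(">HHHHH", body[:10]))
```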

Rewriting it in Go??

I was briefly tempted to write the server in Go and integrate with lego, since it has a Go library. I eventually decided not to, since then I'd have to remember to rebuild every time lego updates, and I'm lazy.

Control Socket

An easier solution is to simply spawn the server without any TXT records configured. Then, use a control socket to add challenges to the domain.

I was initially going to use a TCP socket with a simple length + value encoding, but it's even easier to just use a UDP socket and have the whole packet be the record to add. Dead simple.
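A sketch of what "the whole packet is the record" could look like. The function names and the control port 5354 are assumptions for illustration; in a Certbot manual auth hook, the value to send would come from the CERTBOT_VALIDATION environment variable.

```python
import socket


def handle_control_packet(records: list[bytes], packet: bytes) -> bool:
    """Treat one control datagram as one TXT value; return True if added."""
    if not 0 < len(packet) <= 255:
        return False  # wouldn't fit in a single TXT string
    if packet in records:
        return False  # already being served
    records.append(packet)
    return True


def send_control(value: str, host: str = "127.0.0.1", port: int = 5354) -> None:
    """Hook side: fire one datagram containing just the challenge value."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(value.encode("ascii"), (host, port))


records: list[bytes] = []
handle_control_packet(records, b"first-challenge-value")
handle_control_packet(records, b"second-challenge-value")
handle_control_packet(records, b"first-challenge-value")  # duplicate, ignored
print(records)
```

UDP keeps datagram boundaries intact, which is why no length prefix is needed. The obvious catch is that the control port should only listen on localhost.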

Now the flow goes like this:

Certbot hook runs, the server is started if it hasn't been already. A record gets added.
The hook is run again, adding another record via the control socket.
Let's Encrypt verifies the record.
The cleanup hook stops the server.

This is much nicer to package and script!

But... why?

One reason Let's Encrypt gives for having a separate nameserver for the _acme-challenge subdomain is security: no need to have a DNS API credential on your VPS. And indeed, this is a benefit. The primary reason I swapped to BIND in the first place, though, was speed: records can be verified almost instantly, since I can simply set a TTL of 1 second and they deploy immediately.

But why write this server? It's way simpler than BIND - the entire server is 86 lines of Python code. It doesn't do any parsing or validation of untrusted input: it reads two bytes and dumps a pregenerated response.

Limitations

Uh... basically every limitation you can think of?

There's no way this server is close to standards compliant. It can't handle multiple domains at once, it's not supported by any existing tools...

This kind of idea isn't new though - Joohoi's ACME-DNS is a simple DNS server for handling ACME DNS-01 challenges, and is supported by lego. I haven't looked at the code much, but it's probably what I'd recommend to people looking for a similar setup... there's no way this hack is something I can recommend.

It's very amusing to me that this project works at all, and it's cool how simple the code ends up being. Of course all the ACME work is being done elsewhere, but the fact that we can solve a DNS challenge in such a stupid fashion is both hilarious and satisfying.