I recently had to undertake some DNS performance testing to ascertain the impact of enabling querylogging in BIND. The last time I did this must have been over 15 years ago, and I concluded back then that enabling querylogging had a substantial impact on the maximum throughput a DNS server could achieve. Unfortunately, the paper I wrote has long since been lost, and technology has moved on considerably since then, so I was curious to repeat the testing on more modern equipment. In addition, new features such as dnstap are now available for DNS logging, so I thought it might be useful to test that as well.
Despite wanting to use modern equipment, the only spare physical system I had available was an old HP ProLiant ML110 G6, which has a quad-core Xeon X3430 CPU running at 2.4GHz, 16GB RAM and a SATA disk I/O subsystem (no SAS, I'm afraid).
I installed BIND on a bare-metal system running AlmaLinux 9. I do have access to VMware ESXi, but that system is heavily resource constrained, so I didn't think it fair to use it as the target system (although a physical vs virtual comparison might be interesting – maybe I'll do a future post about that). The version of "named" I am using is 9.18.29; I have disabled recursion and configured a single zone containing 750,000 resource records (which I also used as the source data for the testing tool).
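As a rough sketch, an authoritative-only setup like this can be reproduced with a minimal named.conf along the following lines. The zone name and file paths are illustrative placeholders (the real zone name isn't given here); only the listen address matches the one used in the tests below.

```
options {
    directory "/var/named";
    recursion no;                    // authoritative-only, no recursion
    listen-on { 192.168.0.210; };    // address targeted by the test clients
};

// Single zone holding the 750,000 test resource records
zone "bigzone.example" {
    type primary;                    // BIND 9.18 syntax ("master" also accepted)
    file "bigzone.example.db";
};
```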
The testing tool I used is dnsperf. I installed a copy in an AlmaLinux VM running on the VMware ESXi server, and also installed Cygwin64 with the dnsperf package on my Windows 11 laptop. This allowed me to run multiple copies of dnsperf to max out the DNS server.
To ensure no network equipment was slowing down the traffic, I ran the BIND server on the same IP network as my clients: there was no router to traverse, all queries were local, and everything was directly connected via a 1Gbps switch.
I executed multiple copies of dnsperf in different windows (two on VMware and two or three more in Cygwin windows) until the DNS server showed 100% CPU utilisation, using the following command:
```
dnsperf -s 192.168.0.210 -d bigzone-input.txt -c 10 -l 300 -t 1 -T 4 -S 10
```
Running "htop" confirmed the server was at 100% CPU utilisation.
Using "iftop" I monitored the network throughput, which averaged around 180Mbps (comfortably within the 1Gbps link), so the network was not a constraint.
I ran the tests multiple times: first with no logging enabled whatsoever, then with dnstap enabled, next with dnstap disabled but querylogging enabled, and finally with both dnstap and querylogging enabled. As expected, the maximum throughput got progressively lower as logging was added. The dnstap configuration was very simple, just writing data out to a file.
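For reference, the two logging mechanisms can be toggled in named.conf roughly as follows. The file paths are illustrative (the tests only say dnstap wrote to a file), and dnstap requires BIND to have been built with dnstap support:

```
options {
    // Test cases B and D: capture authoritative query/response messages
    dnstap { auth; };
    dnstap-output file "/var/log/named/dnstap.fstrm";

    // Test cases C and D: classic query logging
    querylog yes;
};

logging {
    channel query_log {
        file "/var/log/named/query.log";
        print-time yes;
    };
    category queries { query_log; };
};
```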
After each test completed, dnsperf produced statistics, including the average queries-per-second (QPS) rate it was able to sustain.
Here are the results:
| Test | Test Description | Aggregated total QPS (5 clients, pass 1) | Aggregated total QPS (5 clients, pass 2) |
|------|------------------|------------------------------------------|------------------------------------------|
| Case A | dnstap off, querylog off | 142018 | 147702 |
| Case B | dnstap on, querylog off | 131677 | 131966 |
| Case C | dnstap off, querylog on | 40372 | 39574 |
| Case D | dnstap on, querylog on | 38955 | 37754 |
To conclude, querylogging has a major impact on throughput: with querylogging enabled, the server sustained only around 27-28% of the QPS it achieved with querylogging disabled.
Dnstap has nowhere near as much impact on performance and should always be considered as an alternative to querylogging. Even writing to disk, which on my system is only a SATA drive, did not hurt performance much: based on the figures above it produced only a 7-11% drop in throughput, which compares very favourably with querylogging.
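Those percentages can be sanity-checked directly from the results table:

```python
# QPS figures from the results table, as (pass 1, pass 2) pairs
baseline = (142018, 147702)   # Case A: no logging at all
dnstap   = (131677, 131966)   # Case B: dnstap only
querylog = (40372, 39574)     # Case C: querylog only

def drop_pct(case, base):
    """Percentage throughput drop relative to the baseline, per pass."""
    return [round(100 * (1 - c / b), 1) for c, b in zip(case, base)]

print("dnstap drop:", drop_pct(dnstap, baseline))      # [7.3, 10.7]
print("querylog drop:", drop_pct(querylog, baseline))  # [71.6, 73.2]
```

So dnstap costs roughly 7-11% of throughput, while querylogging costs over 70%, leaving only just over a quarter of the baseline QPS.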
There are other technologies that can be used with dnstap. One I have come across is "Vector" by Datadog, which can take the raw dnstap output and transform it into whatever data stream you require (e.g. JSON). The data doesn't even need to be written to disk: it can be sent straight to a network socket for onward transmission to an external SIEM such as Splunk. This would remove any disk I/O constraints and should have even less impact on QPS throughput. With this technology it shouldn't be necessary to enable querylogging in BIND at all.
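As an illustration only, such a pipeline might look something like the following vector.toml. The socket path, endpoint and option names are assumptions and should be checked against the Vector documentation for your version (BIND would correspondingly be configured with `dnstap-output unix` rather than a file):

```toml
# Hypothetical Vector pipeline: read dnstap frames from the socket BIND
# writes to, and forward them as JSON over TCP to a SIEM collector.
[sources.bind_dnstap]
type        = "dnstap"
socket_path = "/run/named/dnstap.sock"   # placeholder path

[sinks.siem]
type           = "socket"
inputs         = ["bind_dnstap"]
address        = "siem-collector.example.com:9000"   # placeholder SIEM endpoint
mode           = "tcp"
encoding.codec = "json"
```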


