
DNS Dynamic Update Performance Study

2014.10.10

The Purpose

• Dynamic update and XFR are the key mechanisms for zone data replication and synchronization; studying their performance limits is meaningful for estimating the efficiency of the whole DNS system

• Provide operational guidance to DNS operators.

• Provide improvements to the DNS standards and software implementations

Test Method

• Data flow: primary master -> master -> slave

• Generate the root zone file, and record the initial SOA serial number s0.

• Record the current time t0 and start sending n update requests to the primary master without waiting for ACKs from the server. Each request adds one new TLD consisting of one NS record and one related glue record.

• At the same time, without waiting for the sending to finish, keep querying all three servers; record the final time t1 at which the SOA serial of each server reaches s0 + n.

• For each server, the UPS (updates per second) is n / (t1 – t0)
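The timing calculation in the steps above can be sketched as a small helper, where n, t0, and t1 are the quantities defined in the method:

```python
def updates_per_second(n: int, t0: float, t1: float) -> float:
    """UPS for one server: n updates observed over the interval [t0, t1]."""
    return n / (t1 - t0)

# e.g. 10,000 updates absorbed in 20 seconds
print(updates_per_second(10_000, 0.0, 20.0))  # -> 500.0
```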

Factors That May Affect Performance

• Zone size
• Query pressure on the slave node
• DNSSEC (not only affects the zone size, but also complicates the update process)
• Hard drive write performance

Test Environment

• Network topology
• Hardware configuration
• OS / DNS software

Network Topology

Hardware Configuration

Controller:
OS: CentOS 6.4 x86_64
CPU: Intel Xeon E5-2403 v2 @ 1.80 GHz
Memory: 2 GB DDR3-1333
Hard drive: ST500DM002-1BD142, 7200 RPM, 16 MB cache

Primary master / master / slave:
OS: CentOS 6.4 x86_64 / FreeBSD 10.0 x86_64
CPU: Intel Xeon E3-1220 v2 @ 3.1 GHz, 4 cores / 4 threads
Memory: 32 GB DDR3-1333 ECC
Hard drive: ST500DM002-1BD142, 7200 RPM, 16 MB cache

DNS Software

• Primary master
– BIND (9.9.5)

• Master
– BIND (9.9.5)

• Slave
– BIND (9.9.5)
– NSD (3.2.18)
– Knot (1.5.1)

UPS vs TLD Count (without DNSSEC)

UPS vs TLD Count (with DNSSEC)

UPS vs QPS on Slave Node

Performance Analysis

• For the primary master, the update procedure is:
– Generate the difference (update validation)
– Apply the diff to the in-memory DB
– Write to the journal file
– Mark the zone as dirty and later synchronize the in-memory data with the zone file
– Notify other name servers

• The bottleneck is hard drive writes
– To make all modifications persistent, BIND makes sure the journal file is written to disk, using fsync
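The fsync bottleneck can be illustrated with a minimal sketch (not BIND's actual code): it measures how many small append-plus-fsync operations the disk sustains per second, which bounds the achievable UPS when every update must be durably journaled.

```python
import os
import tempfile
import time

def fsync_ops_per_second(n: int = 100) -> float:
    """Append a small journal-like record and fsync after every write."""
    fd, path = tempfile.mkstemp()
    try:
        t0 = time.time()
        for i in range(n):
            os.write(fd, b"add tld%d. NS ns.tld%d.\n" % (i, i))
            os.fsync(fd)  # force the record to stable storage before continuing
        return n / (time.time() - t0)
    finally:
        os.close(fd)
        os.unlink(path)
```

On a 7200 RPM disk each fsync costs on the order of milliseconds, so this rate, not CPU, caps the update throughput.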

Is It Better with an SSD?

Hardware Configuration

Primary master (Mac Pro):
OS: OS X 10.9.5
CPU: 2.4 GHz Intel Core i5
Memory: 8 GB 1600 MHz DDR3
Hard drive: APPLE SSD SD0256F Media

Slave (MacBook Air):
OS: OS X 10.9.5
CPU: 2.7 GHz Intel Core i5
Memory: 4 GB 1600 MHz DDR3
Hard drive: APPLE SSD SD0256F Media

UPS vs TLD Count (without DNSSEC)

UPS vs TLD Count (with DNSSEC)

UPS vs QPS (UDP/DO)

Persistent DB vs Memory DB

• Like the root server system, most distributed DNS systems store RRs in a relational DB, using the DNS server to provide query and zone-synchronization service.

• Modify BIND so that it neither generates the journal file nor synchronizes the zone file with the in-memory DB, to improve performance.

• The following test results are based on the first test environment, with the modified BIND running on the primary master.
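A toy sketch of the contrast being tested, with a dict standing in for the in-memory zone DB (names and record contents are illustrative, not BIND internals):

```python
import os
import tempfile
import time

def apply_updates(n: int, persistent: bool) -> float:
    """Return UPS for n toy updates, with or without a durable journal."""
    zone = {}
    fd, path = tempfile.mkstemp()
    try:
        t0 = time.time()
        for i in range(n):
            zone["tld%d." % i] = ("NS", "ns.tld%d." % i)  # in-memory apply
            if persistent:
                os.write(fd, b"journal entry\n")
                os.fsync(fd)  # durability costs one disk sync per update
        elapsed = max(time.time() - t0, 1e-9)
        return n / elapsed
    finally:
        os.close(fd)
        os.unlink(path)
```

Running both variants shows the memory-only path sustaining a far higher UPS, which is the effect the modified BIND is meant to exploit.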

UPS vs TLD Count (without DNSSEC)

UPS vs TLD Count (with DNSSEC)

UPS vs QPS (UDP/DO)

Conclusion

• Updates to a single zone are applied sequentially, so multiple cores won't help.

• Without the persistence guarantee, dynamic update itself is quite efficient

• DNSSEC decreases performance by about 50%

• At each hierarchy level, performance drops by 20–30%

• If memory is sufficient, zone size has little impact on update performance.

• UDP query pressure also has little impact, mainly because computation and file-descriptor resources are sufficient.

• For a slave node under update pressure, if Knot receives an IXFR spanning more than 1024 serial-number changes, it falls back to AXFR, which costs more transfer time and zone-file synchronization time. This is why it is slower than NSD at some points, and the bigger the zone, the slower the transfer.
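The fallback behavior described above can be modeled as a small decision function; the 1024-change threshold is taken from these observations of Knot 1.5, and the function name is illustrative:

```python
def choose_transfer(slave_serial: int, master_serial: int,
                    threshold: int = 1024) -> str:
    """Pick IXFR when the serial gap fits the threshold, else full AXFR."""
    gap = (master_serial - slave_serial) % 2**32  # RFC 1982 serial space
    return "IXFR" if gap <= threshold else "AXFR"

print(choose_transfer(2014101000, 2014101500))  # gap 500  -> IXFR
print(choose_transfer(2014101000, 2014110000))  # gap 9000 -> AXFR
```

Since an AXFR retransfers the entire zone, the cost of crossing the threshold grows with zone size, matching the observed slowdown.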

What’s next

• The effect of hierarchy depth has been tested; hierarchy width is another important performance factor. With more resources, that test will be performed in the near future.

• The testing was done on a LAN; when transferring across a WAN, the behavior should differ.

Q & A