BIND 9.4.1 performance on FreeBSD 6.2 vs. 7.0


BIND 9.4.1 performance on FreeBSD 6.2 vs. 7.0

by Kris Kennaway

I have been benchmarking BIND 9.4.1 recursive query performance on an
8-core Opteron, using the resperf utility (dns/dnsperf in ports).  The
query data set was taken from www.freebsd.org's httpd-access.log with
some of the highly aggressive robot IP addresses pruned out (to avoid
huge numbers of repeated queries against a small subset of addresses,
which would skew the results).
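
A minimal sketch of how such a query file could be built, assuming the
client addresses have already been extracted from the access log into a
list.  The function names, the drop_top cutoff and the sample addresses
are illustrative assumptions, not the script used for this test.  The
output lines use the "name type" format that dnsperf and resperf read.

```python
from collections import Counter
import ipaddress


def prune_heavy_hitters(ips, drop_top=12):
    """Drop every occurrence of the drop_top most frequent addresses."""
    counts = Counter(ips)
    heavy = {ip for ip, _ in counts.most_common(drop_top)}
    return [ip for ip in ips if ip not in heavy]


def to_ptr_query(ip):
    """Turn an IP address into a 'name type' reverse-lookup query line."""
    return f"{ipaddress.ip_address(ip).reverse_pointer} PTR"


if __name__ == "__main__":
    # Hypothetical sample: one address dominates and gets pruned.
    sample = ["192.0.2.1"] * 5 + ["198.51.100.7", "203.0.113.9", "198.51.100.7"]
    for ip in prune_heavy_hitters(sample, drop_top=1):
        print(to_ptr_query(ip))
```

Pruning by rank rather than by a fixed hit count keeps the remaining
repeats in place, so the distribution stays uneven but is no longer
dominated by a handful of addresses.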

Testing was done over Broadcom gigabit ethernet, with a cable connected
back-to-back between two identical machines.  named was restarted in
between tests to flush the cache.  resperf is designed to slowly
increase the query rate over a period of 60 seconds, up to a maximum
query rate, to determine the point at which the server starts to fall
behind on answering queries.  To more accurately measure this point,
in each case I tuned the maximum query rate so that the server fell
behind after around 50 seconds of load.
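
As a rough worked example of that tuning, and assuming the ramp is
linear (the description above only says the rate increases slowly over
60 seconds), the offered rate at the moment the server falls behind
approximates its sustainable throughput.  The function name and the
numbers below are hypothetical, not measurements from this test.

```python
def offered_rate(max_qps, t, ramp_seconds=60.0):
    """Offered query rate at time t under an assumed linear ramp."""
    if not 0 <= t <= ramp_seconds:
        raise ValueError("t must lie within the ramp period")
    return max_qps * t / ramp_seconds


if __name__ == "__main__":
    # Hypothetical: a 60000 qps ceiling, server falling behind at 50 s.
    print(offered_rate(60000, 50))
```

Placing the knee late in the ramp means the rate changes by only a small
fraction of the peak per second around that point, which narrows the
uncertainty in the estimate.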

7.0 was used with up-to-date CVS sources and the SCHED_SMP (enhanced
SMP) scheduler, which is not yet committed but for which patches have
been posted by Jeff Roberson.  Actually this did not make much
difference compared to ULE on this workload, although I didn't graph
ULE.  BIND 9.4.1 from the base system was used for the threaded
version, and the bind94 port with threads disabled for comparison.
All debugging was disabled.

6.2 was used from CVS with libthr and the 4BSD scheduler (ULE 1.0 is
broken in 6.x).  In addition I also tested a previously posted patch
from rwatson that may be found here:

  www.watson.org/~robert/freebsd/netperf/20070311-sosend_dgram.diff

The results show several interesting things:

  http://obsecurity.dyndns.org/bind-resperf.png

Firstly, 7.0 beats 6.2 across the board, and has about 60% higher peak
performance.  BIND does not scale beyond 4 worker threads, but this
appears to be due to high contention on pthread mutexes in userland,
i.e. a BIND design problem rather than a FreeBSD kernel problem.
There is moderate UDP contention that, if it can be optimized, might
increase peak performance but is not likely to improve scaling.  For
now it appears that BIND 9.4 does not scale to >4 CPUs.

FreeBSD 6.2 seems to have at least two major performance bottlenecks:
file descriptor locking, and poor scaling of the old sx lock
implementation (both have been fixed in 7.0).  I actually don't know
what is using the sx locks so heavily in 6.2; there does not appear to
be an analogue in the 7.0 lock profile.  Other optimizations in 7.0
are probably responsible for a smaller part of the difference.

Robert's patch gives a modest boost to 6.2 at light concurrency but is
swamped by the other scaling problems at high load.  The graph should
not be interpreted as showing that this patch performs worse at high
load; the variance is so enormous that it is easily consistent with
the CVS data.

It would be interesting to test BIND performance when acting as an
authoritative server, which probably has very different performance
characteristics; the difficulty there is getting access to a suitably
interesting and representative zone file and query data.

Kris




Re: BIND 9.4.1 performance on FreeBSD 6.2 vs. 7.0

by NOC Meganet

On Thursday 14 June 2007 05:48:17 Kris Kennaway wrote:
> 6.2 was used from CVS with libthr and the 4BSD scheduler (ULE 1.0 is
> broken in 6.x).

Just curious what is broken, because I use ULE on several servers without
problems.  It seems to me that ULE is even faster on SMP when the load is
not heavy.  Also, I do not get "calcru went backwards" issues with ULE, but
I do sporadically on 4BSD scheduler kernels, especially on dual-core CPUs.



HM

--

Prowip Telecom Ltda
AS 22706







_______________________________________________
freebsd-smp@... mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-smp
To unsubscribe, send any mail to "freebsd-smp-unsubscribe@..."

Re: BIND 9.4.1 performance on FreeBSD 6.2 vs. 7.0

by Kris Kennaway

On Thu, Jun 14, 2007 at 09:36:55AM -0300, NOC Meganet wrote:
> On Thursday 14 June 2007 05:48:17 Kris Kennaway wrote:
> > 6.2 was used from CVS with libthr and the 4BSD scheduler (ULE 1.0 is
> > broken in 6.x).
>
> just curious what is broken because I use ULE on several servers perfectly. it
> seems to me that ULE is even faster on SMP when not having heavy load.
> Also "calcru went backwards" issues I do not get with ULE but sporadically on
> 4BSD scheduler kernels, specially on dualcore cpus.

ULE on 6.x is known to have severe performance problems in some
workloads, as well as bugs that cause it to crash.  Use it at your own
peril :)

Kris

Re: BIND 9.4.1 performance on FreeBSD 6.2 vs. 7.0

by Chuck Swiger-2

Hi, Kris--

This was interesting, thanks for putting together the testing and  
graphs.

On Jun 14, 2007, at 1:48 AM, Kris Kennaway wrote:
> I have been benchmarking BIND 9.4.1 recursive query performance on an
> 8-core opteron, using the resperf utility (dns/dnsperf in ports).  The
> query data set was taken from www.freebsd.org's httpd-access.log with
> some of the highly aggressive robot IP addresses pruned out (to avoid
> huge numbers of repeated queries against a small subset of addresses,
> which would skew the results).

It's at least arguable that doing queries against a data set  
including a bunch of repeats is "skewed" in a more realistic  
fashion. :-)  A quick look at some of the data sources I have handy  
such as http access logs or Squid proxy logs suggests that (for  
example) out of a database of 17+ million requests, there were only  
46000 unique IPs involved.

You might find it interesting to compare doing queries against your  
raw and filtered datasets, just to see what kind of difference you  
get, if any.
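
To put numbers on how skewed a raw or filtered data set is before
benchmarking with it, a summary along these lines might help.  This is
only a sketch: the function name and the top_n cutoff are hypothetical,
and the input is assumed to be a plain list of client addresses.

```python
from collections import Counter


def repeat_summary(ips, top_n=12):
    """Return (total, unique, share of requests from the top_n addresses)."""
    counts = Counter(ips)
    total = sum(counts.values())
    top = sum(n for _, n in counts.most_common(top_n))
    return total, len(counts), (top / total if total else 0.0)


if __name__ == "__main__":
    # Hypothetical sample: 10 requests, 3 unique addresses.
    total, unique, share = repeat_summary(["a"] * 8 + ["b", "c"], top_n=1)
    print(f"{total} requests, {unique} unique, top share {share:.0%}")
```

Running it on both the raw and the pruned list shows how much of the
load the removed addresses accounted for.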

> Testing was done over a broadcom gigabit ethernet cable connected
> back-to-back between two identical machines.  named was restarted in
> between tests to flush the cache.

What was the external network connectivity in terms of speed?  The  
docs suggest you need something like a 16MBs up/8 Mbs down  
connectivity in order to get up to 50K requests/sec....

[ ... ]
> It would be interesting to test BIND performance when acting as an
> authoritative server, which probably has very different performance
> characteristics; the difficulty there is getting access to a suitably
> interesting and representative zone file and query data.

I suppose you could also set up a test nameserver which claims to be  
authoritative for all of in-addr.arpa, and set up a bunch (65K?) /16  
reverse zone files, and then test against real unmodified IPs, but it  
would be easier to do something like this:

Set up a nameserver which is authoritative for 1.10.in-addr.arpa (ie,  
the reverse zone for 10.1/16), and use a zonefile with the $GENERATE  
directive to populate your PTR records:

$TTL    86400
$origin 1.10.in-addr.arpa.

@       IN      SOA     localhost. hostmaster.localhost. (
         1       ; serial (YYYYMMDD##)
         3h      ; Refresh 3 hours
         1h      ; Retry   1 hour
         30d     ; Expire  30 days
         1d )    ; Minimum 24 hours

@       NS      localhost.

$GENERATE 0-255 $.0 PTR ip-10-1-0-$.example.com.
$GENERATE 0-255 $.1 PTR ip-10-1-1-$.example.org.
$GENERATE 0-255 $.2 PTR ip-10-1-2-$.example.net.
; ...etc...

...and then feed it a query database consisting of PTR lookups.  If  
you wanted to, you could take your existing IP database, and glue the  
last two octets of the real IPs onto 10.1 to produce a reasonable  
assortment of IPs to perform a reverse lookup upon.
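
A sketch of how both pieces might be scripted, under stated assumptions:
the zone file above elides all but the first three $GENERATE lines, so
cycling through the com/org/net names for the remaining third octets is
a guess at the intended pattern, and both function names are
hypothetical.

```python
def generate_lines(tlds=("com", "org", "net")):
    """Emit one $GENERATE line per third octet of 10.1.0.0/16."""
    lines = []
    for third in range(256):
        tld = tlds[third % len(tlds)]  # assumed rotation of example TLDs
        lines.append(
            f"$GENERATE 0-255 $.{third} PTR ip-10-1-{third}-$.example.{tld}."
        )
    return lines


def remap_to_test_zone(ip):
    """Glue the last two octets of a real IPv4 address onto 10.1."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError(f"not a dotted-quad IPv4 address: {ip!r}")
    third, fourth = octets[2], octets[3]
    # Reverse-lookup names list the octets in reverse order.
    return f"{fourth}.{third}.1.10.in-addr.arpa PTR"


if __name__ == "__main__":
    print("\n".join(generate_lines()[:3]))
    print(remap_to_test_zone("203.0.113.9"))
```

Because only the last two octets survive, distinct real addresses can
collapse onto the same test address; the relative popularity of
addresses is preserved only approximately.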

--
-Chuck



Re: BIND 9.4.1 performance on FreeBSD 6.2 vs. 7.0

by Kris Kennaway

On Thu, Jun 14, 2007 at 04:53:01PM -0700, Chuck Swiger wrote:

> Hi, Kris--
>
> This was interesting, thanks for putting together the testing and  
> graphs.
>
> On Jun 14, 2007, at 1:48 AM, Kris Kennaway wrote:
> >I have been benchmarking BIND 9.4.1 recursive query performance on an
> >8-core opteron, using the resperf utility (dns/dnsperf in ports).  The
> >query data set was taken from www.freebsd.org's httpd-access.log with
> >some of the highly aggressive robot IP addresses pruned out (to avoid
> >huge numbers of repeated queries against a small subset of addresses,
> >which would skew the results).
>
> It's at least arguable that doing queries against a data set  
> including a bunch of repeats is "skewed" in a more realistic  
> fashion. :-)  A quick look at some of the data sources I have handy  
> such as http access logs or Squid proxy logs suggests that (for  
> example) out of a database of 17+ million requests, there were only  
> 46000 unique IPs involved.

There were still lots of repeats; it is just that some of them were
repeated hundreds of thousands of times - I stripped about a dozen of
those (googlebots, I'm looking at you ;-), leaving a distribution that
was less biased to the top end.

> You might find it interesting to compare doing queries against your  
> raw and filtered datasets, just to see what kind of difference you  
> get, if any.

Cached queries perform much better, as you might expect.  As an
estimate I was getting query rates exceeding 120000 qps when serving
entirely out of cache, and I don't think I reached the upper bound yet.

> >Testing was done over a broadcom gigabit ethernet cable connected
> >back-to-back between two identical machines.  named was restarted in
> >between tests to flush the cache.
>
> What was the external network connectivity in terms of speed?  The  
> docs suggest you need something like a 16MBs up/8 Mbs down  
> connectivity in order to get up to 50K requests/sec....

I wasn't seeing anything close to this, so I guess it depends how much
data is being returned by the queries (I was doing PTR lookups).  I
forget the exact numbers but it wasn't exceeding about 10Mbit in both
directions, which should have been well within link capacity.  Also
the lock profiling data bears out the interpretation that it was BIND
that was becoming saturated and not the hardware.
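
The dependence on packet size can be made concrete with a
back-of-the-envelope calculation.  This is only a sketch: the function
name is hypothetical, the packet sizes are illustrative rather than
measured, and protocol overhead is ignored unless it is included in the
size passed in.

```python
def link_load_mbit(qps, bytes_per_packet):
    """Approximate one-way load in Mbit/s for a packet rate and size."""
    return qps * bytes_per_packet * 8 / 1_000_000


if __name__ == "__main__":
    # Hypothetical sizes: small queries one way, larger responses back.
    for size in (40, 100, 250):
        print(size, link_load_mbit(50000, size))
```

Even at several hundred bytes per packet, tens of thousands of queries
per second stay far below what a gigabit link can carry.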

> [ ... ]
> >It would be interesting to test BIND performance when acting as an
> >authoritative server, which probably has very different performance
> >characteristics; the difficulty there is getting access to a suitably
> >interesting and representative zone file and query data.
>
> I suppose you could also set up a test nameserver which claims to be  
> authoritative for all of in-addr.arpa, and set up a bunch (65K?) /16  
> reverse zone files, and then test against real unmodified IPs, but it  
> would be easier to do something like this:
>
> Set up a nameserver which is authoritative for 1.10.in-addr.arpa (ie,  
> the reverse zone for 10.1/16), and use a zonefile with the $GENERATE  
> directive to populate your PTR records:
>
> $TTL    86400
> $origin 1.10.in-addr.arpa.
>
> @       IN      SOA     localhost. hostmaster.localhost. (
>         1       ; serial (YYYYMMDD##)
>         3h      ; Refresh 3 hours
>         1h      ; Retry   1 hour
>         30d     ; Expire  30 days
>         1d )    ; Minimum 24 hours
>
> @       NS      localhost.
>
> $GENERATE 0-255 $.0 PTR ip-10-1-0-$.example.com.
> $GENERATE 0-255 $.1 PTR ip-10-1-1-$.example.org.
> $GENERATE 0-255 $.2 PTR ip-10-1-2-$.example.net.
> ; ...etc...
>
> ...and then feed it a query database consisting of PTR lookups.  If  
> you wanted to, you could take your existing IP database, and glue the  
> last two octets of the real IPs onto 10.1 to produce a reasonable  
> assortment of IPs to perform a reverse lookup upon.

I could construct something like this but I'd prefer a more
"realistic" workload (i.e. an uneven distribution of queries against
different subsets of the data).  I don't have a good idea what
"realistic" means here, which makes it hard to construct one from
scratch.  Fortunately I have an offer from someone for access to a
real large zone file and a large sample of queries.

Kris



Re: BIND 9.4.1 performance on FreeBSD 6.2 vs. 7.0

by Chuck Swiger-2

On Jun 14, 2007, at 5:03 PM, Kris Kennaway wrote:

>> It's at least arguable that doing queries against a data set
>> including a bunch of repeats is "skewed" in a more realistic
>> fashion. :-)  A quick look at some of the data sources I have handy
>> such as http access logs or Squid proxy logs suggests that (for
>> example) out of a database of 17+ million requests, there were only
>> 46000 unique IPs involved.
>
> There were still lots of repeats, just some of them were repeated
> hundreds of thousands of times - I stripped about a dozen of those
> (googlebots, I'm looking at you ;-), leaving a distribution that was
> less biased to the top end.

Heh, yes, it's surprising how happy a webspider is to crawl around a  
heavily-interlinked site.  :-)

Perhaps someone ought to add a:

   Crawl-delay: 600

...statement to http://www.freebsd.org/robots.txt...?

>> You might find it interesting to compare doing queries against your
>> raw and filtered datasets, just to see what kind of difference you
>> get, if any.
>
> Cached queries perform much better, as you might expect.  As an
> estimate I was getting query rates exceeding 120000 qps when serving
> entirely out of cache, and I don't think I reached the upper bound yet.

Sure, anything cached or anything the nameserver is authoritative for  
is going to be directly answerable without having to do an external  
recursive query.

>> What was the external network connectivity in terms of speed?  The
>> docs suggest you need something like a 16MBs up/8 Mbs down
>> connectivity in order to get up to 50K requests/sec....
>
> I wasn't seeing anything close to this, so I guess it depends how much
> data is being returned by the queries (I was doing PTR lookups).  I
> forget the exact numbers but it wasn't exceeding about 10Mbit in both
> directions, which should have been well within link capacity.  Also
> the lock profiling data bears out the interpretation that it was BIND
> that was becoming saturated and not the hardware.

OK, thanks for the info.  Maybe I'll get a chance to run some numbers  
of my own testing, if I can free up some time from WWDC....

>> [ ... ]
>>> It would be interesting to test BIND performance when acting as an
>>> authoritative server, which probably has very different performance
>>> characteristics; the difficulty there is getting access to a  
>>> suitably
>>> interesting and representative zone file and query data.
>>
>> I suppose you could also set up a test nameserver which claims to be
>> authoritative for all of in-addr.arpa, and set up a bunch (65K?) /16
>> reverse zone files, and then test against real unmodified IPs, but it
>> would be easier to do something like this:
>>
>> Set up a nameserver which is authoritative for 1.10.in-addr.arpa (ie,
>> the reverse zone for 10.1/16), and use a zonefile with the $GENERATE
>> directive to populate your PTR records:
>>
>> [ ...zonefile snipped for brevity... ]
>>
>> ...and then feed it a query database consisting of PTR lookups.  If
>> you wanted to, you could take your existing IP database, and glue the
>> last two octets of the real IPs onto 10.1 to produce a reasonable
>> assortment of IPs to perform a reverse lookup upon.
>
> I could construct something like this but I'd prefer a more
> "realistic" workload (i.e. an uneven distribution of queries against
> different subsets of the data).  I don't have a good idea what
> "realistic" means here, which makes it hard to construct one from
> scratch.  Fortunately I have an offer from someone for access to a
> real large zone file and a large sample of queries.

Ah, very good, then.

While I expect there to be quite a difference between recursive  
queries vs. authoritative/locally answerable queries (after all, that  
seems to be why both dnsperf and resperf were created as distinct  
programs), I'm not convinced that there is too much difference  
between doing reverse lookups for one set of IPs versus another if  
those IPs are all in zones the server is authoritative for.

--
-Chuck


