RELENG_7 heavy disk = system crawls


by grarpamp

> Are you *sure* that the cpu is your bottleneck?

Well, I've got a gig free in /usr/local which is ufs2+softdeps. So
I just dd if=/dev/zero bs=1m of=zero there. Disk was 100% busy, cpu
was 10-15% system, 10% user, about 16MiB/sec.

Of course since that is my system spindle, and it was busied out
by dd, I had to wait 10 sec or so for vi to load to write this note
during it :) But the rest of the interface was responsive. Though
I've never run RELENG_4 on this box, I'd venture it would feel
similar in this case.

I can dd if=/dev/ad[n].eli of=/dev/null bs=1m and use 75% system
all in geli, 27% disk busy, 20MiB/sec. Interface was slower but
reasonable.

Now do a dd if=file of=/dev/null bs=1m where file is on zfs on the
geli spindle above and the system melts down. Half cpu in geli,
half spread over about 8 spa_zio's, 4MiB/sec, all system time, disk
100% busy.
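
To put the two read tests on the same footing, one rough measure is MiB
moved per second of CPU time. This is only a sketch: it assumes the cpu
percentages above are fractions of a single core, and that "all system
time" in the zfs case means 100% system. mib_per_cpu_sec is a hypothetical
helper name, not an existing tool.

```shell
#!/bin/sh
# Rough efficiency figure: MiB transferred per second of CPU time.
# Arguments: throughput in MiB/s, CPU use in percent of one core.
# mib_per_cpu_sec is a hypothetical helper, not a FreeBSD tool.
mib_per_cpu_sec() {
    awk -v mibs="$1" -v cpu_pct="$2" \
        'BEGIN { printf "%.1f\n", mibs / (cpu_pct / 100) }'
}

mib_per_cpu_sec 20 75     # raw geli read: 20 MiB/s at 75% system
mib_per_cpu_sec 4 100     # zfs on geli: 4 MiB/s, assumed 100% system
```

Under those assumptions the raw geli read works out to about 26.7 MiB per
CPU-second and the zfs-on-geli read to about 4.0, a drop of more than
sixfold.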

I'm not sure yet how to isolate cpu from i/o under my geli+zfs
setup. I think they're mated together.

Curiously, I've got about 104 spa_zio procs all in tq->tq DL state.
Only about 20 or so have more than a little system time on them.
But about 9 zfs mounts so that may be ok, don't know.

Don't get me wrong, FreeBSD has been my primary os since RELENG_2_2.
And it's been great, still is. I recommend and use it all the time.

It's just that this workload has really put the screws to things
and I don't see a way out. I'd like to deploy geli+zfs everywhere
but if I can't login remotely because some user has it busied out
on something I've no knobs to control, umm, yeah :)

I'm curious to see what others running geli_aes128+zfs_sha256 are
seeing in this regard. And I'd love to see what a fast dual or more
core amd64 system would be like under this workload.

As to your i/o thing, I think back in RELENG_4 that if all the
spindles were on the same pata controller/interrupt, monopolistic
loads could occur. Seemed to be a hardware thing, not BSD. IE: At
the moment I've got a half dozen spindles and filesystems spread
out under RELENG_4 all happily doing find | xargs sha1 at once, no
problems. That hardware is set for update to RELENG_7 or RELENG_8
in a few weeks.

> we need a way to nice i/o up/down

That would be handy for sure. User spindles, system spindles,
storage, net, keyboard, etc.


# systime spread
   11 root        1 171 ki31     0K     8K RUN     46.5H 88.18% idle: cpu0
 3215 root        1  -8    -     0K     8K geli:w 737:30  1.76% g_eli[0] ad6
  607 root        1  -8    -     0K     8K geli:w 158:12  0.00% g_eli[0] ad4
 3235 root        1 -16    -     0K    24K tq->tq  69:41  0.00% spa_zio
 3229 root        1 -16    -     0K    24K tq->tq  69:40  0.00% spa_zio
 3228 root        1 -16    -     0K    24K tq->tq  69:39  0.00% spa_zio
 3234 root        1 -16    -     0K    24K tq->tq  69:39  0.00% spa_zio
 3233 root        1 -16    -     0K    24K tq->tq  69:39  0.00% spa_zio
 3232 root        1 -16    -     0K    24K tq->tq  69:39  0.00% spa_zio
 3231 root        1 -16    -     0K    24K tq->tq  69:38  0.00% spa_zio
 3230 root        1 -16    -     0K    24K tq->tq  69:37  0.00% spa_zio
 1135 user        1  44    0   169M   152M select  56:02  0.00% XFree86
  954 root        1 -16    -     0K    24K tq->tq  17:10  0.00% spa_zio
  958 root        1 -16    -     0K    24K tq->tq  17:10  0.00% spa_zio
  956 root        1 -16    -     0K    24K tq->tq  17:09  0.00% spa_zio
  953 root        1 -16    -     0K    24K tq->tq  17:09  0.00% spa_zio
  957 root        1 -16    -     0K    24K tq->tq  17:09  0.00% spa_zio
  952 root        1 -16    -     0K    24K tq->tq  17:09  0.00% spa_zio
  951 root        1 -16    -     0K    24K tq->tq  17:09  0.00% spa_zio
  955 root        1 -16    -     0K    24K tq->tq  17:09  0.00% spa_zio
  613 root        1  -8    -     0K     8K geli:w  16:12  0.00% g_eli[0] ad7
    3 root        1  -8    -     0K     8K -       15:05  0.00% g_up
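
The listing above is easier to compare once the accumulated times are
totalled per thread name. A minimal sketch, assuming the column layout
shown (TIME in field 9 as mm:ss, command name in field 11);
sum_time_by_cmd is a hypothetical helper, and lines whose TIME is not in
mm:ss form (such as the idle line, shown in hours) are skipped.

```shell
#!/bin/sh
# Sum accumulated CPU time, in minutes, per thread name.
# Reads top-style lines on stdin; prints "name count minutes", sorted by name.
# Assumes TIME is field 9 (mm:ss) and the command name is field 11.
sum_time_by_cmd() {
    awk '
    $9 ~ /^[0-9]+:[0-9]+$/ {
        split($9, t, ":")
        mins = t[1] + t[2] / 60
        name = $11
        sub(/\[.*$/, "", name)      # g_eli[0] -> g_eli
        total[name] += mins
        count[name]++
    }
    END {
        for (n in total)
            printf "%s %d %.1f\n", n, count[n], total[n]
    }' | sort
}

# Example with three lines copied from the listing above.
sum_time_by_cmd <<'EOF'
 3215 root        1  -8    -     0K     8K geli:w 737:30  1.76% g_eli[0] ad6
  607 root        1  -8    -     0K     8K geli:w 158:12  0.00% g_eli[0] ad4
 3235 root        1 -16    -     0K    24K tq->tq  69:41  0.00% spa_zio
EOF
```

Fed the whole listing, this comes to roughly 912 minutes across the three
g_eli threads and roughly 694 minutes across the 16 spa_zio threads shown.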
_______________________________________________
freebsd-performance@... mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-performance
To unsubscribe, send any mail to "freebsd-performance-unsubscribe@..."

Re: RELENG_7 heavy disk = system crawls

by dieter-7

> I can dd if=/dev/ad[n].eli of=/dev/null bs=1m and use 75% system
> all in geli, 27% disk busy, 20MiB/sec. Interface was slower but
> reasonable.

I think I understand now.  You're doing encryption in the kernel,
which eats a lot of cpu, and nice only affects userland.  So yeah
cpu is a significant part of your problem.

> I'm not sure yet how to isolate cpu from i/o under my geli+zfs
> setup. I think they're mated together.

Agreed.

> It's just that this workload has really put the screws to things
> and I don't see a way out. I'd like to deploy geli+zfs everywhere
> but if I can't login remotely because some user has it busied out
> on something I've no knobs to control, umm, yeah :)

Do you *need* geli+zfs ?  If so, you could see if there are any
hardware crypto accelerators with FreeBSD support, or throw
lots of cpu (e.g. phenom2 x4) at it.

> As to your i/o thing, I think back in RELENG_4 that if all the
> spindles were on the same pata controller/interrupt, monopolistic
> loads could occur.

atapci0: <nVidia nForce CK804 UDMA133 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xe000-0xe00f at device 6.0 on pci0
atapci1: <nVidia nForce CK804 SATA300 controller> port 0x9f0-0x9f7,0xbf0-0xbf3,0x970-0x977,0xb70-0xb73,0xcc00-0xcc0f mem 0xfebfb000-0xfebfbfff irq 21 at device 7.0 on pci0
atapci2: <nVidia nForce CK804 SATA300 controller> port 0x9e0-0x9e7,0xbe0-0xbe3,0x960-0x967,0xb60-0xb63,0xb800-0xb80f mem 0xfebfa000-0xfebfafff irq 22 at device 8.0 on pci0
atapci3: <JMicron JMB363 SATA300 controller> port 0x8c00-0x8c07,0x8800-0x8803,0x8400-0x8407,0x8000-0x8003,0x7c00-0x7c0f mem 0xfe9fe000-0xfe9fffff irq 17 at device 0.0 on pci3
atapci4: <SiI SiI 3132 SATA300 controller> port 0x6c00-0x6c7f mem 0xfe6ff000-0xfe6ff07f,0xfe6f8000-0xfe6fbfff irq 16 at device 0.0 on pci4
atapci5: <JMicron JMB363 SATA300 controller> port 0x4c00-0x4c07,0x4800-0x4803,0x4400-0x4407,0x4000-0x4003,0x3c00-0x3c0f mem 0xfe3fe000-0xfe3fffff irq 18 at device 0.0 on pci6

The nForce pata controller doesn't list an irq, seems odd?
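
A quick way to see which controllers share or lack an irq is to pull
the device name and irq out of probe lines like the ones above. This is a
sketch only: irq_by_controller is a hypothetical helper, and it assumes
the "atapciN: ... irq N ..." line format shown here.

```shell
#!/bin/sh
# Print "device irq" for each atapci probe line read on stdin.
# Prints "none" when the line carries no irq field.
irq_by_controller() {
    awk '/^atapci[0-9]+:/ {
        dev = $1
        sub(/:$/, "", dev)          # atapci0: -> atapci0
        irq = "none"
        for (i = 1; i < NF; i++)
            if ($i == "irq") irq = $(i + 1)
        print dev, irq
    }'
}

# Example with the first two probe lines from above.
irq_by_controller <<'EOF'
atapci0: <nVidia nForce CK804 UDMA133 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xe000-0xe00f at device 6.0 on pci0
atapci1: <nVidia nForce CK804 SATA300 controller> port 0x9f0-0x9f7,0xbf0-0xbf3,0x970-0x977,0xb70-0xb73,0xcc00-0xcc0f mem 0xfebfb000-0xfebfbfff irq 21 at device 7.0 on pci0
EOF
```

On a live system the same function could be fed from dmesg, e.g.
dmesg | irq_by_controller.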