ARC size constantly shrinks, then ZFS slows down extremely


ARC size constantly shrinks, then ZFS slows down extremely

by Attila Nagy

Hello,

I'm using FreeBSD 8 (previously 7) on a machine with a lot of disks and
32 GB RAM. With 7.x it ran very well for about 50 days, but then suddenly
every operation slowed down.
gstat showed that the disks were working a lot harder than usual, and the
zpool/zfs was pretty much unusable.

I then rebooted the machine with FreeBSD 8 in the hope that the new ZFS
fixes would correct this issue (50 days have not passed since then, so I
don't know yet) and started to monitor ZFS's statistics.

It seems that after a reboot, the ARC size starts to grow, then
something flips the switch and it changes to shrinking, instead of
maintaining the size.
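
A grow-then-shrink pattern like this can be flagged automatically from a series of ARC size samples. The following is only a minimal sketch: the function name, the 20% threshold, and the sample values are made up for illustration, and the samples are assumed to come from periodic readings of kstat.zfs.misc.arcstats.size (the sysctl mentioned later in this thread).

```python
def find_shrink_onset(samples, drop_fraction=0.2):
    """Return the index of the first sample that has fallen more than
    drop_fraction below the running peak, or None if that never happens."""
    peak = 0
    for i, size in enumerate(samples):
        peak = max(peak, size)
        if peak > 0 and size < peak * (1 - drop_fraction):
            return i
    return None


# Hypothetical hourly ARC sizes in GiB: growth, a short plateau, then shrinking.
arc_gib = [1.0, 4.0, 9.0, 14.0, 14.2, 13.5, 12.0, 10.5, 8.0]
onset = find_shrink_onset(arc_gib)
print(onset)  # 7: the first sample more than 20% below the 14.2 peak
```

Tracking the running peak rather than a fixed limit keeps the check independent of whatever arc_max happens to be on the machine.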

Please see the pictures here:
http://people.fsn.hu/~bra/freebsd/20090929-zfs-arcsize/

Before the 27th, the machine ran FreeBSD 7, after that date it runs 8.

As you can see, no user process took the memory, so I don't know why
the ARC size grows first and then starts to decrease.

Could it be that the ARC shrinks by such a large amount that it
effectively disappears, and this causes the IO activity to go up and kill
the machine?

Thanks,
_______________________________________________
freebsd-fs@... mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "freebsd-fs-unsubscribe@..."

Re: ARC size constantly shrinks, then ZFS slows down extremely

by Attila Nagy

On 09/29/09 12:45, Attila Nagy wrote:

> I'm using FreeBSD 8 (previously 7) on a machine with a lot of disks
> and 32 GB RAM. With 7.x it ran very well for about 50 days, but
> suddenly every operation have slowed down.
> gstat showed that the disks are working a lot more than usual the
> zpool/zfs was pretty unusable.
>
> I've rebooted the machine then with FreeBSD 8 in the hope the new ZFS
> fixes will correct this issue (no 50 days have passed since then, so I
> don't know yet) and started to monitor ZFS's statistics.
>
> It seems that after a reboot, the ARC size starts to grow, then
> something flips the switch and it changes to shrinking, instead of
> maintaining the size.
>
> Please see the pictures here:
> http://people.fsn.hu/~bra/freebsd/20090929-zfs-arcsize/
>
> Before the 27th, the machine ran FreeBSD 7, after that date it runs 8.
>
> As you can see, no user process tooks the memory, so I don't know why
> the ARC size grows first and then start to decrease.
>
> Could it be that the ARC size decreases such a big amount that it
> effectively disappears and this causes the IO activity go up and kill
> the machine?
I've upgraded another machine from an older 8-CURRENT to 8-STABLE. It
has little memory (1GB) and it's i386.
The above symptoms can be triggered very easily: if I do an IMAP search
on a lot of mailboxes (which I do regularly), it takes about 10 minutes
for the IMAP server to become completely inaccessible.
The machine itself runs fine, but every operation on the ZFS pool takes ages.
According to gstat there is only very minimal disk activity. The
machine can't even be rebooted, at least not within ten minutes (reboot,
wait 10 minutes, nearly nothing happens; reboot -qn makes the machine
disappear from the net, but it doesn't restart).

Backing out this change from the 8-STABLE kernel:
http://svn.freebsd.org/viewvc/base/head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c?r1=191901&r2=191902

makes it survive about half an hour of IMAP searching. Of course only
time will tell whether this helps in the long run, but so far 10/10
tries succeeded in killing the machine with this method...

According to this, I would say that this change makes things worse both
on low-memory i386 (1G RAM) and on "there's plenty of RAM" (32G) amd64
servers.

Re: ARC size constantly shrinks, then ZFS slows down extremely

by Pawel Jakub Dawidek

On Fri, Oct 02, 2009 at 09:59:03AM +0200, Attila Nagy wrote:
> Backing out this change from the 8-STABLE kernel:
> http://svn.freebsd.org/viewvc/base/head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c?r1=191901&r2=191902
>
> makes it survive about half and hour of IMAP searching. Of course only
> time will tell whether this helps in the long run, but so far 10/10
> tries succeeded to kill the machine with this method...

Could you try this patch:

        http://people.freebsd.org/~pjd/patches/arc.c.4.patch

--
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@...                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!



Re: ARC size constantly shrinks, then ZFS slows down extremely

by Artem Belevich

With the patch, if vfs.zfs.arc_min is set high enough, the system locks up.

On a box with 8G of RAM I had arc_min=6G and arc_max=7G. Once ARC grew
to ~5.8G as reported by kstat.zfs.misc.arcstats.size, the number of wired
pages grew to ~7400MB and the processes got stuck in 'vmwait' state. I
had to reboot in order to recover.

On one hand setting arc_min can be considered a pilot error. On the
other, it may be a good idea to allow system to reclaim memory from
ARC even if ARC is smaller than arc_min if the system really really
needs it. The question is how to define "really needs it".

On a side note, it appears that wired page count tends to be
substantially larger than ARC size. I.e. in my case if ARC size grows
to 6G, wired page count is about 1.5G bigger. Perhaps we should allow
reclaiming memory
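
The arithmetic behind this report can be sketched as follows. This is only a rough back-of-the-envelope bound using the figures from this message (8G RAM, arc_min=6G, wired pages about 1.5G above ARC size); the helper name is hypothetical, it treats all 8G as usable, and it is not how the kernel accounts for memory.

```python
GIB_MB = 1024  # megabytes per GiB


def free_floor_mb(ram_mb, arc_min_mb, wired_overhead_mb):
    """Memory left once ARC sits at arc_min and the extra wired overhead
    is added on top. A rough bound only, not kernel accounting."""
    return ram_mb - (arc_min_mb + wired_overhead_mb)


ram = 8 * GIB_MB
arc_min = 6 * GIB_MB
overhead = int(1.5 * GIB_MB)  # "about 1.5G bigger" than the ARC size

print(free_floor_mb(ram, arc_min, overhead))  # 512
```

Under these assumptions only about 512M remains for everything else if ARC refuses to go below arc_min, which would fit processes ending up in 'vmwait'.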

--Artem



On Fri, Oct 2, 2009 at 11:45 AM, Pawel Jakub Dawidek <pjd@...> wrote:

> On Fri, Oct 02, 2009 at 09:59:03AM +0200, Attila Nagy wrote:
>> Backing out this change from the 8-STABLE kernel:
>> http://svn.freebsd.org/viewvc/base/head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c?r1=191901&r2=191902
>>
>> makes it survive about half and hour of IMAP searching. Of course only
>> time will tell whether this helps in the long run, but so far 10/10
>> tries succeeded to kill the machine with this method...
>
> Could you try this patch:
>
>        http://people.freebsd.org/~pjd/patches/arc.c.4.patch
>
> --
> Pawel Jakub Dawidek                       http://www.wheel.pl
> pjd@...                           http://www.FreeBSD.org
> FreeBSD committer                         Am I Evil? Yes, I Am!
>

Re: ARC size constantly shrinks, then ZFS slows down extremely

by Pawel Jakub Dawidek

On Fri, Oct 02, 2009 at 04:38:24PM -0700, Artem Belevich wrote:

> With the patch, if vfs.zfs.arc_min is set high enough, the system locks up.
>
> On a box with 8G or RAM I had arc_min=6G and arc_max=7G. Once ARC grew
> to ~5.8G as reported by kstat.zfs.misc.arcstats.size, number of wired
> pages grew to ~7400MB and the processes got stuck in 'vmwait' state. I
> had to reboot in order to recover.
>
> On one hand setting arc_min can be considered a pilot error. On the
> other, it may be a good idea to allow system to reclaim memory from
> ARC even if ARC is smaller than arc_min if the system really really
> needs it. The question is how to define "really needs it".
>
> On a side note, it appears that wired page count tends to be
> substantially larger than ARC size. I.e. in my case if ARC size grows
> to 6G, wired page count is about 1.5G bigger. Perhaps we should allow
> reclaiming memory
Before we start debugging pathological cases, could you try the patch
with default settings? Possibly also with vm.kmem_size set to the amount of
RAM you have.

--
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@...                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!



Re: ARC size constantly shrinks, then ZFS slows down extremely

by Artem Belevich

> Before we start debuging pathological cases, could you try the patch
> with defaul settings? Eventually with vm.kmem_size set to the amount of
> RAM you have.

System runs stable/8 r197716 on 4-core amd64 with 8G of RAM.

With the default /boot/loader.conf, the kernel comes up with the following parameters:

vm.kmem_size: 2764533760
vfs.zfs.arc_min: 215979200
vfs.zfs.arc_max: 1727833600

Under load ARC size reaches ~1.7G. At that time top reports:
Mem: 47M Active, 11M Inact, 2158M Wired, 268K Cache, 21M Buf, 5693M Free

However, as the FS load continues, ARC size stays at 1.7G for a couple
of minutes, then shrinks down to 1.2G, then slowly grows to 1.7G,
stays there for a little while, and then the shrink/grow cycle repeats.
Throughout the test there's always ~5G of *free* memory.

===============================================================
Now, the same experiment, with vm.kmem_size=8G
vm.kmem_size: 8589934592
vfs.zfs.arc_min: 939524096
vfs.zfs.arc_max: 7516192768
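
The two sets of values reported here happen to fit a simple rule. The sketch below is an inference from these two data points only, not a statement about what arc.c actually does, and the function name is hypothetical: arc_max looks like the larger of 5/8 of vm.kmem_size and vm.kmem_size minus 1 GiB, and arc_min looks like arc_max / 8.

```python
GIB = 1 << 30


def guessed_arc_limits(kmem_size):
    """Guess (arc_min, arc_max) in bytes from vm.kmem_size.
    The rule is inferred from the two data points in this post only."""
    arc_max = max(kmem_size * 5 // 8, kmem_size - GIB)
    arc_min = arc_max // 8
    return arc_min, arc_max


# vm.kmem_size -> (vfs.zfs.arc_min, vfs.zfs.arc_max) as reported above.
reported = {
    2764533760: (215979200, 1727833600),  # default loader.conf
    8589934592: (939524096, 7516192768),  # vm.kmem_size=8G
}

for kmem, limits in reported.items():
    print(kmem, guessed_arc_limits(kmem) == limits)
```

Both lines print True, so with vm.kmem_size=8G the default arc_max is 7G, i.e. all but 1G of the kernel memory.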

ARC grows to 6.2G:
Mem: 47M Active, 13M Inact, 7376M Wired, 31M Buf, 473M Free

Then it quickly shrinks to 4.6G and grows to 6.2G again, shrinks again, etc..

What's different from the previous case is that after a while ZFS
adjusts the target size (kstat.zfs.misc.arcstats.c) down to ~5.8G and
after that ARC size oscillates between 4.2G and 5.6G. Another
observation -- ARC shrinking happens when the system is left with ~512M of
free memory. Yet another observation is that even with an ARC peak of
~5.8G, the system has about 7.5G wired. Where did almost 2G of difference
go? Fragmentation?

I've tried both experiments with and without L2ARC -- behavior seems
to be the same.
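
The gap between wired memory and ARC size can be computed directly from the figures quoted above. This is a minimal sketch: the parser name is hypothetical, it only handles the plain "NNNu Label" fields seen in the top output in this thread, and the 6.2G ARC peak is taken from the text rather than read from a sysctl.

```python
import re


def parse_top_mem(line):
    """Parse a FreeBSD top 'Mem:' line into {label: megabytes}."""
    units = {"K": 1 / 1024, "M": 1, "G": 1024}
    fields = re.findall(r"(\d+)([KMG]) (\w+)", line)
    return {label: int(num) * units[unit] for num, unit, label in fields}


mem = parse_top_mem("Mem: 47M Active, 13M Inact, 7376M Wired, 31M Buf, 473M Free")
arc_peak_mb = 6.2 * 1024  # "ARC grows to 6.2G"
gap_mb = mem["Wired"] - arc_peak_mb

print(round(gap_mb))  # 1027
```

So at the 6.2G peak roughly 1G of wired memory is not accounted for by the ARC size; against the later ~5.8G peak and about 7.5G wired, the gap is closer to the "almost 2G" mentioned above.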

--Artem

Re: ARC size constantly shrinks, then ZFS slows down extremely

by Attila Nagy

On 10/02/09 20:45, Pawel Jakub Dawidek wrote:

> On Fri, Oct 02, 2009 at 09:59:03AM +0200, Attila Nagy wrote:
>  
>> Backing out this change from the 8-STABLE kernel:
>> http://svn.freebsd.org/viewvc/base/head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c?r1=191901&r2=191902
>>
>> makes it survive about half and hour of IMAP searching. Of course only
>> time will tell whether this helps in the long run, but so far 10/10
>> tries succeeded to kill the machine with this method...
>>    
>
> Could you try this patch:
>
> http://people.freebsd.org/~pjd/patches/arc.c.4.patch
>  
Sure. But before that, a report with the above modification: the machine
has survived some days, then started to behave strangely. Meaning I
could ping it, I could log in to the IMAP service (running from ZFS),
read some mails, but not all.
I could not access it via ssh (which runs from UFS), but an already
running top from a different session was alive. It showed:
last pid: 11272;  load averages:  0.00,  0.00,  0.00    up 3+15:21:13  09:11:43
149 processes: 1 running, 143 sleeping, 1 zombie, 4 waiting
CPU:  0.0% user,  0.0% nice,  0.2% system,  0.0% interrupt, 99.8% idle
Mem: 234M Active, 197M Inact, 559M Wired, 111M Buf, 440K Free
Swap: 4096M Total, 976K Used, 4095M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
78492 root        1  44    0  4700K  2156K CPU1    1   5:37  0.00% top
92343 root        1  44    0  4132K  1576K nanslp  1   4:12  0.00% gstat
13401 root        1  44    0  1528K   456K piperd  0   2:19  0.00% readproctitl
12679 root        1  44    0  3932K  1236K vmwait  1   2:12  0.00% zpool
35988    125      4  45    0 16892K  5968K sigwai  0   1:53  0.00% milter-greyl
25656 root        1  45    0  1536K   564K getblk  0   1:45  0.00% supervise
25798 root        1  44    0  1536K   564K vmwait  0   1:44  0.00% supervise
28406 root        1  44    0  1536K   544K vmwait  0   1:43  0.00% supervise
30226 root        1  44    0  1536K   544K vmwait  0   1:43  0.00% supervise
35401 root        1  44    0  1536K   544K vmwait  0   1:42  0.00% supervise
29203 root        1  44    0  1536K   544K vmwait  0   1:42  0.00% supervise
21629    389      6  44    0 91664K 41892K ucond   0   1:02  0.00% slapd
72283     60      1  44    0 80972K  1948K select  1   0:34  0.00% idled
98960 root        1  44    0  9396K  2544K select  1   0:32  0.00% sshd
 1550 root        1  44    0  3340K   940K vmwait  1   0:32  0.00% syslogd
 5463    125      1  44    0  6924K  2036K vmwait  0   0:27  0.00% qmgr
54193 root        1  44    0  9396K  2516K select  0   0:22  0.00% sshd

I could not log in on the console; it didn't even give a "user name"
field after hitting enter. Strange.

I will try the patch.



Re: ARC size constantly shrinks, then ZFS slows down extremely

by Artem Belevich

Your lockup is very similar (processes stuck sleeping on vmwait) to
what I had when arc_min was set too high. With Pawel's patch ZFS would
not give up any memory above arc_min.
Try bringing vfs.zfs.arc_min down.
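
To see at a glance how many processes are stuck, the STATE column of a saved top listing can be tallied. This is a minimal sketch with a hypothetical helper; it assumes unwrapped rows in the column layout shown in this thread (PID USERNAME THR PRI NICE SIZE RES STATE ...) and uses a handful of rows copied from the quoted output.

```python
from collections import Counter


def count_states(top_rows):
    """Tally the STATE column (8th whitespace-separated field) of top rows."""
    return Counter(row.split()[7] for row in top_rows if row.strip())


# A few rows copied from the top output quoted in this thread.
rows = [
    "78492 root        1  44    0  4700K  2156K CPU1    1   5:37  0.00% top",
    "12679 root        1  44    0  3932K  1236K vmwait  1   2:12  0.00% zpool",
    "25656 root        1  45    0  1536K   564K getblk  0   1:45  0.00% supervise",
    "25798 root        1  44    0  1536K   564K vmwait  0   1:44  0.00% supervise",
    " 1550 root        1  44    0  3340K   940K vmwait  1   0:32  0.00% syslogd",
    "98960 root        1  44    0  9396K  2544K select  1   0:32  0.00% sshd",
]

states = count_states(rows)
print(states["vmwait"])  # 3
```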

--Artem



2009/10/5 Attila Nagy <bra@...>:

> On 10/02/09 20:45, Pawel Jakub Dawidek wrote:
>>
>> On Fri, Oct 02, 2009 at 09:59:03AM +0200, Attila Nagy wrote:
>>
>>>
>>> Backing out this change from the 8-STABLE kernel:
>>>
>>> http://svn.freebsd.org/viewvc/base/head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c?r1=191901&r2=191902
>>>
>>> makes it survive about half and hour of IMAP searching. Of course only
>>> time will tell whether this helps in the long run, but so far 10/10 tries
>>> succeeded to kill the machine with this method...
>>>
>>
>> Could you try this patch:
>>
>>        http://people.freebsd.org/~pjd/patches/arc.c.4.patch
>>
>
> Sure. But before that, a report with the above modification: the machine has
> survived some days, then started to behave strangely. Meaning I could ping
> it, I could log in to the IMAP service (running from ZFS), read some mails,
> but not all.
> I could not access it via ssh (which runs from UFS), but an already running
> top from a different session was alive. It showed:
> last pid: 11272;  load averages:  0.00,  0.00,  0.00    up 3+15:21:13
>  09:11:43
> 149 processes: 1 running, 143 sleeping, 1 zombie, 4 waiting
> CPU:  0.0% user,  0.0% nice,  0.2% system,  0.0% interrupt, 99.8% idle
> Mem: 234M Active, 197M Inact, 559M Wired, 111M Buf, 440K Free
> Swap: 4096M Total, 976K Used, 4095M Free
>
>  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
> 78492 root        1  44    0  4700K  2156K CPU1    1   5:37  0.00% top
> 92343 root        1  44    0  4132K  1576K nanslp  1   4:12  0.00% gstat
> 13401 root        1  44    0  1528K   456K piperd  0   2:19  0.00%
> readproctitl
> 12679 root        1  44    0  3932K  1236K vmwait  1   2:12  0.00% zpool
> 35988    125      4  45    0 16892K  5968K sigwai  0   1:53  0.00%
> milter-greyl
> 25656 root        1  45    0  1536K   564K getblk  0   1:45  0.00% supervise
> 25798 root        1  44    0  1536K   564K vmwait  0   1:44  0.00% supervise
> 28406 root        1  44    0  1536K   544K vmwait  0   1:43  0.00% supervise
> 30226 root        1  44    0  1536K   544K vmwait  0   1:43  0.00% supervise
> 35401 root        1  44    0  1536K   544K vmwait  0   1:42  0.00% supervise
> 29203 root        1  44    0  1536K   544K vmwait  0   1:42  0.00% supervise
> 21629    389      6  44    0 91664K 41892K ucond   0   1:02  0.00% slapd
> 72283     60      1  44    0 80972K  1948K select  1   0:34  0.00% idled
> 98960 root        1  44    0  9396K  2544K select  1   0:32  0.00% sshd
> 1550 root        1  44    0  3340K   940K vmwait  1   0:32  0.00% syslogd
> 5463    125      1  44    0  6924K  2036K vmwait  0   0:27  0.00% qmgr
> 54193 root        1  44    0  9396K  2516K select  0   0:22  0.00% sshd
>
> I could not log into the console, it didn't even gave a "user name" filed
> after hitting enter. Strange.
>
> I will try the patch.
>

Re: ARC size constantly shrinks, then ZFS slows down extremely

by Artem Belevich

I left my box running over the weekend (vm.kmem_size=8G) in a loop
doing a build and then deleting build results. Each loop cycle takes
about 5 hours. All in all it touches about 2GB of sources and produces
about 30GB of object files and stuff.

This morning ARC size is around 2.5G. Now and then it dips down to 1G.
I've attached a graph with memory stats and ARC size.

--Artem

> ===============================================================
> Now, the same experiment, with vm.kmem_size=8G
> vm.kmem_size: 8589934592
> vfs.zfs.arc_min: 939524096
> vfs.zfs.arc_max: 7516192768
>
> ARC grows to 6.2G:
> Mem: 47M Active, 13M Inact, 7376M Wired, 31M Buf, 473M Free
>
> Then it quickly shrinks to 4.6G and grows to 6.2G again, shrinks again, etc..
>
> What's different from the previous case is that after a while ZFS
> adjusts target size (kstat.zfs.misc.arcstats.c) down to ~5.8G and
> after that ZFS size oscillates between 4.2G and 5.6G. Another
> observation -- ARC shrinking happens when system is left with ~512M of
> free memory. Yet another observation is that even with ARC peak of
> ~5.8G, system has about 7.5G wired. Where did almost 2G of difference
> go? Fragmentation?
>
> I've tried both experiments with and without L2ARC -- behavior seems
> to be the same.
>
> --Artem
>


Re: ARC size constantly shrinks, then ZFS slows down extremely

by Attila Nagy

Hello,

Pawel Jakub Dawidek wrote:

> On Fri, Oct 02, 2009 at 09:59:03AM +0200, Attila Nagy wrote:
>  
>> Backing out this change from the 8-STABLE kernel:
>> http://svn.freebsd.org/viewvc/base/head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c?r1=191901&r2=191902
>>
>> makes it survive about half and hour of IMAP searching. Of course only
>> time will tell whether this helps in the long run, but so far 10/10
>> tries succeeded to kill the machine with this method...
>>    
>
> Could you try this patch:
>
> http://people.freebsd.org/~pjd/patches/arc.c.4.patch
>  
It seems (after running for two days) that this fixes my problem. And I
see that Kip has come out with a similar version (which I couldn't yet
test, but I hope that will also do).

Thanks!

Re: ARC size constantly shrinks, then ZFS slows down extremely

by Attila Nagy

Attila Nagy wrote:

> Hello,
>
> Pawel Jakub Dawidek wrote:
>> On Fri, Oct 02, 2009 at 09:59:03AM +0200, Attila Nagy wrote:
>>  
>>> Backing out this change from the 8-STABLE kernel:
>>> http://svn.freebsd.org/viewvc/base/head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c?r1=191901&r2=191902 
>>>
>>>
>>> makes it survive about half and hour of IMAP searching. Of course
>>> only time will tell whether this helps in the long run, but so far
>>> 10/10 tries succeeded to kill the machine with this method...
>>>    
>>
>> Could you try this patch:
>>
>>     http://people.freebsd.org/~pjd/patches/arc.c.4.patch
>>  
> It seems (after running for two days) that this fixes my problem. And
> I see that Kip has came out with a similar version (which I couldn't
> yet test, but hope that will also do).
It seems that I was a little bit quick regarding this.
The machine just stopped with this:
last pid: 32358;  load averages:  0.01,  0.04,  0.12    up 2+06:33:56  14:36:25
114 processes: 1 running, 112 sleeping, 1 zombie
CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 536M Active, 63M Inact, 393M Wired, 8K Cache, 111M Buf
Swap: 4096M Total, 15M Used, 4081M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
24025 root        1  44    0  3932K   992K vmwait  0   6:06  0.00% zpool
84190 root        1  44    0  4700K  1592K CPU1    1   4:17  0.00% top
99029 root        1  44    0  4132K  1212K nanslp  1   3:53  0.00% gstat
26317 root        1  44    0  1528K   352K piperd  1   3:38  0.00% readproctitl
49143    125      4  45    0 12248K  3788K sigwai  0   2:50  0.00% milter-greyl
39969 root        1  44    0  1536K   516K vmwait  0   2:50  0.00% supervise
40241 root        1  44    0  1536K   516K vmwait  0   2:47  0.00% supervise
44633 root        1  44    0  1536K   512K vmwait  0   2:43  0.00% supervise
43434 root        1  44    0  1536K   516K vmwait  0   2:43  0.00% supervise
50575 root        1  44    0  1536K   516K vmwait  0   2:42  0.00% supervise
45510 root        1  44    0  1536K   512K vmwait  0   2:42  0.00% supervise
58146     60      1  44    0   264M  8828K pfault  0   2:32  0.00% imapd
47526    389      6  44    0 92688K  2296K ucond   1   1:29  0.00% slapd
 5417 root        1  44    0  9396K  1680K pfault  1   1:26  0.00% sshd
13147 root        1  44    0  3340K   860K vmwait  1   0:45  0.00% syslogd
92597 root        1  44    0  9396K  1676K pfault  1   0:39  0.00% sshd
26437    125      1  44    0  6924K  1700K vmwait  0   0:33  0.00% qmgr

The above top was still refreshing, but everything else on the other ssh
consoles (like a running zpool iostat and gstat) was frozen.
Even top stopped when I resized the window.


Re: ARC size constantly shrinks, then ZFS slows down extremely

by Pawel Jakub Dawidek

On Thu, Oct 08, 2009 at 02:45:04PM +0200, Attila Nagy wrote:

> Attila Nagy wrote:
> >Hello,
> >
> >Pawel Jakub Dawidek wrote:
> >>On Fri, Oct 02, 2009 at 09:59:03AM +0200, Attila Nagy wrote:
> >>
> >>>Backing out this change from the 8-STABLE kernel:
> >>>http://svn.freebsd.org/viewvc/base/head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c?r1=191901&r2=191902 
> >>>
> >>>
> >>>makes it survive about half and hour of IMAP searching. Of course
> >>>only time will tell whether this helps in the long run, but so far
> >>>10/10 tries succeeded to kill the machine with this method...
> >>>    
> >>
> >>Could you try this patch:
> >>
> >>    http://people.freebsd.org/~pjd/patches/arc.c.4.patch
> >>  
> >It seems (after running for two days) that this fixes my problem. And
> >I see that Kip has came out with a similar version (which I couldn't
> >yet test, but hope that will also do).
> It seems that I was a little bit quick regarding this.
> The machine just stopped with this:
> last pid: 32358;  load averages:  0.01,  0.04,  0.12    up 2+06:33:56  
> 14:36:25
> 114 processes: 1 running, 112 sleeping, 1 zombie
> CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> Mem: 536M Active, 63M Inact, 393M Wired, 8K Cache, 111M Buf
> Swap: 4096M Total, 15M Used, 4081M Free
>
>  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
> 24025 root        1  44    0  3932K   992K vmwait  0   6:06  0.00% zpool
> 84190 root        1  44    0  4700K  1592K CPU1    1   4:17  0.00% top
> 99029 root        1  44    0  4132K  1212K nanslp  1   3:53  0.00% gstat
> 26317 root        1  44    0  1528K   352K piperd  1   3:38  0.00%
> readproctitl
> 49143    125      4  45    0 12248K  3788K sigwai  0   2:50  0.00%
> milter-greyl
> 39969 root        1  44    0  1536K   516K vmwait  0   2:50  0.00% supervise
> 40241 root        1  44    0  1536K   516K vmwait  0   2:47  0.00% supervise
> 44633 root        1  44    0  1536K   512K vmwait  0   2:43  0.00% supervise
> 43434 root        1  44    0  1536K   516K vmwait  0   2:43  0.00% supervise
> 50575 root        1  44    0  1536K   516K vmwait  0   2:42  0.00% supervise
> 45510 root        1  44    0  1536K   512K vmwait  0   2:42  0.00% supervise
> 58146     60      1  44    0   264M  8828K pfault  0   2:32  0.00% imapd
> 47526    389      6  44    0 92688K  2296K ucond   1   1:29  0.00% slapd
> 5417 root        1  44    0  9396K  1680K pfault  1   1:26  0.00% sshd
> 13147 root        1  44    0  3340K   860K vmwait  1   0:45  0.00% syslogd
> 92597 root        1  44    0  9396K  1676K pfault  1   0:39  0.00% sshd
> 26437    125      1  44    0  6924K  1700K vmwait  0   0:33  0.00% qmgr
>
> The above top was refreshing, but every other stuff on different ssh
> consoles (like a running zpool iostat and gstat) was frozen.
> Even top stopped when I have resized the window.
Please try Kip's patch that was committed, it changes priorities a bit,
which should help.

--
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@...                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!



Re: ARC size constantly shrinks, then ZFS slows down extremely

by Artem Belevich

I've tested with Kip's patch -- no lockups so far.

--Artem



On Thu, Oct 8, 2009 at 9:07 AM, Pawel Jakub Dawidek <pjd@...> wrote:

> On Thu, Oct 08, 2009 at 02:45:04PM +0200, Attila Nagy wrote:
>> Attila Nagy wrote:
>> >Hello,
>> >
>> >Pawel Jakub Dawidek wrote:
>> >>On Fri, Oct 02, 2009 at 09:59:03AM +0200, Attila Nagy wrote:
>> >>
>> >>>Backing out this change from the 8-STABLE kernel:
>> >>>http://svn.freebsd.org/viewvc/base/head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c?r1=191901&r2=191902
>> >>>
>> >>>
>> >>>makes it survive about half and hour of IMAP searching. Of course
>> >>>only time will tell whether this helps in the long run, but so far
>> >>>10/10 tries succeeded to kill the machine with this method...
>> >>>
>> >>
>> >>Could you try this patch:
>> >>
>> >>    http://people.freebsd.org/~pjd/patches/arc.c.4.patch
>> >>
>> >It seems (after running for two days) that this fixes my problem. And
>> >I see that Kip has came out with a similar version (which I couldn't
>> >yet test, but hope that will also do).
>> It seems that I was a little bit quick regarding this.
>> The machine just stopped with this:
>> last pid: 32358;  load averages:  0.01,  0.04,  0.12    up 2+06:33:56
>> 14:36:25
>> 114 processes: 1 running, 112 sleeping, 1 zombie
>> CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
>> Mem: 536M Active, 63M Inact, 393M Wired, 8K Cache, 111M Buf
>> Swap: 4096M Total, 15M Used, 4081M Free
>>
>>  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
>> 24025 root        1  44    0  3932K   992K vmwait  0   6:06  0.00% zpool
>> 84190 root        1  44    0  4700K  1592K CPU1    1   4:17  0.00% top
>> 99029 root        1  44    0  4132K  1212K nanslp  1   3:53  0.00% gstat
>> 26317 root        1  44    0  1528K   352K piperd  1   3:38  0.00%
>> readproctitl
>> 49143    125      4  45    0 12248K  3788K sigwai  0   2:50  0.00%
>> milter-greyl
>> 39969 root        1  44    0  1536K   516K vmwait  0   2:50  0.00% supervise
>> 40241 root        1  44    0  1536K   516K vmwait  0   2:47  0.00% supervise
>> 44633 root        1  44    0  1536K   512K vmwait  0   2:43  0.00% supervise
>> 43434 root        1  44    0  1536K   516K vmwait  0   2:43  0.00% supervise
>> 50575 root        1  44    0  1536K   516K vmwait  0   2:42  0.00% supervise
>> 45510 root        1  44    0  1536K   512K vmwait  0   2:42  0.00% supervise
>> 58146     60      1  44    0   264M  8828K pfault  0   2:32  0.00% imapd
>> 47526    389      6  44    0 92688K  2296K ucond   1   1:29  0.00% slapd
>> 5417 root        1  44    0  9396K  1680K pfault  1   1:26  0.00% sshd
>> 13147 root        1  44    0  3340K   860K vmwait  1   0:45  0.00% syslogd
>> 92597 root        1  44    0  9396K  1676K pfault  1   0:39  0.00% sshd
>> 26437    125      1  44    0  6924K  1700K vmwait  0   0:33  0.00% qmgr
>>
>> The above top was refreshing, but every other stuff on different ssh
>> consoles (like a running zpool iostat and gstat) was frozen.
>> Even top stopped when I have resized the window.
>
> Please try Kip's patch that was committed, it changes priorities a bit,
> which should help.
>
> --
> Pawel Jakub Dawidek                       http://www.wheel.pl
> pjd@...                           http://www.FreeBSD.org
> FreeBSD committer                         Am I Evil? Yes, I Am!
>

Re: ARC size constantly shrinks, then ZFS slows down extremely

by Attila Nagy

Pawel Jakub Dawidek wrote:

> On Thu, Oct 08, 2009 at 02:45:04PM +0200, Attila Nagy wrote:
>  
>> Attila Nagy wrote:
>>    
>>> Hello,
>>>
>>> Pawel Jakub Dawidek wrote:
>>>      
>>>> On Fri, Oct 02, 2009 at 09:59:03AM +0200, Attila Nagy wrote:
>>>>
>>>>        
>>>>> Backing out this change from the 8-STABLE kernel:
>>>>> http://svn.freebsd.org/viewvc/base/head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c?r1=191901&r2=191902 
>>>>>
>>>>>
>>>>> makes it survive about half and hour of IMAP searching. Of course
>>>>> only time will tell whether this helps in the long run, but so far
>>>>> 10/10 tries succeeded to kill the machine with this method...
>>>>>    
>>>>>          
>>>> Could you try this patch:
>>>>
>>>>    http://people.freebsd.org/~pjd/patches/arc.c.4.patch
>>>>  
>>>>        
>>> It seems (after running for two days) that this fixes my problem. And
>>> I see that Kip has came out with a similar version (which I couldn't
>>> yet test, but hope that will also do).
>>>      
>> It seems that I was a little bit quick regarding this.
>> The machine just stopped with this:
>> last pid: 32358;  load averages:  0.01,  0.04,  0.12    up 2+06:33:56  
>> 14:36:25
>> 114 processes: 1 running, 112 sleeping, 1 zombie
>> CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
>> Mem: 536M Active, 63M Inact, 393M Wired, 8K Cache, 111M Buf
>> Swap: 4096M Total, 15M Used, 4081M Free
>>
>>  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
>> 24025 root        1  44    0  3932K   992K vmwait  0   6:06  0.00% zpool
>> 84190 root        1  44    0  4700K  1592K CPU1    1   4:17  0.00% top
>> 99029 root        1  44    0  4132K  1212K nanslp  1   3:53  0.00% gstat
>> 26317 root        1  44    0  1528K   352K piperd  1   3:38  0.00% readproctitl
>> 49143    125      4  45    0 12248K  3788K sigwai  0   2:50  0.00% milter-greyl
>> 39969 root        1  44    0  1536K   516K vmwait  0   2:50  0.00% supervise
>> 40241 root        1  44    0  1536K   516K vmwait  0   2:47  0.00% supervise
>> 44633 root        1  44    0  1536K   512K vmwait  0   2:43  0.00% supervise
>> 43434 root        1  44    0  1536K   516K vmwait  0   2:43  0.00% supervise
>> 50575 root        1  44    0  1536K   516K vmwait  0   2:42  0.00% supervise
>> 45510 root        1  44    0  1536K   512K vmwait  0   2:42  0.00% supervise
>> 58146     60      1  44    0   264M  8828K pfault  0   2:32  0.00% imapd
>> 47526    389      6  44    0 92688K  2296K ucond   1   1:29  0.00% slapd
>> 5417 root        1  44    0  9396K  1680K pfault  1   1:26  0.00% sshd
>> 13147 root        1  44    0  3340K   860K vmwait  1   0:45  0.00% syslogd
>> 92597 root        1  44    0  9396K  1676K pfault  1   0:39  0.00% sshd
>> 26437    125      1  44    0  6924K  1700K vmwait  0   0:33  0.00% qmgr
>>
>> The above top was still refreshing, but everything else on the other ssh
>> consoles (like a running zpool iostat and gstat) was frozen.
>> Even top stopped when I resized the window.
>>    
>
> Please try Kip's patch that was committed, it changes priorities a bit,
> which should help.
>  
My i386 machine is still alive after two days of uptime. With your
patch it also lived for about two days, so I can't say yet that
it's OK.

The amd64 machine started to lose ARC memory again. See these:
http://people.fsn.hu/~bra/freebsd/20091012-zfs-arcsize/zfs_mem-week.png
http://people.fsn.hu/~bra/freebsd/20091012-zfs-arcsize/memory-week.png

Your patch was active between October 7 and 9. You can see that the ARC
size was somewhat constant.
On October 9, I installed Kip's modification, and the ARC size started to
decrease.
BTW, previously (before October 7) I set the ARC min size to 10-15GB
(can't remember the exact value), but now it runs with the defaults
(only the max size is set):
vfs.zfs.arc_min: 3623878656
vfs.zfs.arc_max: 28991029248
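
For readability, the two byte counts above can be converted to GiB. This is only a minimal Python sketch of the arithmetic; the helper name `to_gib` is made up for illustration and the values are copied from the sysctl output.

```python
GIB = 1024 ** 3  # bytes per GiB

# Values copied from the sysctl output above.
arc_min = 3623878656
arc_max = 28991029248


def to_gib(n_bytes):
    """Convert a byte count to GiB (hypothetical helper for this example)."""
    return n_bytes / GIB


print(f"arc_min = {to_gib(arc_min):.3f} GiB")          # 3.375 GiB
print(f"arc_max = {to_gib(arc_max):.3f} GiB")          # 27.000 GiB
print(f"arc_max / arc_min = {arc_max / arc_min:.1f}")  # 8.0
```

On this machine the default minimum therefore works out to one eighth of the configured maximum, well below the 10-15GB that was set earlier.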

As you can see, there is plenty of memory. This machine uses UFS as
well (and writes to it heavily); maybe that's what affects ZFS size, by
caching a lot of stuff?


Re: ARC size constantly shrinks, then ZFS slows down extremely

by Kip Macy-4

>
> The amd64 machine started to lose ARC memory again. See these:
> http://people.fsn.hu/~bra/freebsd/20091012-zfs-arcsize/zfs_mem-week.png
> http://people.fsn.hu/~bra/freebsd/20091012-zfs-arcsize/memory-week.png
>
> Your patch was active between October 7 and 9. You can see that the ARC
> size was somewhat constant.
> On October 9, I installed Kip's modification, and the ARC size started
> to decrease.
> BTW, previously (before October 7) I set the ARC min size to 10-15GB
> (can't remember the exact value), but now it runs with the defaults
> (only the max size is set):
> vfs.zfs.arc_min: 3623878656
> vfs.zfs.arc_max: 28991029248
>
> As you can see, there is plenty of memory. This machine uses UFS as
> well (and writes to it heavily); maybe that's what affects ZFS size, by
> caching a lot of stuff?
>

Currently, the inactive page queue will grow until ARC is shrunk to  
arc_min.
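
As a rough illustration of that statement, here is a toy Python model. It is not the kernel's logic: it assumes a fixed amount of RAM shared only between the ARC and the inactive queue, and an inactive queue that grows by a constant amount per step. The function `simulate`, its parameters, and every number used are made up for illustration.

```python
GIB = 1024 ** 3


def simulate(total_ram, arc_size, arc_min, inactive, growth, steps):
    """Toy model: the inactive queue grows each step; once free memory
    runs out, the shortfall is taken from the ARC, but never below arc_min."""
    history = []
    for _ in range(steps):
        free = total_ram - arc_size - inactive
        granted = growth
        if granted > free:
            # Not enough free memory: shrink the ARC, bounded by arc_min.
            reclaim = min(granted - free, arc_size - arc_min)
            arc_size -= reclaim
            granted = free + reclaim
        inactive += granted
        history.append((arc_size, inactive))
    return history


# Hypothetical numbers: 32 GiB RAM, 20 GiB ARC, 4 GiB arc_min.
history = simulate(32 * GIB, 20 * GIB, 4 * GIB, 2 * GIB, 2 * GIB, 20)
for step, (arc, inact) in enumerate(history, 1):
    print(f"step {step:2d}: ARC {arc // GIB:2d} GiB, inactive {inact // GIB:2d} GiB")
```

In this model the ARC stays at 20 GiB while free memory lasts, then shrinks step by step and settles at the 4 GiB floor, with the inactive queue holding the rest.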


I think I'll probably spend some time making the ARC play better with  
the page cache this week. Unfortunately, under heavy memory pressure  
when competing with UFS the ARC will degrade to LRU, but I think that  
is still an improvement over the current static sizing with low and  
high water marks.

-Kip

Re: ARC size constantly shrinks, then ZFS slows down extremely

by Attila Nagy

K. Macy wrote:

>>
>> The amd64 machine started to lose ARC memory again. See these:
>> http://people.fsn.hu/~bra/freebsd/20091012-zfs-arcsize/zfs_mem-week.png
>> http://people.fsn.hu/~bra/freebsd/20091012-zfs-arcsize/memory-week.png
>>
>> Your patch was active between October 7 and 9. You can see that the ARC
>> size was somewhat constant.
>> On October 9, I installed Kip's modification, and the ARC size started to
>> decrease.
>> BTW, previously (before October 7) I set the ARC min size to 10-15GB
>> (can't remember the exact value), but now it runs with the defaults
>> (only the max size is set):
>> vfs.zfs.arc_min: 3623878656
>> vfs.zfs.arc_max: 28991029248
>>
>> As you can see, there is plenty of memory. This machine uses UFS as
>> well (and writes to it heavily); maybe that's what affects ZFS size, by
>> caching a lot of stuff?
>>
>
> Currently, the inactive page queue will grow until ARC is shrunk to
> arc_min.
>
>
> I think I'll probably spend some time making the ARC play better with
> the page cache this week. Unfortunately, under heavy memory pressure
> when competing with UFS the ARC will degrade to LRU, but I think that
> is still an improvement over the current static sizing with low and
> high water marks.
Will setting ARC's minimum size help here? (I will try)

For me, it's OK, and I think it's generally not a problem, if somebody
uses UFS as well.
Is it possible to merge the memory management of the two, or are they
completely different beasts?

Thanks,


Re: ARC size constantly shrinks, then ZFS slows down extremely

by Kip Macy-4

>>
>> Currently, the inactive page queue will grow until ARC is shrunk to  
>> arc_min.
>>
>> I think I'll probably spend some time making the ARC play better  
>> with the page cache this week. Unfortunately, under heavy memory  
>> pressure when competing with UFS the ARC will degrade to LRU, but I  
>> think that is still an improvement over the current static sizing  
>> with low and high water marks.
> Will setting ARC's minimum size help here? (I will try)
>
> For me, it's OK, and I think it's generally not a problem, if  
> somebody uses UFS as well.
> Is it possible to merge the memory management of the two, or are they
> completely different beasts?

To some degree it is possible to merge them by partly backing the ARC
from the page cache. This would allow for a fair amount of auto-tuning.
However, it isn't possible to merge them completely: the ARC is a
virtual device block cache, whereas UFS caches pages in the vm object
based on their offset in the file. Thus it would never be possible to
use blocks in the ARC for mmap. For applications that dirty file-backed
mmapped memory it will always be necessary to have two copies of the
page, one in the vm object for the file and one in ZFS that maps to a
block offset. It all makes a bit more sense if you understand that ZFS
is a transactional object store with a POSIX file system interface.
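
The two-copies point can be pictured with a toy Python model. This is only a sketch of the indexing difference described above, not how either cache is implemented: the class `ToyCaches`, its method, the 4096-byte page size, and the key layouts are all assumptions made for illustration.

```python
PAGE_SIZE = 4096  # assumed page size for this example


class ToyCaches:
    """Two caches with different keys, so one dirty page lives in both."""

    def __init__(self):
        # Page-cache side: keyed by (file name, page-aligned offset in the file).
        self.vm_object = {}
        # ARC side: keyed by (virtual device id, block offset on that device).
        self.arc = {}

    def mmap_write(self, file_name, file_offset, data, vdev_id, block_offset):
        page_offset = file_offset - file_offset % PAGE_SIZE
        # Copy 1: the page the application sees through mmap.
        self.vm_object[(file_name, page_offset)] = data
        # Copy 2: the block ZFS tracks, addressed by device and block offset.
        self.arc[(vdev_id, block_offset)] = data


caches = ToyCaches()
caches.mmap_write("mail.idx", 5000, b"dirty page", vdev_id=0, block_offset=1 << 20)
print(list(caches.vm_object))  # [('mail.idx', 4096)]
print(list(caches.arc))        # [(0, 1048576)]
```

Because the keys live in different address spaces (file offset versus device block offset), a lookup in one cache cannot be answered from the other, which is why both copies are kept.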

Re: ARC size constantly shrinks, then ZFS slows down extremely

by Artem Belevich

Hi,

Will r197816 from head be MFC'd to stable-8, as the commit comment
suggested? I've been running my box with r197816 applied to
stable/8 and so far it's been working fine for me.

#svn log -v -c 197816
------------------------------------------------------------------------
r197816 | kmacy | 2009-10-06 14:40:50 -0700 (Tue, 06 Oct 2009) | 5 lines
Changed paths:
   M /head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c

Prevent paging pressure from draining arc too much
- always drain arc if above arc_c_max
- never drain arc if arc is below arc_c_max

MFC after: 3 days

------------------------------------------------------------------------
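
The two rules in the commit log can be sketched as a small decision function. This is a hedged Python illustration, not the code from arc.c: the function name `should_drain` and its parameters are hypothetical, and the behaviour between the two bounds (following paging pressure) is an assumption. The log as quoted names arc_c_max for both rules, so the lower bound is a separate parameter here and either reading can be tried.

```python
def should_drain(arc_size, upper, lower, paging_pressure):
    """Hypothetical sketch of the policy described in the commit log."""
    if arc_size > upper:
        return True   # always drain above the upper bound
    if arc_size < lower:
        return False  # never drain below the lower bound
    # Assumption: in between, drain only when the VM reports paging pressure.
    return paging_pressure


GIB = 1024 ** 3
# Made-up sizes for illustration.
print(should_drain(30 * GIB, upper=27 * GIB, lower=4 * GIB, paging_pressure=False))  # True
print(should_drain(2 * GIB, upper=27 * GIB, lower=4 * GIB, paging_pressure=True))    # False
print(should_drain(10 * GIB, upper=27 * GIB, lower=4 * GIB, paging_pressure=True))   # True
```

Passing the same value for `upper` and `lower` gives the literal reading of the log, in which paging pressure never drains an ARC that is below that single bound.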

--Artem