Re: [2.6.26-rc7] shrink_icache from pagefault locking (nee: nfsd hangs for a few sec)...
On Sun, Jun 22, 2008 at 7:14 PM, Mel Gorman <mel@...> wrote:
> On (22/06/08 10:58), Daniel J Blueman didst pronounce:
>> I'm seeing a similar issue [2] to what was recently reported [1] by
>> Alexander, but with another workload involving XFS and memory
>> pressure.
>>
>
> Is NFS involved or is this XFS only? It looks like XFS-only but no harm in
> being sure.

The application is reading a 10MB file from NFS around every few
seconds, and writing back ~30MB every few seconds to local XFS;
another thread in the same application is consuming that data and
writing out ~2MB to NFS after circa 10 input files, so NFS isn't
dominant but is involved.

> I'm beginning to wonder if this is a problem where a lot of dirty inodes are
> being written back in this path and we stall while that happens. I'm still
> not getting why we are triggering this now and did not before 2.6.26-rc1
> or why it bisects to the zonelist modifications. Diffing the reclaim and
> allocation paths between 2.6.25 and 2.6.26-rc1 has not yielded any candidates
> for me yet that would explain this.
>
>> SLUB allocator is in use and config is at http://quora.org/config-client-debug .
>>
>> Let me know if you'd like more details/vmlinux objdump etc.
>>
>> Thanks,
>> Daniel
>>
>> --- [1]
>>
>> http://groups.google.com/group/fa.linux.kernel/browse_thread/thread/e673c9173d45a735/db9213ef39e4e11c
>>
>> --- [2]
>>
>> =======================================================
>> [ INFO: possible circular locking dependency detected ]
>> 2.6.26-rc7-210c #2
>> -------------------------------------------------------
>> AutopanoPro/4470 is trying to acquire lock:
>>  (iprune_mutex){--..}, at: [<ffffffff802d94fd>] shrink_icache_memory+0x7d/0x290
>>
>> but task is already holding lock:
>>  (&mm->mmap_sem){----}, at: [<ffffffff805e3e15>] do_page_fault+0x255/0x890
>>
>> which lock already depends on the new lock.
>>
>>
>> the existing dependency chain (in reverse order) is:
>>
>> -> #2 (&mm->mmap_sem){----}:
>>        [<ffffffff80278f4d>] __lock_acquire+0xbdd/0x1020
>>        [<ffffffff802793f5>] lock_acquire+0x65/0x90
>>        [<ffffffff805df5ab>] down_read+0x3b/0x70
>>        [<ffffffff805e3e3c>] do_page_fault+0x27c/0x890
>>        [<ffffffff805e16cd>] error_exit+0x0/0xa9
>>        [<ffffffffffffffff>] 0xffffffffffffffff
>>
>> -> #1 (&(&ip->i_iolock)->mr_lock){----}:
>>        [<ffffffff80278f4d>] __lock_acquire+0xbdd/0x1020
>>        [<ffffffff802793f5>] lock_acquire+0x65/0x90
>>        [<ffffffff8026d746>] down_write_nested+0x46/0x80
>>        [<ffffffff8039df29>] xfs_ilock+0x99/0xa0
>>        [<ffffffff8039e0cf>] xfs_ireclaim+0x3f/0x90
>>        [<ffffffff803ba889>] xfs_finish_reclaim+0x59/0x1a0
>>        [<ffffffff803bc199>] xfs_reclaim+0x109/0x110
>>        [<ffffffff803c9541>] xfs_fs_clear_inode+0xe1/0x110
>>        [<ffffffff802d906d>] clear_inode+0x7d/0x110
>>        [<ffffffff802d93aa>] dispose_list+0x2a/0x100
>>        [<ffffffff802d96af>] shrink_icache_memory+0x22f/0x290
>>        [<ffffffff8029d868>] shrink_slab+0x168/0x1d0
>>        [<ffffffff8029e0b6>] kswapd+0x3b6/0x560
>>        [<ffffffff8026921d>] kthread+0x4d/0x80
>>        [<ffffffff80227428>] child_rip+0xa/0x12
>>        [<ffffffffffffffff>] 0xffffffffffffffff
>>
>> -> #0 (iprune_mutex){--..}:
>>        [<ffffffff80278db7>] __lock_acquire+0xa47/0x1020
>>        [<ffffffff802793f5>] lock_acquire+0x65/0x90
>>        [<ffffffff805dedd5>] mutex_lock_nested+0xb5/0x300
>>        [<ffffffff802d94fd>] shrink_icache_memory+0x7d/0x290
>>        [<ffffffff8029d868>] shrink_slab+0x168/0x1d0
>>        [<ffffffff8029db38>] try_to_free_pages+0x268/0x3a0
>>        [<ffffffff802979d6>] __alloc_pages_internal+0x206/0x4b0
>>        [<ffffffff80297c89>] __alloc_pages_nodemask+0x9/0x10
>>        [<ffffffff802b2bc2>] alloc_page_vma+0x72/0x1b0
>>        [<ffffffff802a3642>] handle_mm_fault+0x462/0x7b0
>>        [<ffffffff805e3ecc>] do_page_fault+0x30c/0x890
>>        [<ffffffff805e16cd>] error_exit+0x0/0xa9
>>        [<ffffffffffffffff>] 0xffffffffffffffff
>>
>> other info that might help us debug this:
>>
>> 2 locks held by AutopanoPro/4470:
>>  #0:  (&mm->mmap_sem){----}, at: [<ffffffff805e3e15>] do_page_fault+0x255/0x890
>>  #1:  (shrinker_rwsem){----}, at: [<ffffffff8029d732>] shrink_slab+0x32/0x1d0
>>
>> stack backtrace:
>> Pid: 4470, comm: AutopanoPro Not tainted 2.6.26-rc7-210c #2
>>
>> Call Trace:
>>  [<ffffffff80276823>] print_circular_bug_tail+0x83/0x90
>>  [<ffffffff80275e09>] ? print_circular_bug_entry+0x49/0x60
>>  [<ffffffff80278db7>] __lock_acquire+0xa47/0x1020
>>  [<ffffffff802793f5>] lock_acquire+0x65/0x90
>>  [<ffffffff802d94fd>] ? shrink_icache_memory+0x7d/0x290
>>  [<ffffffff805dedd5>] mutex_lock_nested+0xb5/0x300
>>  [<ffffffff802d94fd>] ? shrink_icache_memory+0x7d/0x290
>>  [<ffffffff802d94fd>] shrink_icache_memory+0x7d/0x290
>>  [<ffffffff8029d732>] ? shrink_slab+0x32/0x1d0
>>  [<ffffffff8029d868>] shrink_slab+0x168/0x1d0
>>  [<ffffffff8029db38>] try_to_free_pages+0x268/0x3a0
>>  [<ffffffff8029c240>] ? isolate_pages_global+0x0/0x40
>>  [<ffffffff802979d6>] __alloc_pages_internal+0x206/0x4b0
>>  [<ffffffff80297c89>] __alloc_pages_nodemask+0x9/0x10
>>  [<ffffffff802b2bc2>] alloc_page_vma+0x72/0x1b0
>>  [<ffffffff802a3642>] handle_mm_fault+0x462/0x7b0
>>  [<ffffffff80277e2f>] ? trace_hardirqs_on+0xbf/0x150
>>  [<ffffffff805e3e15>] ? do_page_fault+0x255/0x890
>>  [<ffffffff805e3ecc>] do_page_fault+0x30c/0x890
>>  [<ffffffff805e16cd>] error_exit+0x0/0xa9
>> --
>> Daniel J Blueman
>>
>
> --
> Mel Gorman
> Part-time Phd Student                          Linux Technology Center
> University of Limerick                         IBM Dublin Software Lab

Daniel J Blueman
Re: [2.6.26-rc7] shrink_icache from pagefault locking (nee: nfsd hangs for a few sec)...
On (23/06/08 08:19), Dave Chinner didst pronounce:
> [added xfs@... to cc]
>
> On Sun, Jun 22, 2008 at 10:58:56AM +0100, Daniel J Blueman wrote:
> > I'm seeing a similar issue [2] to what was recently reported [1] by
> > Alexander, but with another workload involving XFS and memory
> > pressure.
> >
> > SLUB allocator is in use and config is at http://quora.org/config-client-debug .
> >
> > Let me know if you'd like more details/vmlinux objdump etc.
> >
> > Thanks,
> > Daniel
> >
> > --- [1]
> >
> > http://groups.google.com/group/fa.linux.kernel/browse_thread/thread/e673c9173d45a735/db9213ef39e4e11c
> >
> > --- [2]
> >
> > =======================================================
> > [ INFO: possible circular locking dependency detected ]
> > 2.6.26-rc7-210c #2
> > -------------------------------------------------------
> > AutopanoPro/4470 is trying to acquire lock:
> >  (iprune_mutex){--..}, at: [<ffffffff802d94fd>] shrink_icache_memory+0x7d/0x290
> >
> > but task is already holding lock:
> >  (&mm->mmap_sem){----}, at: [<ffffffff805e3e15>] do_page_fault+0x255/0x890
> >
> > which lock already depends on the new lock.
> >
> >
> > the existing dependency chain (in reverse order) is:
> >
> > -> #2 (&mm->mmap_sem){----}:
> >        [<ffffffff80278f4d>] __lock_acquire+0xbdd/0x1020
> >        [<ffffffff802793f5>] lock_acquire+0x65/0x90
> >        [<ffffffff805df5ab>] down_read+0x3b/0x70
> >        [<ffffffff805e3e3c>] do_page_fault+0x27c/0x890
> >        [<ffffffff805e16cd>] error_exit+0x0/0xa9
> >        [<ffffffffffffffff>] 0xffffffffffffffff
> >
> > -> #1 (&(&ip->i_iolock)->mr_lock){----}:
> >        [<ffffffff80278f4d>] __lock_acquire+0xbdd/0x1020
> >        [<ffffffff802793f5>] lock_acquire+0x65/0x90
> >        [<ffffffff8026d746>] down_write_nested+0x46/0x80
> >        [<ffffffff8039df29>] xfs_ilock+0x99/0xa0
> >        [<ffffffff8039e0cf>] xfs_ireclaim+0x3f/0x90
> >        [<ffffffff803ba889>] xfs_finish_reclaim+0x59/0x1a0
> >        [<ffffffff803bc199>] xfs_reclaim+0x109/0x110
> >        [<ffffffff803c9541>] xfs_fs_clear_inode+0xe1/0x110
> >        [<ffffffff802d906d>] clear_inode+0x7d/0x110
> >        [<ffffffff802d93aa>] dispose_list+0x2a/0x100
> >        [<ffffffff802d96af>] shrink_icache_memory+0x22f/0x290
> >        [<ffffffff8029d868>] shrink_slab+0x168/0x1d0
> >        [<ffffffff8029e0b6>] kswapd+0x3b6/0x560
> >        [<ffffffff8026921d>] kthread+0x4d/0x80
> >        [<ffffffff80227428>] child_rip+0xa/0x12
> >        [<ffffffffffffffff>] 0xffffffffffffffff
>
> You may as well ignore anything involving this path in XFS until
> lockdep gets fixed. The kswapd reclaim path is inverted over the
> synchronous reclaim path that is xfs_ilock -> run out of memory ->
> prune_icache and then potentially another -> xfs_ilock.
>

In that case, have you any theory as to why this circular dependency is
being reported now but wasn't before 2.6.26-rc1? I'm beginning to wonder
if the bisecting fingering the zonelist modification is just a
co-incidence.

Also, do you think the stalls were happening before but just not
being noticed?

> In this case, XFS can *never* deadlock because the second xfs_ilock
> is on a different, unreferenced, unlocked inode, but without turning
> off lockdep there is nothing in XFS that can be done to prevent
> this warning.
>
> There is a similar bug in the VM w.r.t the mmap_sem in that the
> mmap_sem is held across a call to put_filp() which can result in
> inversions between the xfs_ilock and mmap_sem.
>
> Both of these cases cannot be solved by changing XFS - lockdep
> needs to be made aware of paths that can invert normal locking
> order (like prune_icache) so it doesn't give false positives
> like this.
>
> > -> #0 (iprune_mutex){--..}:
> >        [<ffffffff80278db7>] __lock_acquire+0xa47/0x1020
> >        [<ffffffff802793f5>] lock_acquire+0x65/0x90
> >        [<ffffffff805dedd5>] mutex_lock_nested+0xb5/0x300
> >        [<ffffffff802d94fd>] shrink_icache_memory+0x7d/0x290
> >        [<ffffffff8029d868>] shrink_slab+0x168/0x1d0
> >        [<ffffffff8029db38>] try_to_free_pages+0x268/0x3a0
> >        [<ffffffff802979d6>] __alloc_pages_internal+0x206/0x4b0
> >        [<ffffffff80297c89>] __alloc_pages_nodemask+0x9/0x10
> >        [<ffffffff802b2bc2>] alloc_page_vma+0x72/0x1b0
> >        [<ffffffff802a3642>] handle_mm_fault+0x462/0x7b0
> >        [<ffffffff805e3ecc>] do_page_fault+0x30c/0x890
> >        [<ffffffff805e16cd>] error_exit+0x0/0xa9
> >        [<ffffffffffffffff>] 0xffffffffffffffff
>
> This case is different in that it's complaining about mmap_sem vs
> iprune_mutex, so I think that we can pretty much ignore the XFS side
> of things here - the problem is higher level code....
>
> >  [<ffffffff8029db38>] try_to_free_pages+0x268/0x3a0
> >  [<ffffffff8029c240>] ? isolate_pages_global+0x0/0x40
> >  [<ffffffff802979d6>] __alloc_pages_internal+0x206/0x4b0
> >  [<ffffffff80297c89>] __alloc_pages_nodemask+0x9/0x10
> >  [<ffffffff802b2bc2>] alloc_page_vma+0x72/0x1b0
> >  [<ffffffff802a3642>] handle_mm_fault+0x462/0x7b0
>
> FWIW, should page allocation in a page fault be allowed to recurse
> into the filesystem? If I follow the spaghetti of inline and
> compiler inlined functions correctly, this is a GFP_HIGHUSER_MOVABLE
> allocation, right? Should we be allowing shrink_icache_memory()
> to be called at all in the page fault path?
>

Well, the page fault path is able to go to sleep and can enter direct
reclaim under low memory situations. Right now, I'm failing to see why a
page fault should not be allowed to reclaim pages in use by a
filesystem. It was allowed before so the question still is why the
circular lock warning appears now but didn't before.

> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@...
>

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
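
Editorial sketch (not part of the original message): the report above can be read as three "lock B taken while holding lock A" observations that together close a loop. The toy model below replays them, assuming, as lockdep does, that ordering is tracked per lock class and not per lock instance. That assumption is why an i_iolock on a different, unreferenced inode still lands on the same node, which is the false positive Dave describes. The class `LockOrderGraph`, the function `replay_report`, and the short lock names are hypothetical and exist only for this illustration; this is not lockdep's actual algorithm or data structure.

```python
class LockOrderGraph:
    """Toy lock-class ordering graph; a rough analogy to lockdep, not its code."""

    def __init__(self):
        # held lock class -> set of lock classes seen acquired while holding it
        self.taken_after = {}

    def _reachable(self, src, dst):
        """Depth-first search: is dst reachable from src via recorded orderings?"""
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(self.taken_after.get(node, ()))
        return False

    def acquire(self, held, new):
        """Record 'new taken while holding held'; True if this closes a cycle."""
        closes_cycle = self._reachable(new, held)
        self.taken_after.setdefault(held, set()).add(new)
        return closes_cycle


def replay_report():
    """Replay the three orderings implied by the dependency chain in the report."""
    graph = LockOrderGraph()
    return [
        # kswapd: shrink_icache_memory holds iprune_mutex, xfs_reclaim takes the iolock
        graph.acquire("iprune_mutex", "xfs_iolock"),
        # chain entry #2: mmap_sem taken in do_page_fault after an iolock (any inode)
        graph.acquire("xfs_iolock", "mmap_sem"),
        # page fault: mmap_sem held, direct reclaim wants iprune_mutex
        graph.acquire("mmap_sem", "iprune_mutex"),
    ]


print(replay_report())  # prints [False, False, True]
```

Only the third acquisition trips the check, matching the point at which the warning fires in the trace. Keying the graph on (class, inode) pairs instead of class names would make the loop disappear in this model, which is one way to picture why XFS itself has nothing to fix here.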
Re: [2.6.26-rc7] shrink_icache from pagefault locking (nee: nfsd hangs for a few sec)...
On Mon, Jun 23, 2008 at 01:24:15AM +0100, Mel Gorman wrote:
> On (23/06/08 08:19), Dave Chinner didst pronounce:
> > [added xfs@... to cc]
> >
> > On Sun, Jun 22, 2008 at 10:58:56AM +0100, Daniel J Blueman wrote:
> > > I'm seeing a similar issue [2] to what was recently reported [1] by
> > > Alexander, but with another workload involving XFS and memory
> > > pressure.

[....]

> > You may as well ignore anything involving this path in XFS until
> > lockdep gets fixed. The kswapd reclaim path is inverted over the
> > synchronous reclaim path that is xfs_ilock -> run out of memory ->
> > prune_icache and then potentially another -> xfs_ilock.
> >
>
> In that case, have you any theory as to why this circular dependency is
> being reported now but wasn't before 2.6.26-rc1? I'm beginning to wonder
> if the bisecting fingering the zonelist modification is just a
> co-incidence.

Probably co-incidence. Perhaps it's simply changed the way reclaim
is behaving and we are more likely to be trimming slab caches
instead of getting free pages from the page lists?

> Also, do you think the stalls were happening before but just not
> being noticed?

Entirely possible, I think, but I know of no evidence one way or
another.

[....]

> > FWIW, should page allocation in a page fault be allowed to recurse
> > into the filesystem? If I follow the spaghetti of inline and
> > compiler inlined functions correctly, this is a GFP_HIGHUSER_MOVABLE
> > allocation, right? Should we be allowing shrink_icache_memory()
> > to be called at all in the page fault path?
>
> Well, the page fault path is able to go to sleep and can enter direct
> reclaim under low memory situations. Right now, I'm failing to see why a
> page fault should not be allowed to reclaim pages in use by a
> filesystem. It was allowed before so the question still is why the
> circular lock warning appears now but didn't before.

Yeah, it's the fact that this is the first time that this lockdep
warning has come up that prompted me to ask the question. I know
that we are not allowed to lock an inode in the fault path as that
can lead to deadlocks in the read and write paths, so what I was
really wondering is if we can deadlock in a more convoluted manner
by taking locks on *other inodes* in the page fault path....

Cheers,

Dave.
--
Dave Chinner
david@...
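
Editorial sketch (not part of the original message): whether an allocation may recurse into the filesystem is carried in its GFP mask. As far as I understand kernels of this era, GFP_HIGHUSER_MOVABLE includes __GFP_FS, and shrink_icache_memory() declines to prune when that bit is absent, so the page fault allocation in the trace is allowed to reach prune_icache. The model below illustrates that check only. The bit values are made up for the example and do not match include/linux/gfp.h, and `may_prune_icache` is a hypothetical helper, not a kernel function.

```python
# Illustrative bit values only; the real definitions live in include/linux/gfp.h.
GFP_WAIT_BIT = 0x01      # stands in for __GFP_WAIT (caller may sleep)
GFP_IO_BIT = 0x02        # stands in for __GFP_IO (may start physical I/O)
GFP_FS_BIT = 0x04        # stands in for __GFP_FS (may call into the filesystem)
GFP_HARDWALL_BIT = 0x08  # stands in for __GFP_HARDWALL
GFP_HIGHMEM_BIT = 0x10   # stands in for __GFP_HIGHMEM
GFP_MOVABLE_BIT = 0x20   # stands in for __GFP_MOVABLE

# Composite masks, mirroring how the kernel builds them from individual bits.
GFP_NOFS = GFP_WAIT_BIT | GFP_IO_BIT
GFP_USER = GFP_WAIT_BIT | GFP_IO_BIT | GFP_FS_BIT | GFP_HARDWALL_BIT
GFP_HIGHUSER_MOVABLE = GFP_USER | GFP_HIGHMEM_BIT | GFP_MOVABLE_BIT


def may_prune_icache(gfp_mask):
    """Hypothetical helper: True if reclaim under this mask may enter the FS."""
    return bool(gfp_mask & GFP_FS_BIT)


print(may_prune_icache(GFP_HIGHUSER_MOVABLE))  # page fault allocation in the trace
print(may_prune_icache(GFP_NOFS))              # allocation made under filesystem locks
```

Under this model, keeping shrink_icache_memory() out of the fault path would mean allocating the faulting page without the FS bit, which is the trade-off Mel and Dave are weighing.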
Re: [2.6.26-rc7] shrink_icache from pagefault locking (nee: nfsd hangs for a few sec)...
On Mon, Jun 23, 2008 at 01:24:15AM +0100, Mel Gorman wrote:
> In that case, have you any theory as to why this circular dependency is
> being reported now but wasn't before 2.6.26-rc1? I'm beginning to wonder
> if the bisecting fingering the zonelist modification is just a
> co-incidence.

I've seen these traces since lockdep was added when running xfsqa.
Re: [2.6.26-rc7] shrink_icache from pagefault locking (nee: nfsd hangs for a few sec)...
On (23/06/08 03:22), Christoph Hellwig didst pronounce:
> On Mon, Jun 23, 2008 at 01:24:15AM +0100, Mel Gorman wrote:
> > In that case, have you any theory as to why this circular dependency is
> > being reported now but wasn't before 2.6.26-rc1? I'm beginning to wonder
> > if the bisecting fingering the zonelist modification is just a
> > co-incidence.
>
> I've seen these traces since lockdep was added when running xfsqa.
>

Oh right, so this isn't even 2.6.26-rc1 as such. It's an older problem that
seems to be happening in more cases now, right? At this point, I believe the
bisection fingering the zonelist modification was a co-incidence as reclaim
behaviour at least is equivalent although catching the memory leak early was
a lucky positive outcome.

It's still not clear why the circular warning is happening more regularly now
but it's "something else". Considering the number of changes made to NFS, XFS,
reclaim and other areas since, I'm not sure how to go about finding the real
underlying problem or if it can be dealt with in a trivial manner.

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab