[ ssic-linux-Bugs-1811510 ] deadlock on loop mounted fs

Bugs item #1811510, was opened at 2007-10-11 08:22
Message generated for change (Settings changed) made by rogertsang
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=405834&aid=1811510&group_id=32541

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: Filesystem
Group: default
Status: Open
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: John Hughes (hughesj)
Assigned to: Roger Tsang (rogertsang)
Summary: deadlock on loop mounted fs

Initial Comment:
1. Make a sparse file

   perl -e 'open BIGFILE, ">BIGFILE"; seek BIGFILE, 1024 * 1024 * 1024, 0; print BIGFILE "big"'

2. make a filesystem on it

   losetup /dev/loop/0 BIGFILE
   mkfs -t ext3 /dev/loop/0

3. mount it

   mount -t ext3 /dev/loop/0 /mnt

4. write a lot of files to it

   cd /mnt
   dump 0f - / | restore rf -

Eventually the node where we are writing to the loopback mounted fs gets
deadlocked. It's still up as far as the cluster is concerned, but any
attempt to start a process on it blocks.

----------------------------------------------------------------------

Comment By: Roger Tsang (rogertsang)
Date: 2009-03-24 01:06

Message:
checked-in but needs verification

----------------------------------------------------------------------

Comment By: Roger Tsang (rogertsang)
Date: 2009-03-22 17:35

Message:
Testing new code to associate separate BDI per mount. This should allow us
to support recursively stacked CFS.

----------------------------------------------------------------------

Comment By: Roger Tsang (rogertsang)
Date: 2008-10-10 10:52

Message:
checked-in

----------------------------------------------------------------------

Comment By: Roger Tsang (rogertsang)
Date: 2008-06-19 02:32

Message:
Logged In: YES
user_id=1246761
Originator: NO

Try the attached patch. More work would need to be done to pass a flag to
kernel space for CFS to use a different congestion bit in the case of CFS
on loopback.
However the proposed solution only works if you are not going to CFS mount
another loopback on top of a CFS mount on loopback on CFS. So the simple
fix would be this patch. Loopback becomes a standard mount.

File Added: util-linux.1811510.patch

----------------------------------------------------------------------

Comment By: Roger Tsang (rogertsang)
Date: 2008-03-16 20:35

Message:
Logged In: YES
user_id=1246761
Originator: NO

Should be fixed in 2.0.0pre3...

----------------------------------------------------------------------

Comment By: Roger Tsang (rogertsang)
Date: 2007-10-20 21:58

Message:
Logged In: YES
user_id=1246761
Originator: NO

It looks like CFS ran out of memory. Try the latest checkin of
kernel/cluster/ssi/cfs code that re-enables commit for soft mounts.

----------------------------------------------------------------------

Comment By: Roger Tsang (rogertsang)
Date: 2007-10-20 14:42

Message:
Logged In: YES
user_id=1246761
Originator: NO

Does 2.6.10-ssi run into this bug?

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2007-10-16 10:33

Message:
Logged In: NO

Still looks the same as the old bug... This time it is stacked
generic_file_writev().

cfs_async (has i_sem)
loop0
pdflush
kjournald
cfs_async (waiting for i_sem)

----------------------------------------------------------------------

Comment By: John Hughes (hughesj)
Date: 2007-10-12 07:36

Message:
Logged In: YES
user_id=166336
Originator: YES

Here's some debugging. I've got to the point where the "restore" process
on node 1 seems hung. On node 2 I try an "onnode 1 pwd". It hangs.
On node 1:

Entering kdb (current=0xc0502bc0, pid 0) on processor 0 due to Keyboard Entry
[0]kdb> ps
1 idle process (state I) and 50 sleeping system daemon (state M) processes suppressed
Task Addr  Pid    Parent [*] cpu State Thread     Command
0xcf82a5d0 5      2      0   0   R     0xcf82a7b0 events/0
0xcf68b990 117    11     0   0   D     0xcf68bb70 pdflush
0xcf68a310 121    2      0   0   D     0xcf68a4f0 cfs_async
0xcf6b99b0 122    2      0   0   D     0xcf6b9b90 cfs_async
0xcf6b9410 123    2      0   0   D     0xcf6b95f0 cfs_async
0xcf6b8e70 124    2      0   0   D     0xcf6b9050 cfs_async
0xcf6b88d0 125    2      0   0   D     0xcf6b8ab0 cfs_async
0xcf6b8330 126    2      0   0   D     0xcf6b8510 cfs_async
0xcf6c99d0 127    2      0   0   D     0xcf6c9bb0 cfs_async
0xcf6c9430 128    2      0   0   D     0xcf6c9610 cfs_async
0xce92b730 1      0      0   0   D     0xce92b910 init
[...]
0xce90b150 67763  2      0   0   D     0xce90b330 loop0
0xce90d170 67820  2      0   0   D     0xce90d350 kjournald
0xce8f96d0 67822  67636  0   0   S     0xce8f98b0 dump
0xce8f8b90 67823  67636  0   0   D     0xce8f8d70 restore
0xcf13f970 67824  67822  0   0   S     0xcf13fb50 dump
0xcf13f3d0 67825  67824  0   0   S     0xcf13f5b0 dump
0xcf7861f0 67826  67824  0   0   S     0xcf7863d0 dump
0xcf786790 67827  67824  0   0   S     0xcf786970 dump
0xcf47d9b0 132773 2      0   0   D     0xcf47db90 onnode
[0]kdb> btp 132773
Stack traceback for pid 132773
0xcf47d9b0 132773 2      0   0   D     0xcf47db90 onnode
EBP        EIP        Function (args)
0xce879ba8 0xc046c2e6 schedule+0x3a6 (0xce879c10)
0xce879bb4 0xc046d348 io_schedule+0x28 (0xc1271c70)
0xce879bc0 0xc014aed5 sync_page+0x45 (0xc10c37f8, 0x0, 0xc014ae90, 0xcf47d9b0, 0xce879c10)
0xce879be0 0xc046d6fe __wait_on_bit_lock+0x5e (0x2, 0xc10c37f8, 0xc10c37f8, 0x0, 0x0)
0xce879c3c 0xc014b744 __lock_page+0x84 (0xc049efb5, 0xa7, 0xce7c31a0, 0x0, 0x1)
0xce879cc4 0xc014beeb do_generic_mapping_read+0x3db (0xce88ca00, 0xce7c31f0, 0xce7c31a0, 0xce879e00, 0xce879d00)
0xce879d1c 0xc014c3ed __generic_file_aio_read+0x1ed (0xce879dc4, 0xce879d34, 0x1, 0xce879e00, 0xcf06d600)
0xce879d48 0xc014c473 generic_file_aio_read+0x53 (0xce879dc4, 0xcf06d600, 0x80, 0x0, 0x0)
0xce879d84 0xc028375a __cfs_file_read+0xaa (0xce879dc4, 0x0, 0xcf06d600, 0x80, 0xce879da0)
0xce879da8 0xc0283828 cfs_file_aio_read+0x38 (0xce879dc4, 0xcf06d600, 0x80, 0x0, 0x0)
0xce879e50 0xc016c3b3 do_sync_read+0xa3 (0xce7c31a0, 0xcf06d600, 0x80, 0xce879e8c, 0xce879000)
0xce879e74 0xc016c490 vfs_read+0xb0 (0xce7c31a0, 0xcf06d600, 0x80, 0xce879e8c, 0x0)
0xce879e9c 0xc017895a kernel_read+0x4a (0xce7c31a0, 0x0, 0xcf06d600, 0x80, 0xcf06d600)
0xce879ec0 0xc017946a prepare_binprm+0xca (0xcf06d600, 0x7fff, 0xc13b4080, 0x0, 0x0)
0xce879eec 0xc0179a16 ssi_do_execve+0x1a6 (0xcf012920, 0xce6f8800, 0xcf6aa400, 0xce879fa0, 0x0)
0xce879f78 0xc0245c3a rexecve_server+0xea (0xcf50e000, 0xcf47d9b0, 0xcf012920, 0xce6f8800, 0xcf6aa400)
0xce879fec 0xc02454f5 rexecve_server_setup+0x55
           0xc01023a5 kernel_thread_helper+0x5
[0]kdb> btp 67823
Stack traceback for pid 67823
0xce8f8b90 67823  67636  0   0   D     0xce8f8d70 restore
EBP        EIP        Function (args)
0xca2fcea0 0xc046c2e6 schedule+0x3a6 (0x0, 0xce8f8b90, 0xc013f0a0, 0xca2fced4, 0xca2fced4)
0xca2fcef4 0xc029f3ba cfs_wait_on_request+0x7a (0xc9a8c200, 0xca2fcf14, 0x0, 0x1, 0x0)
0xca2fcf24 0xc0285a9e cfs_wait_on_requests+0x8e (0xccb63be4, 0x0, 0x0, 0x0, 0xce7c3600)
0xca2fcf48 0xc0286f66 cfs_sync_inode+0x76 (0xccb63be4, 0x0, 0x0, 0x2, 0x0)
0xca2fcf80 0xc0283653 cfs_file_flush+0x93 (0xce7c3600, 0x81a4, 0xccdef200, 0x5, 0xccdef204)
0xca2fcf9c 0xc016bb3c filp_close+0x6c (0xce7c3600, 0xccdef200, 0xce7c3600, 0x5, 0x0)
0xca2fcfbc 0xc016bbce sys_close+0x6e
           0xc0105a3b syscall_call+0x7
[0]kdb>
[0]kdb> btp 67763
Stack traceback for pid 67763
0xce90b150 67763  2      0   0   D     0xce90b330 loop0
EBP        EIP        Function (args)
0xca488db8 0xc046c2e6 schedule+0x3a6 (0xca488e20)
0xca488dc4 0xc046d348 io_schedule+0x28 (0xc12711e0)
0xca488dd0 0xc014aed5 sync_page+0x45 (0xc11d6be0, 0x0, 0xc014ae90, 0xce90b150, 0xca488e20)
0xca488df0 0xc046d6fe __wait_on_bit_lock+0x5e (0x2, 0xc11d6be0, 0xc11d6be0, 0x0, 0x0)
0xca488e4c 0xc014b744 __lock_page+0x84 (0xc049efb5, 0xa7, 0xcd6ca600, 0x38002, 0x1)
0xca488ed4 0xc014beeb do_generic_mapping_read+0x3db (0xcb632f40, 0xcd6ca650, 0xcd6ca600, 0xca488f58, 0xca488ef4)
0xca488f04 0xc014c61b generic_file_sendfile+0x5b (0xcd6ca600, 0xca488f58, 0x1000, 0xd08f15d0, 0xca488f60)
0xca488f3c 0xc02838bd cfs_file_sendfile+0x8d (0xcd6ca600, 0xca488f58, 0x1000, 0xd08f15d0, 0xca488f60)
0xca488f74 0xd08f16fc [loop]do_lo_receive+0x5c (0xc9353000, 0xc4279630, 0x1000, 0x38002000, 0x0)
0xca488fa4 0xd08f176e [loop]lo_receive+0x5e (0xc9353000, 0xc1ed33e0, 0x1000, 0x38002000, 0x0)
0xca488fc8 0xd08f17eb [loop]do_bio_filebacked+0x4b (0xc9353000, 0xc1ed33e0, 0x0, 0xc9353138, 0xd08f1a60)
0xca488fec 0xd08f1b3b [loop]loop_thread+0xdb
           0xc01023a5 kernel_thread_helper+0x5

----------------------------------------------------------------------

Comment By: Roger Tsang (rogertsang)
Date: 2007-10-11 22:21

Message:
Logged In: YES
user_id=1246761
Originator: NO

Sounds like [ 686748 ] Filesystem stacking deadlock.

----------------------------------------------------------------------

Comment By: John Hughes (hughesj)
Date: 2007-10-11 08:22

Message:
Logged In: YES
user_id=166336
Originator: YES

This is with the 2.6.11 kernel

----------------------------------------------------------------------

_______________________________________________
ssic-linux-devel mailing list
ssic-linux-devel@...
https://lists.sourceforge.net/lists/listinfo/ssic-linux-devel
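
For anyone retrying the reproduction from the initial comment, here is a minimal sketch of step 1 in plain shell. It is not part of the original report. It assumes a `dd` that seeks rather than writes the gap, and the scratch directory and variable names are made up for illustration. Steps 2 to 4 need root and may hang the node, so the script only prints them, using the same /dev/loop/0 device path the report used.

```shell
#!/bin/sh
# Sketch only: builds the sparse backing file without root.
# workdir and img are hypothetical names, not from the report.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
img="$workdir/BIGFILE"

# Seek 1 GiB into the file and write 3 bytes, like the perl one-liner.
printf 'big' | dd of="$img" bs=1 seek=1073741824 2>/dev/null

# Apparent size should be 1 GiB plus the 3 bytes written.
apparent=$(wc -c < "$img")
echo "apparent size: $apparent bytes"

# Steps 2-4 from the report, printed rather than executed.
cat <<EOF
losetup /dev/loop/0 $img
mkfs -t ext3 /dev/loop/0
mount -t ext3 /dev/loop/0 /mnt
cd /mnt && dump 0f - / | restore rf -
EOF
```

Printing the root-only steps keeps the sketch safe to run on a production node; copy them by hand onto a test node where a hang is acceptable.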