[ ssic-linux-Bugs-1925545 ] Processes stuck in I/O after transparent CFS failover

by SourceForge.net

Bugs item #1925545, was opened at 2008-03-25 17:35
Message generated for change (Settings changed) made by rogertsang
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=405834&aid=1925545&group_id=32541

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Filesystem
Group: v1.9.1
Status: Open
>Resolution: Fixed
Priority: 5
>Private: No
Submitted By: Roger Tsang (rogertsang)
Assigned to: Roger Tsang (rogertsang)
Summary: Processes stuck in I/O after transparent CFS failover

Initial Comment:
Processes or threads doing I/O during a transparent CFS failover get stuck waiting in I/O and cannot be interrupted.  If they are part of a thread group, zombies can also appear.

How to reproduce:  Start reading a large file on a CFS hard mount on the surviving OpenSSI node and force CFS to fail over transparently.  The application on the surviving OpenSSI node is expected to continue uninterrupted and finish reading the entire file, but instead it gets stuck waiting in I/O.
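
A minimal sketch of a reader for the reproduction step, not the program originally used.  It reads a file sequentially and prints progress to stderr, so a stall during failover shows up as the progress lines stopping.  The function name `read_all`, the 1 MiB chunk size and the reporting interval are assumptions chosen for illustration; forcing the failover itself is done separately and is not shown.

```python
import sys
import time

CHUNK = 1 << 20  # read 1 MiB per call (arbitrary choice)


def read_all(path, chunk=CHUNK, report_every=64):
    """Read the file at `path` sequentially and return the total bytes read.

    Progress is written to stderr every `report_every` chunks so that a
    reader blocked in I/O is easy to spot: the progress lines stop.
    """
    total = 0
    chunks = 0
    start = time.monotonic()
    # buffering=0 gives a raw file object, so each read() is one read call.
    with open(path, "rb", buffering=0) as f:
        while True:
            data = f.read(chunk)
            if not data:  # empty bytes means end of file
                break
            total += len(data)
            chunks += 1
            if chunks % report_every == 0:
                elapsed = time.monotonic() - start
                print(f"{total} bytes after {elapsed:.1f}s",
                      file=sys.stderr, flush=True)
    return total
```

Call `read_all()` with the path of a large file on the CFS hard mount (the path is site-specific), then force the failover while it is running.  On an affected release the call would be expected never to return.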

----------------------------------------------------------------------

>Comment By: Roger Tsang (rogertsang)
Date: 2009-10-26 23:57

Message:
Checked in the final fix.

----------------------------------------------------------------------

Comment By: Roger Tsang (rogertsang)
Date: 2008-07-03 23:21

Message:
Logged In: YES
user_id=1246761
Originator: YES

Fixed, but not yet in the code repository.

----------------------------------------------------------------------

Comment By: Roger Tsang (rogertsang)
Date: 2008-03-25 18:04

Message:
Logged In: YES
user_id=1246761
Originator: YES

NB: This bug might not manifest in the earlier 1.9.x releases because the
CFS rebuild thread was RT (real-time) until around 1.9.3.

----------------------------------------------------------------------

Comment By: Roger Tsang (rogertsang)
Date: 2008-03-25 17:53

Message:
Logged In: YES
user_id=1246761
Originator: YES

This is a race between the CFS superblock rebuild flush and the CFS async
code: down requests that are being flushed are asynchronously pushed back
onto the list of down requests.  These requests hold the page lock.
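
The interleaving described above can be illustrated with a toy model.  This is a sketch under stated assumptions, not the OpenSSI code and not the fix: `Request`, `DownList`, `flush`, `requeue_page_2` and `demo` are all hypothetical names, and the async path is simulated by a callback rather than by a real thread.

```python
from collections import deque


class Request:
    """Hypothetical stand-in for a down request that holds a page lock."""

    def __init__(self, page):
        self.page = page
        self.page_locked = True  # lock is held while the request is queued


class DownList:
    """Toy model of the list of down requests."""

    def __init__(self):
        self.pending = deque()

    def push(self, req):
        self.pending.append(req)

    def flush(self, on_flushing=None):
        """Flush the requests that were queued when the flush started.

        `on_flushing` simulates async code running while a request is
        being flushed; if it returns True it has pushed the request back.
        """
        batch = list(self.pending)
        self.pending.clear()
        completed = []
        for req in batch:
            if on_flushing is not None and on_flushing(self, req):
                continue  # pushed back: the page lock is never released
            req.page_locked = False  # normal completion unlocks the page
            completed.append(req)
        return completed


def requeue_page_2(down_list, req):
    """Simulated async path: push the request for page 2 back on the list."""
    if req.page == 2:
        down_list.push(req)
        return True
    return False


def demo():
    dl = DownList()
    for page in (1, 2, 3):
        dl.push(Request(page))
    done = [r.page for r in dl.flush(on_flushing=requeue_page_2)]
    stuck = [r.page for r in dl.pending if r.page_locked]
    return done, stuck
```

In this model `demo()` returns `([1, 3], [2])`: the flush finishes, but the request for page 2 is back on the list with its page still locked, and nothing is scheduled to flush it again.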

Fixed in 2.0.0pre3.

----------------------------------------------------------------------

_______________________________________________
ssic-linux-devel mailing list
ssic-linux-devel@...
https://lists.sourceforge.net/lists/listinfo/ssic-linux-devel