[Ocfs2-devel] [PATCH] ocfs2: fix __ocfs2_cluster_lock() dead lock

Joel Becker Joel.Becker at oracle.com
Mon Jan 11 17:59:46 PST 2010


[Cc'd Mark and Dave]

On Sun, Jan 10, 2010 at 02:05:22AM +0800, Wengang Wang wrote:
> > 2) It sure looks like you're right.
> well, this doesn't sounds like a sob :)

	With good reason, too.  You've got a bug.  Dave Teigland (copied
now) found it while testing with fs/dlm.
	The short description is that you didn't notice the handling of
async -EAGAIN.  Not really your fault, there are no obvious comments.
With o2dlm, you get -EAGAIN from the call to ocfs2_dlm_lock().  That's
safe with your patch.  However, with fs/dlm, EAGAIN comes
asynchronously.  ocfs2_dlm_lock() will return 0.  When the ast fires,
the ast notices the -EAGAIN and clears the locking state for a retry.
In your patch, however, you take the 'lock_done' + !BUSY and immediately
increment the holders - even though the lock hasn't been acquired!
	The fix is pretty simple, though - when lock_done is set, you
know we've at least tried to upconvert.  If the locking level is good,
we know the upconvert succeeded.  We can then go to inc_holders.  I've
attached a new version of your patch that does this.
	I've also modified the patch to reflect that the original
problem you found is actually a livelock, not a deadlock.  And I added a
little comment about the async -EAGAIN.
	I've attached the full patch with my changes.  Dave, please test
my version (the attached one) instead of Wengang's.  Mark, can you do a
sanity check here?
	Here's the diff between your patch and mine:

diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c
index 2788b8e..eabe53e 100644
--- a/fs/ocfs2/dlmglue.c
+++ b/fs/ocfs2/dlmglue.c
@@ -1313,7 +1313,7 @@ static int __ocfs2_cluster_lock(struct ocfs2_super *osb,
 	unsigned long flags;
 	unsigned int gen;
 	int noqueue_attempted = 0;
-	int lock_done = 0;
+	int lock_attempted = 0;
 
 	mlog_entry_void();
 
@@ -1348,18 +1348,25 @@ again:
 		goto unlock;
 	}
 
-	if (lock_done) {
-		/* cluster lock completed already, don't wait for down-
-		 * convertion to finish.
-		 * if we do, in case another DC happen to be there(not on behalf
-		 * of this lock request), and ocfs2_may_continue_on_blocked_lock()
-		 * returns false(in case we want an EX), we go into deadlock:
-		 * DC is waiting for our release of the EX(check_downconvert())
-		 * and we are waiting for that DC.
+	if (lock_attempted && (lockres->l_level >= level)) {
+		/*
+		 * We've attempted to upconvert, and the lock now has
+		 * a level we can work with.  If we fell through to the
+		 * next checks, we could spin in an upconvert->downconvert
+		 * cycle as other nodes pounded us.  Instead, we jump
+		 * out and let the caller do some work.  If a downconvert
+		 * has come in, it will do its thing as soon as the caller
+		 * is done.
 		 */
 		goto update_holders;
 	}
 
+	/*
+	 * This is either the first pass or our level wasn't good enough.
+	 * Treat both cases as a first pass.
+	 */
+	lock_attempted = 0;
+
 	if (lockres->l_flags & OCFS2_LOCK_BLOCKED &&
 	    !ocfs2_may_continue_on_blocked_lock(lockres, level)) {
 		/* is the lock is currently blocked on behalf of
@@ -1370,6 +1377,13 @@ again:
 	}
 
 	if (level > lockres->l_level) {
+		/*
+		 * If the dlm notifies us of -EAGAIN asynchronously, it
+		 * will have returned 0 from the call to ocfs2_dlm_lock()
+		 * and given us -EAGAIN in the ast.  The ast will clear
+		 * out our state.  We notice that here and return -EAGAIN
+		 * to the caller.
+		 */
 		if (noqueue_attempted > 0) {
 			ret = -EAGAIN;
 			goto unlock;
@@ -1427,7 +1441,7 @@ again:
 		catch_signals = 0;
 
 		/* wait for busy to clear and carry on */
-		lock_done = 1;
+		lock_attempted = 1;
 		goto again;
 	}
 
Joel

-- 

"If you are ever in doubt as to whether or not to kiss a pretty girl, 
 give her the benefit of the doubt"
                                        -Thomas Carlyle

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-ocfs2-fix-__ocfs2_cluster_lock-livelock.patch
Type: text/x-diff
Size: 3646 bytes
Desc: not available
Url : http://oss.oracle.com/pipermail/ocfs2-devel/attachments/20100111/1264b243/attachment.bin 


More information about the Ocfs2-devel mailing list