[Ocfs2-devel] [PATCH] remove lockres from purge list when we are getting it for creating lock

Thu Jun 9 18:01:38 PDT 2011

On 11-06-09 10:53, Sunil Mushran wrote:
> On 06/08/2011 03:04 AM, Wengang Wang wrote:
> >When we are to purge a lockres, we check if it's unused.
> >the check includes
> >1. no locks attached to the lockres.
> >2. not dirty.
> >3. not in recovery.
> >4. no interested nodes in refmap.
> >If the the above four are satisfied, we are going to purge it(remove it
> >from hash table).
> >
> >While, when a "create lock" is in progress especially when lockres is owned
> >remotely(spinlocks are dropped when networking), the lockres can satisfy above
> >four condition. If it's put to purge list, it can be purged out from hash table
> >when we are still accessing on it(sending request to owner for example). That's
> >not what we want.
> >
> >I have met such a problem (orabug 12405575).
> >The lockres is in the purge list already(there is a delay for real purge work)
> >and the create lock request comes. When we are sending network message to the
> >owner in dlmlock_remote(), the owner crashed. So we get DLM_RECOVERING as return
> >value and retries dlmlock_remote(). And before the owner crash, we have purged
> >the lockres. So the lockres become stale(on lockres->onwer). Thus the code calls
> >dlmlock_remote() infinitely.
> >
> >fix:
> >we remove the lockres from purge list if it's there in dlm_get_lock_resource()
> >which is called for only createlock case. So that the lockres can't be purged
> >when we are in progress of createlock.
> >
> >Signed-off-by: Wengang Wang<wen.gang.wang at oracle.com>
> >---
> >  fs/ocfs2/dlm/dlmmaster.c |   41 +++++++++++++++++++++++++++++------------
> >  1 files changed, 29 insertions(+), 12 deletions(-)
> >
> >diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
> >index 11eefb8..511d43c 100644
> >--- a/fs/ocfs2/dlm/dlmmaster.c
> >+++ b/fs/ocfs2/dlm/dlmmaster.c
> >@@ -709,28 +709,27 @@ lookup:
> >  	spin_lock(&dlm->spinlock);
> >  	tmpres = __dlm_lookup_lockres_full(dlm, lockid, namelen, hash);
> >  	if (tmpres) {
> >-		int dropping_ref = 0;
> >-
> >-		spin_unlock(&dlm->spinlock);
> >-
> >  		spin_lock(&tmpres->spinlock);
> >  		/* We wait for the other thread that is mastering the resource */
> >  		if (tmpres->owner == DLM_LOCK_RES_OWNER_UNKNOWN) {
> >+			spin_unlock(&dlm->spinlock);
> >  			__dlm_wait_on_lockres(tmpres);
> >  			BUG_ON(tmpres->owner == DLM_LOCK_RES_OWNER_UNKNOWN);
> >+			spin_unlock(&tmpres->spinlock);
> >+			dlm_lockres_put(tmpres);
> >+			tmpres = NULL;
> >+			goto lookup;
> >  		}
> >
> >  		if (tmpres->owner == dlm->node_num) {
> >  			BUG_ON(tmpres->state&  DLM_LOCK_RES_DROPPING_REF);
> >  			dlm_lockres_grab_inflight_ref(dlm, tmpres);
> >-		} else if (tmpres->state&  DLM_LOCK_RES_DROPPING_REF)
> >-			dropping_ref = 1;
> >-		spin_unlock(&tmpres->spinlock);
> >-
> >-		/* wait until done messaging the master, drop our ref to allow
> >-		 * the lockres to be purged, start over. */
> >-		if (dropping_ref) {
> >-			spin_lock(&tmpres->spinlock);
> >+		} else if (tmpres->state&  DLM_LOCK_RES_DROPPING_REF) {
> >+			/*
> >+			 * wait until done messaging the master, drop our ref to
> >+			 * allow the lockres to be purged, start over.
> >+			 */
> >+			spin_unlock(&dlm->spinlock);
> >  			__dlm_wait_on_lockres_flags(tmpres, DLM_LOCK_RES_DROPPING_REF);
> >  			spin_unlock(&tmpres->spinlock);
> >  			dlm_lockres_put(tmpres);
> >@@ -739,6 +738,24 @@ lookup:
> >  		}
> >
> >  		mlog(0, "found in hash!\n");
> >+		/*
> >+		 * we are going to do a create-lock next. so remove the lockres
> >+		 * from purge list to avoid the case that we will access on the
> >+		 * lockres which is already purged out from hash table in
> >+		 * dlm_run_purge_list() path.
> >+		 * otherwise, we could run into a problem:
> >+		 * the owner dead(recovery didn't take care of this lockres
> >+		 * since it's not in hashtable), and the code keeps sending
> >+		 * request to the dead node and getting DLM_RECOVERING and
> >+		 * then retrying infinitely.
> >+		 */
> >+		if (!list_empty(&tmpres->purge)) {
> >+			list_del_init(&tmpres->purge);
> >+			dlm_lockres_put(tmpres);
> >+		}
> >+		
> >+		spin_unlock(&tmpres->spinlock);
> >+		spin_unlock(&dlm->spinlock);
> >  		if (res)
> >  			dlm_lockres_put(res);
> >  		res = tmpres;
> 
> In short, you are holding onto the dlm->spinlock a bit longer and forcibly
> removing the lockres from the purgelist.

Yes. that's what the patch does.
> 
> I have two problems with this patch.
> 
> Firstly it ignores the fact that the resource can be added to the purgelist
> right after we drop the dlm->spinlock. There is nothing to protect against
> that. And I would think that is the more likely case. I had asked to you explore
> inflight_locks for that reason. Did you explore that option? Currently it is used
> for remote lock creates. That's why I suggested we use it for local creates too.

Yes, you are right. There is such a problem that the lockres can be
added to purge list after we drop dlm->spinlock. I am not sure if it's the more
likely case since there is a 8 seconds delay between the the time the lockres is
added to purge list and the time it is purged. I guess in normal case, the create-
lock should already finished in the 8 seconds.

I considered about the inflight_locks. But I didn't think it well so far.
Because simply moving the lockres out from purge list is not good
enough, I will take a good think about making use of inflight_locks. 

> Secondly, we currently manipulate the purgelist in one function only.
> __dlm_calc_lockres_usage(). We should stick to that.

If we make good use of inflight_locks, I think we can.
> 
> BTW, how are you testing this?

I tested with the steps which can cause the original problem to prove it
works.(but without UT).
Also I made a regression test with the existing ocfs2-test suites(the
multiple-xxx with reducing interations).

> I would think this issue will be more of an issue for userdlm (ovm). Not the fs.

Yes, agree.

thanks,
wengang.