[Ocfs2-devel] [PATCH] ocfs2/dlm: remove potential deadlock

Wengang Wang wen.gang.wang at oracle.com
Mon Jun 21 06:44:44 PDT 2010


On 10-06-21 15:33, Tao Ma wrote:
> hi wengang,
> 
> On 06/21/2010 01:31 PM, Wengang Wang wrote:
> >Why atomic operations on dlm_refs need spinlock's protect?
> >
> >         /* NOTE: Next three are protected by dlm_domain_lock */
> >         struct kref dlm_refs;
> >         enum dlm_ctxt_state dlm_state;
> >         unsigned int num_joins;
> I don't dive into the code, so if there is something wrong, please
> correct me.
> AFAICS, kref_put isn't an atomic operation, it need to call release
> to free the container of kref.

Yes, it's bad :-(.

regards,
wengang.
> 
> Regards,
> Tao
> >
> >regards,
> >wengang.
> >
> >On 10-06-16 14:52, Wengang Wang wrote:
> >>When we need to take both dlm_domain_lock and dlm->spinlock, we should take
> >>them in order of:
> >>dlm_domain_lock then dlm->spinlock.
> >>
> >>There is pathes disobey this order. That is calling dlm_lockres_put with dlm->
> >>spinlock held. dlm_lockres_put() finally calls dlm_put() which take
> >>dlm_domain_lock.
> >>
> >>The fix is moving the locking on dlm_domain_lock to dlm_ctxt_release() from
> >>dlm_put(). dlm_ctxt_release() is only called on the release of the last
> >>reference. Any path should not be holding dlm->spinlock when dropping the "last"
> >>reference.
> >>
> >>Signed-off-by: Wengang Wang<wen.gang.wang at oracle.com>
> >>---
> >>  fs/ocfs2/dlm/dlmdomain.c |   12 ++++--------
> >>  1 files changed, 4 insertions(+), 8 deletions(-)
> >>
> >>diff --git a/fs/ocfs2/dlm/dlmdomain.c b/fs/ocfs2/dlm/dlmdomain.c
> >>index ab82add..754baf2 100644
> >>--- a/fs/ocfs2/dlm/dlmdomain.c
> >>+++ b/fs/ocfs2/dlm/dlmdomain.c
> >>@@ -321,28 +321,24 @@ static void dlm_ctxt_release(struct kref *kref)
> >>
> >>  	dlm = container_of(kref, struct dlm_ctxt, dlm_refs);
> >>
> >>+	if (spin_is_locked(&dlm->spinlock))
> >>+		BUG();
> >>  	BUG_ON(dlm->num_joins);
> >>  	BUG_ON(dlm->dlm_state == DLM_CTXT_JOINED);
> >>
> >>+	spin_lock(&dlm_domain_lock);
> >>  	/* we may still be in the list if we hit an error during join. */
> >>  	list_del_init(&dlm->list);
> >>-
> >>  	spin_unlock(&dlm_domain_lock);
> >>
> >>-	mlog(0, "freeing memory from domain %s\n", dlm->name);
> >>-
> >>  	wake_up(&dlm_domain_events);
> >>-
> >>+	mlog(0, "freeing memory from domain %s\n", dlm->name);
> >>  	dlm_free_ctxt_mem(dlm);
> >>-
> >>-	spin_lock(&dlm_domain_lock);
> >>  }
> >>
> >>  void dlm_put(struct dlm_ctxt *dlm)
> >>  {
> >>-	spin_lock(&dlm_domain_lock);
> >>  	kref_put(&dlm->dlm_refs, dlm_ctxt_release);
> >>-	spin_unlock(&dlm_domain_lock);
> >>  }
> >>
> >>  static void __dlm_get(struct dlm_ctxt *dlm)
> >>--
> >>1.6.6.1
> >>
> >>
> >>_______________________________________________
> >>Ocfs2-devel mailing list
> >>Ocfs2-devel at oss.oracle.com
> >>http://oss.oracle.com/mailman/listinfo/ocfs2-devel
> >
> >_______________________________________________
> >Ocfs2-devel mailing list
> >Ocfs2-devel at oss.oracle.com
> >http://oss.oracle.com/mailman/listinfo/ocfs2-devel



More information about the Ocfs2-devel mailing list