[Ocfs2-devel] [PATCH] NFS hangs in __ocfs2_cluster_lock due to race with ocfs2_unblock_lock
Mark Fasheh
mfasheh at suse.de
Tue Dec 22 12:43:58 PST 2015
Reviewed-by: Mark Fasheh <mfasheh at suse.de>
On Tue, Dec 22, 2015 at 12:14:22PM -0800, Tariq Saeed wrote:
> Hi,
> Looks like this fell through the cracks. This is a very real bug
> encountered by Luminex Software, and they tested the fix.
> Regards
> -Tariq
>
>
> -------- Forwarded Message --------
> Subject: [Ocfs2-devel] [PATCH] NFS hangs in __ocfs2_cluster_lock due to race
> with ocfs2_unblock_lock
> Date: Tue, 25 Aug 2015 13:55:30 -0700
> From: Tariq Saeed <tariq.x.saeed at oracle.com>
> To: ocfs2-devel at oss.oracle.com
> CC: mfasheh at suse.de
>
>
>
> Orabug: 20933419
>
> The setup is NFS on a 2-node ocfs2 cluster, with each node exporting a
> directory. The lock causing the hang is the global bitmap inode lock.
> Node 1 is the master and has the lock granted in PR mode; node 2 is on
> the converting list (PR -> EX). There are no holders of the lock on the
> master node, so it should downconvert to NL and grant EX to node 2, but
> that does not happen. BLOCKED + QUEUED are set in the lock res, and it
> is on the osb blocked list. Threads are waiting in __ocfs2_cluster_lock
> on BLOCKED. One thread wants EX, the rest want PR. So it is as though
> the downconvert thread needs to be kicked to complete the conversion.
>
> The hang is caused by an EX req coming into __ocfs2_cluster_lock on
> the heels of a PR req after it sets BUSY (drops l_lock, releasing EX
> thread), forcing the incoming EX to wait on BUSY without doing anything.
> PR has called ocfs2_dlm_lock, which sets the node 1 lock from NL ->
> PR, queues ast.
>
> At this time, an upconvert (PR -> EX) arrives from node 2, finds a conflict
> with the node 1 lock in PR, so the lock res is put on dlm thread's dirty list.
>
> After returning from ocfs2_dlm_lock, the PR thread now waits behind EX on
> BUSY till awoken by ast.
>
> Now it is dlm_thread that serially runs dlm_shuffle_lists, ast, bast,
> in that order. dlm_shuffle_lists queues a bast on behalf of node 2
> (which will be run by dlm_thread right after the ast). ast does its
> part, sets UPCONVERT_FINISHING, clears BUSY and wakes its waiters. Next,
> dlm_thread runs bast. It sets BLOCKED and kicks dc thread. dc thread
> runs ocfs2_unblock_lock, but since UPCONVERT_FINISHING is set, skips doing
> anything and requeues.
>
> Inside of __ocfs2_cluster_lock, since EX has been waiting on BUSY ahead
> of PR, it wakes up first, finds BLOCKED set and skips doing anything
> but clearing UPCONVERT_FINISHING (which was actually "meant" for the
> PR thread), and this time waits on BLOCKED. Next, the PR thread comes
> out of wait but since UPCONVERT_FINISHING is not set, it skips updating
> the l_ro_holders and goes straight to wait on BLOCKED. So there, we
> have a hang! Threads in __ocfs2_cluster_lock wait on BLOCKED, lock
> res in osb blocked list. Only when dc thread is awoken, it will run
> ocfs2_unblock_lock and things will unhang.
>
> One way to fix this is to wake the dc thread after clearing
> UPCONVERT_FINISHING if BLOCKED is still set.
>
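
For anyone following the sequence above, here is a minimal userspace sketch
of the flag interplay. It is a toy model under stated assumptions, not the
dlmglue code: every sim_* name is hypothetical, the threads are replaced by
sequential function calls in the order described in the report, and the
downconvert itself is reduced to clearing the BLOCKED bit.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the OCFS2_LOCK_* flags named in the report. */
enum {
	SIM_BUSY                = 1 << 0,
	SIM_BLOCKED             = 1 << 1,
	SIM_UPCONVERT_FINISHING = 1 << 2,
};

struct sim_lockres {
	unsigned int flags;
	int holders;          /* local holders of the lock (none in the report) */
	bool dc_wake_pending; /* true if someone has kicked the dc thread */
};

/* One pass of the modelled dc thread. Returns true if it downconverted. */
static bool sim_dc_thread_pass(struct sim_lockres *res)
{
	if (!res->dc_wake_pending)
		return false; /* nobody kicked it, so it stays asleep */
	res->dc_wake_pending = false;

	if (!(res->flags & SIM_BLOCKED))
		return false;
	if (res->flags & SIM_UPCONVERT_FINISHING)
		return false; /* requeue and do nothing, as described above */
	if (res->holders > 0)
		return false;

	res->flags &= ~SIM_BLOCKED; /* stands in for the downconvert to NL */
	return true;
}

/* Tail of a lock request that woke up and found BLOCKED set. */
static void sim_waiter_tail(struct sim_lockres *res, bool apply_fix)
{
	res->flags &= ~SIM_UPCONVERT_FINISHING;
	if (apply_fix && (res->flags & SIM_BLOCKED))
		res->dc_wake_pending = true; /* the proposed kick */
}

/* Replays the reported ordering. Returns true if BLOCKED is never cleared. */
static bool sim_scenario_hangs(bool apply_fix)
{
	struct sim_lockres res = { .flags = SIM_BUSY };

	/* ast: set UPCONVERT_FINISHING, clear BUSY */
	res.flags |= SIM_UPCONVERT_FINISHING;
	res.flags &= ~SIM_BUSY;

	/* bast: set BLOCKED and kick the dc thread */
	res.flags |= SIM_BLOCKED;
	res.dc_wake_pending = true;
	sim_dc_thread_pass(&res); /* sees UPCONVERT_FINISHING, requeues */

	sim_waiter_tail(&res, apply_fix); /* EX waiter wakes first */
	sim_waiter_tail(&res, apply_fix); /* then the PR waiter */

	sim_dc_thread_pass(&res);
	return (res.flags & SIM_BLOCKED) != 0;
}
```

In this model, sim_scenario_hangs(false) leaves BLOCKED set because nothing
wakes the dc thread a second time, while sim_scenario_hangs(true) lets the
dc thread run again and clear it.
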
> Signed-off-by: Tariq Saeed <tariq.x.saeed at oracle.com>
> Reviewed-by: Wengang Wang <wen.gang.wang at oracle.com>
> Signed-off-by: Santosh Shilimkar <santosh.shilimkar at oracle.com>
> ---
> fs/ocfs2/dlmglue.c | 6 ++++++
> 1 files changed, 6 insertions(+), 0 deletions(-)
>
> diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c
> index 8b23aa2..313c816 100644
> --- a/fs/ocfs2/dlmglue.c
> +++ b/fs/ocfs2/dlmglue.c
> @@ -1390,6 +1390,7 @@ static int __ocfs2_cluster_lock(struct ocfs2_super *osb,
> unsigned int gen;
> int noqueue_attempted = 0;
> int dlm_locked = 0;
> + int kick_dc = 0;
>
> if (!(lockres->l_flags & OCFS2_LOCK_INITIALIZED)) {
> mlog_errno(-EINVAL);
> @@ -1524,7 +1525,12 @@ update_holders:
> unlock:
> lockres_clear_flags(lockres, OCFS2_LOCK_UPCONVERT_FINISHING);
>
> + /* ocfs2_unblock_lock reques on seeing OCFS2_LOCK_UPCONVERT_FINISHING */
> + kick_dc = (lockres->l_flags & OCFS2_LOCK_BLOCKED);
> +
> spin_unlock_irqrestore(&lockres->l_lock, flags);
> + if (kick_dc)
> + ocfs2_wake_downconvert_thread(osb);
> out:
> /*
> * This is helping work around a lock inversion between the page lock
> --
> 1.7.1
>
>
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>
>
>
--
Mark Fasheh