[Ocfs2-devel] [PATCH] Fix waiting status race condition in dlm recovery V2
Xue jiufei
xuejiufei at huawei.com
Mon Jun 24 05:57:03 PDT 2013
Hi, xiaowei,
On 2013/6/24 12:57, xiaowei.hu wrote:
> On 06/23/2013 07:12 PM, Jeff Liu wrote:
>> On 06/18/2013 11:13 AM, Xue jiufei wrote:
>>
>>>> From: "Xiaowei.Hu" <xiaowei.hu at oracle.com>
>>>>
>>>> when the master requested locks ,but one/some of the live nodes died,
>>>> after it received the request msg and before send out the locks packages,
>>>> the recovery will fall into endless loop,waiting for the status changed to finalize
>>>>
>>>> NodeA NodeB
>>>> selected as recovery master
>>>> dlm_remaster_locks
>>>> -> dlm_requeset_all_locks
>>>> this send request locks msg to B
>>>> received the msg from A,
>>>> queue worker dlm_request_all_locks_worker
>>>> return 0
>>>> go on set state to requested
>>>> wait for the state become done
>>>> NodeB lost connection due to network
>>>> before the worker begin, or it die.
>>>>
>>>> NodeA still waiting for the change of reco state.
>>>> It won't end if it not get data done msg.
>>>> And at this time nodeB do not realize this (or it just died),
>>>> it won't send the msg for ever, nodeA left in the recovery process forever.
>>>>
>>>> This patch let the recovery master check if the node still in live node
>>>> map when it stay in REQUESTED status.
>>>>
>>> Hi, xiaowei,
>>> We have reviewed this patch and have some questions:
>>> 1) in dlm_is_node_in_livemap(), I think it should use
>>> !!(test_bit(node, dlm->live_nodes_map)) to determine whether a node is
>>> live;
>> Hmm? test_bit(node,...) is ok.
>>
>>> 2) why not use dlm_is_node_dead() instead of dlm_is_node_in_livemap()
>>> in dlm_remaster_locks?
>>> I think dlm_is_node_dead() is better because dlm->live_nodes_map
>>> may be remain when another node umount.
> For this question, I remember I explained in one of the emails, paste you that contant:
> =============
> Hi Sunil,
>
> I considered your suggestion about this patch, it's possible to change the status in dlm hb down event,
> but what need to change are the dlm_reco_node_data structures in dlm->reco.node_data list.
> This list is initialized in dlm_remaster_locks when it begins the lock remaster and destroied before exit this function.
> So it's not proper to check data in such a list from dlm hb down event, am I right?
> If change the status from dlm hb down event , that means we make the recovery thread rely on more information from the hb down event,
> actually the dlm->live_nodes_map is marked in this event , and for others to check , right?
>
> This race condition only happen when cluster already in recovery and a node dead during recovery. the recovery thread blocked the update of dlm->domain_map, so I fallback to check the live_nodes_map, which won't be blocked.
>
> Please reconsider this patch.
> ================
The update of dlm->domain_map is triggered by hb down event or
DLM_BEGIN_RECO_MSG received from recovery master when a node died.
I wonder Why it is blocked by recovery thread.
>> Please refer to:
>> http://marc.info/?l=ocfs2-devel&m=133799814717270&w=2
>>
>> Thanks,
>> -Jeff
>>
>>> Thanks.
>>>
>>>> Signed-off-by: Xiaowei.Hu <xiaowei.hu at oracle.com>
>>>> ---
>>>> fs/ocfs2/dlm/dlmrecovery.c | 16 +++++++++++++++-
>>>> 1 files changed, 15 insertions(+), 1 deletions(-)
>>>>
>>>> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
>>>> index 01ebfd0..546c5b5 100644
>>>> --- a/fs/ocfs2/dlm/dlmrecovery.c
>>>> +++ b/fs/ocfs2/dlm/dlmrecovery.c
>>>> @@ -339,6 +339,17 @@ static int dlm_reco_master_ready(struct dlm_ctxt *dlm)
>>>> return ready;
>>>> }
>>>> +/* returns true if node is still in the live node map
>>>> + * this map is cleared before domain map,could be checked in recovery*/
>>>> +int dlm_is_node_in_livemap(struct dlm_ctxt *dlm, u8 node)
>>>> +{
>>>> + int live;
>>>> + spin_lock(&dlm->spinlock);
>>>> + live = !test_bit(node, dlm->live_nodes_map);
>>>> + spin_unlock(&dlm->spinlock);
>>>> + return live;
>>>> +}
>>>> +
>>>> /* returns true if node is no longer in the domain
>>>> * could be dead or just not joined */
>>>> int dlm_is_node_dead(struct dlm_ctxt *dlm, u8 node)
>>>> @@ -679,7 +690,10 @@ static int dlm_remaster_locks(struct dlm_ctxt *dlm, u8 dead_node)
>>>> dlm->name, ndata->node_num,
>>>> ndata->state==DLM_RECO_NODE_DATA_RECEIVING ?
>>>> "receiving" : "requested");
>>>> - all_nodes_done = 0;
>>>> + if (!dlm_is_node_in_livemap(dlm, ndata->node_num))
>>>> + ndata->state = DLM_RECO_NODE_DATA_DEAD;
>>>> + else
>>>> + all_nodes_done = 0;
>>>> break;
>>>> case DLM_RECO_NODE_DATA_DONE:
>>>> mlog(0, "%s: node %u state is done\n",
>>>>
>>>
>>>
>>>
>>> .
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Ocfs2-devel mailing list
>>> Ocfs2-devel at oss.oracle.com
>>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>>
>
> .
>
More information about the Ocfs2-devel
mailing list