[Ocfs2-devel] [PATCH] Fix waiting status race condition in dlm recovery

Xiaowei xiaowei.hu at oracle.com
Fri May 25 19:05:14 PDT 2012


Thanks Srini ,
This sounds good, I tried to use dlm_is_node_dead in this patch , but 
this function can't report
another node is dead if this node already in recovery process. It was 
blocked to set the bit in domain_map,
but the live_nodes_map could always reflect the really live nodes.

I will reformat the patch.

Thanks,
Xiaowei

On 05/26/2012 06:17 AM, srinivas eeda wrote:
> comments inline
>
> On 5/24/2012 10:53 PM, xiaowei.hu at oracle.com wrote:
>> From: "Xiaowei.Hu"<xiaowei.hu at oracle.com>
>>
>> when the master requested locks ,but one/some of the live nodes died,
>> after it received the request msg and before send out the locks 
>> packages,
>> the recovery will fall into endless loop,waiting for the status 
>> changed to finalize
>>
>> NodeA                                     NodeB
>> selected as recovery master
>> dlm_remaster_locks
>>    ->  dlm_requeset_all_locks
>>    this send request locks msg to B
>>                                            received the msg from A,
>>                                            queue worker 
>> dlm_request_all_locks_worker
>>                                            return 0
>> go on set state to requested
>> wait for the state become done
>>                                            NodeB lost connection due 
>> to network
>>                                            before the worker begin, 
>> or it die.
>> NodeA still waiting for the
>> change of reco state.
>> It won't end if it not get data done msg
>> And at this time nodeB do not realize this (or it just died),
>> it won't send the msg for ever, nodeA left in the recovery process 
>> forever.
>>
>> This patch let the recovery master check if the node still in live node
>> map when it stay in REQUESTED status.
>>
>> Signed-off-by: Xiaowei.Hu<xiaowei.hu at oracle.com>
>> ---
>>   fs/ocfs2/dlm/dlmrecovery.c |    9 +++++++++
>>   1 files changed, 9 insertions(+), 0 deletions(-)
>>
>> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
>> index 01ebfd0..62659e8 100644
>> --- a/fs/ocfs2/dlm/dlmrecovery.c
>> +++ b/fs/ocfs2/dlm/dlmrecovery.c
>> @@ -555,6 +555,7 @@ static int dlm_remaster_locks(struct dlm_ctxt 
>> *dlm, u8 dead_node)
>>       int all_nodes_done;
>>       int destroy = 0;
>>       int pass = 0;
>> +    int dying = 0;
>>
>>       do {
>>           /* we have become recovery master.  there is no escaping
>> @@ -659,6 +660,7 @@ static int dlm_remaster_locks(struct dlm_ctxt 
>> *dlm, u8 dead_node)
>>           list_for_each_entry(ndata,&dlm->reco.node_data, list) {
>>               mlog(0, "checking recovery state of node %u\n",
>>                    ndata->node_num);
>> +            dying = 0;
>>               switch (ndata->state) {
>>                   case DLM_RECO_NODE_DATA_INIT:
>>                   case DLM_RECO_NODE_DATA_REQUESTING:
>> @@ -679,6 +681,13 @@ static int dlm_remaster_locks(struct dlm_ctxt 
>> *dlm, u8 dead_node)
>>                            dlm->name, ndata->node_num,
>>                            ndata->state==DLM_RECO_NODE_DATA_RECEIVING ?
>>                            "receiving" : "requested");
>> +                    spin_lock(&dlm->spinlock);
>> +                    dying = !test_bit(ndata->node_num, 
>> dlm->live_nodes_map);
>> +                    spin_unlock(&dlm->spinlock);
>> +                    if (dying) {
>> +                        ndata->state = DLM_RECO_NODE_DATA_DEAD;
>> +                        break;
>> +                    }
>>                       all_nodes_done = 0;
>>                       break;
>>                   case DLM_RECO_NODE_DATA_DONE:
> fix seems to address the issue, but can you please add a function 
> dlm_is_node_in_livemap similar to dlm_is_node_dead so that it' 
> improves readability. You can then add the following to check if the 
> node is still alive
> +        if (!dlm_is_node_in_livemap(dlm, ndata->node_num))
> +            ndate->state = DLM_RECO_NODE_DATA_DEAD;
> +        else
> +            all_nodes_done = 0;




More information about the Ocfs2-devel mailing list