[Ocfs2-devel] [RFC] make ocfs2/o2net reliable

jiangyiwen jiangyiwen at huawei.com
Thu Nov 16 21:50:39 PST 2017


On 2017/11/17 11:53, Changwei Ge wrote:
> Hi Yiwen,
> 
> On 2017/11/17 11:06, jiangyiwen wrote:
>> On 2017/11/16 17:49, Changwei Ge wrote:
>>> Hi all,
>>> As far as we know, ocfs2/o2net is not a reliable message mechanism.
>>> Messages might get lost due to a sudden TCP socket connection shutdown.
>> Hi Changwei,
>>
>> Junxiao has already solved the situation you mentioned.
>> In commit c43c363def04cdaed0d9e26dae846081f55714e7, o2net doesn't shut down
>> the connection until the node is fenced, so I don't understand the scenario
>> you mentioned about TCP socket connection shutdown. Can you give
>> a specific description? Thank you.
> 
> I'm afraid Junxiao's patch can't cover all scenarios. It addresses the o2net 
> timeout scenario but not the TCP socket reset case.
> 
>>
>> In addition, as far as I know, TCP is reliable and trustworthy: TCP
>> will resend messages after a certain retransmit time. So as long as
>> o2net doesn't actively shut down the socket, TCP will resend the
>> message for us.
>>
>> Thanks,
>> Yiwen Jiang.
> 
> Actually, TCP doesn't even begin to send packets from its send buffer 
> before it is closed for some unknown underlying reason. So we lose them.
> 
> 
> Thanks,
> Changwei
> 

I think we should first find out why the TCP socket is reset/closed,
that is, the underlying unknown reason you mentioned above; maybe it is
a TCP bug. If the analysis shows that it is normal for TCP to be closed
under certain conditions, then we can discuss the solution.

Thanks,
Yiwen Jiang.

>>> And the only user of o2net is ocfs2/dlm, so this may cause ocfs2/dlm
>>> to hang (missing AST and ASSERT MASTER). Sometimes it also causes
>>> ocfs2/dlm to wait forever for DLM recovery to complete. But that
>>> never happens, since the target node is still heartbeating and no DLM
>>> recovery procedure will be launched.
>>>
>>> So I think the above cases drive us to improve the current ocfs2/o2net
>>> and make it more reliable. I already have a draft design for it, and we
>>> do need to change o2net's behavior.
>>>
>>> To accomplish this goal, we tag each o2net message with a sequence
>>> number ::msg_seq so that the receiver can tell whether a newly arrived
>>> message is a duplicate. ::msg_seq also works as the key for looking up
>>> the structure described below in a red-black tree.
>>>
>>> A brand new structure named *Message Holder* is added to o2net; it
>>> is responsible for storing _handle_status_.
>>>
>>> When TCP has to shut down or reset for an unknown reason, although we
>>> lose the packets in the send or receive buffer, o2net still manages those
>>> messages. This gives o2net a chance to re-send the messages once the TCP
>>> connection is established again.
>>>
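
The re-send idea in the paragraph above can be illustrated with a small
user-space model. This is only a sketch under stated assumptions, not o2net
code: the class name PendingQueue, its methods, and the transmit callback
are all hypothetical, and a plain dict stands in for whatever structure the
kernel implementation would use. The sender keeps every message keyed by its
sequence number until the status comes back, so a connection reset does not
lose it.

```python
class PendingQueue:
    """Toy sender-side model: keep each message until it is acknowledged.

    All names are hypothetical; this is not the o2net implementation.
    """

    def __init__(self):
        self._next_seq = 1
        self._pending = {}  # msg_seq -> payload (dict preserves insertion order)

    def send(self, payload, transmit):
        """Tag the payload with a fresh msg_seq, remember it, try to send it."""
        seq = self._next_seq
        self._next_seq += 1
        self._pending[seq] = payload
        self._try_transmit(seq, transmit)
        return seq

    def _try_transmit(self, seq, transmit):
        try:
            transmit(seq, self._pending[seq])
        except ConnectionError:
            # The connection went away: keep the message for a later re-send.
            pass

    def ack(self, seq):
        """The peer returned a status for msg_seq, so the copy can be dropped."""
        self._pending.pop(seq, None)

    def resend_all(self, transmit):
        """Call once the connection is established again."""
        for seq in list(self._pending):
            self._try_transmit(seq, transmit)

    def pending_seqs(self):
        return list(self._pending)
```

Because a message may be re-sent after the peer has already handled it, a
scheme like this depends on the receiver detecting duplicates by ::msg_seq.
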
>>> The diagram below demonstrates how it works:
>>>
>>> SEND					RECV
>>> send message				
>>> tag message header with ::msg_seq	
>>> 					search for Message Holder with
>>> 					  ::msg_seq
>>> 					NOT FOUND - insert one
>>> 					(FOUND - means a duplicated one)
>>> 					handle message
>>> 					store status into Message Holder
>>> 					send back status
>>> instruct RECV to remove MH
>>> 					notify SEND that MH is already
>>> 					  removed
>>> return to caller
>>>
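
The RECV column of the diagram above can be modelled in a few lines. This is
a sketch under stated assumptions rather than the proposed kernel code: the
names Receiver, on_message and on_remove_holder are hypothetical, and a
Python dict replaces the red-black tree keyed by ::msg_seq. It shows that a
duplicate message does not run the handler a second time and instead gets
the stored status back.

```python
class Receiver:
    """Toy receive-side model of the Message Holder (MH) flow.

    All names are hypothetical; a dict stands in for the red-black tree.
    """

    def __init__(self, handler):
        self._handler = handler
        self._holders = {}  # msg_seq -> stored handler status

    def on_message(self, msg_seq, payload):
        """Handle a message once; replay the stored status for duplicates."""
        if msg_seq in self._holders:
            # FOUND: a duplicate, so send back the stored status only.
            return self._holders[msg_seq]
        # NOT FOUND: insert a holder, handle the message, store the status.
        self._holders[msg_seq] = None
        status = self._handler(payload)
        self._holders[msg_seq] = status
        return status

    def on_remove_holder(self, msg_seq):
        """SEND has seen the status: drop the MH and confirm the removal."""
        self._holders.pop(msg_seq, None)
        return True

    def holder_count(self):
        return len(self._holders)
```

The sketch is single-threaded and does not address a duplicate that arrives
while the first copy is still being handled, or one that arrives after the
MH has been removed; a real design would have to decide both cases.
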
>>> I look forward to your comments, especially from @Mark, @Joseph and @Junxiao.
>>>
>>> Thanks,
>>> Changwei.
>>>
>>> _______________________________________________
>>> Ocfs2-devel mailing list
>>> Ocfs2-devel at oss.oracle.com
>>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel




