[Ocfs2-devel] [PATCH 3/3] o2net: correct keepalive message protocol
Srinivas Eeda
srinivas.eeda at oracle.com
Wed Feb 17 15:45:24 PST 2010
In the old code, a node cancels and requeues its keepalive message whenever
it hears from the other node. If it hears nothing for 2 seconds, the queued
message fires and sends a keepalive. After that, a requeue happens only once
it hears from the other node again.
With the new change, a node sends a keepalive every 2 seconds.
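
To make the difference concrete, here is a rough user-space sketch of the two
timer policies. It is not the o2net code: the function names, the
one-second time steps and the assumption that the timer is armed at connect
time are all made up for illustration. Only the 2 second default comes from
this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Default keepalive interval mentioned in this thread. */
#define KEEPALIVE_DELAY_SECS 2

/*
 * Old behaviour (hypothetical model): the keepalive timer is cancelled and
 * requeued whenever we hear from the peer. Once it fires it is NOT requeued
 * until we hear from the peer again.
 * heard[t - 1] is true if traffic arrived from the peer during second t.
 */
static int old_policy_keepalives(const bool *heard, size_t secs)
{
    int sent = 0;
    bool armed = true;                      /* assumed armed at connect (t = 0) */
    size_t deadline = KEEPALIVE_DELAY_SECS;

    for (size_t t = 1; t <= secs; t++) {
        if (heard[t - 1]) {
            /* cancel and requeue */
            armed = true;
            deadline = t + KEEPALIVE_DELAY_SECS;
        } else if (armed && t >= deadline) {
            /* queued message fires; no requeue until the peer is heard */
            sent++;
            armed = false;
        }
    }
    return sent;
}

/*
 * New behaviour (hypothetical model): a keepalive goes out every interval,
 * regardless of what was heard from the peer.
 */
static int new_policy_keepalives(size_t secs)
{
    return (int)(secs / KEEPALIVE_DELAY_SECS);
}
```

With a peer that stays silent for 10 seconds, the old model sends a single
keepalive and then waits, while the new model sends one every 2 seconds.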
Sunil Mushran wrote:
> How will it double? The node will send a keepalive only if it has
> not heard from the other node for 2 secs.
>
> Srinivas Eeda wrote:
>> No harm, just doubles heartbeat messages which is not required at all.
>>
>> Sunil Mushran wrote:
>>> What's the harm in leaving it in?
>>>
>>> Srinivas Eeda wrote:
>>>> Each node that has this patch would send an O2NET_MSG_KEEP_REQ_MAGIC
>>>> every 2 seconds (default). So, nodes without this patch would always
>>>> receive a heartbeat message every 2 seconds.
>>>>
>>>> Nodes without this patch will send (respond with)
>>>> O2NET_MSG_KEEP_RESP_MAGIC for every keepalive packet they
>>>> receive. So nodes with this patch will always receive a response
>>>> message.
>>>>
>>>> So, in a mixed setup, both nodes will always hear the heartbeat
>>>> from each other :).
>>>>
>>>> thanks,
>>>> --Srini
>>>>
>>>>
>>>>
>>>> Joel Becker wrote:
>>>>
>>>>> On Thu, Jan 28, 2010 at 08:51:11PM -0800, Srinivas Eeda wrote:
>>>>>
>>>>>> case O2NET_MSG_KEEP_REQ_MAGIC:
>>>>>> - o2net_sendpage(sc, o2net_keep_resp,
>>>>>> - sizeof(*o2net_keep_resp));
>>>>>> + /* Each node now sends keepalive message every
>>>>>> + * keepalive time interval. Hence no need for response
>>>>>> + */
>>>>>> goto out;
>>>>>>
>>>>> You still have to send the response. Think about a mixed
>>>>> environment where some nodes have this fix and some do not. The
>>>>> older software is still waiting on the response.
>>>>> The newer version can just ignore any responses it gets from
>>>>> other nodes. But it has to send responses out just in case the other
>>>>> node is older.
>>>>> The only other alternative is to bump the o2net protocol
>>>>> version, and that means the cluster has to be shut down to
>>>>> upgrade. Not a good choice.
>>>>>
>>>>> Joel
>>>>>
>>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Ocfs2-devel mailing list
>>>> Ocfs2-devel at oss.oracle.com
>>>> http://oss.oracle.com/mailman/listinfo/ocfs2-devel
>>>>
>>>
>>
>
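
The mixed-setup case discussed in the quoted thread can be sketched in a few
lines. This is only a toy model, not the o2net receive path: the enum, the
function name and the reply_to_req flag are invented for illustration, and
KEEP_REQ / KEEP_RESP merely stand in for O2NET_MSG_KEEP_REQ_MAGIC and
O2NET_MSG_KEEP_RESP_MAGIC.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the keepalive message types named in this thread. */
enum keep_msg { KEEP_NONE, KEEP_REQ, KEEP_RESP };

/*
 * What a node sends back when it receives msg.
 * reply_to_req models whether the request handler still sends a response:
 *   true  - an unpatched node, or a patched node that keeps the response
 *   false - the patch as posted, which drops the response
 */
static enum keep_msg on_receive(enum keep_msg msg, bool reply_to_req)
{
    if (msg == KEEP_REQ && reply_to_req)
        return KEEP_RESP;

    /* Responses (and anything else here) need no reply; they only count
     * as "heard from the peer". */
    return KEEP_NONE;
}
```

In this model an older node that does send a request gets an answer only from
a peer with reply_to_req set, which is the case Joel is worried about. A
patched node receiving KEEP_RESP simply ignores it.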