[Ocfs2-users] Network 10 sec timeout setting?

Randy Ramsdell rramsdell at livedatagroup.com
Wed Feb 7 11:35:01 PST 2007


Sunil Mushran wrote:
> Means there was a network hiccup that caused Node 1 to fence itself.
> The problem is that our default timeout is too low. We have already
> addressed this in mainline and are looking to add that patch into 1.2.5.
>
> I am unclear as to your last qs.
>
> Randy Ramsdell wrote:
>> Hi,
>>
>> OK, I'll try this again since there seem to be more people reading this
>> list.
>>
>> I don't quite understand the log messages regarding fencing. Should the
>> other nodes in the cluster that lost network connectivity state
>> something about quorum/fencing etc.?
>> Is it true that the network timeout parameter can be set in 1.2.4 and, if
>> not, can I change the setting myself before compiling?
>> What will we see in the logs if a node cannot write to the clusterfs but
>> heartbeat still works?
>>

<snip>

I see we had a network hiccup (actually it was the load), but I was
really trying to iron out the reason why our logs don't mention the
fencing. As a matter of fact, I have never seen the other nodes log
a node fencing. I know it may be a small detail, but I am curious why
I never see that message when many others do in this type of
situation.

The third question I asked was: what will we see in the logs if a node
cannot write to the clusterfs but heartbeat still works?

As I understand it, there are two ways the cluster notifies nodes about
cluster connectivity:
1. Heartbeat on port 7777.
2. Each node writes timestamps to the clusterfs.

I'm just a little fuzzy on this area.
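
The two channels described above can be pictured with a toy model. This is
only a sketch of the question being asked, not how OCFS2 actually decides
to fence: the 10 second network value comes from the subject line, while
the disk threshold, the class and function names, and the four state
labels are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass

# Network idle timeout: the 10 sec value from the subject of this thread.
NET_IDLE_TIMEOUT_SECS = 10.0
# Disk timestamp threshold: HYPOTHETICAL value, for illustration only.
DISK_STAMP_TIMEOUT_SECS = 14.0


@dataclass
class PeerState:
    """What one node last observed about a peer (hypothetical structure)."""
    last_net_msg: float     # when we last heard from the peer over the network
    last_disk_stamp: float  # when the peer's on-disk timestamp last changed


def classify(peer: PeerState, now: float) -> str:
    """Combine the two liveness channels into one of four illustrative states."""
    net_ok = (now - peer.last_net_msg) <= NET_IDLE_TIMEOUT_SECS
    disk_ok = (now - peer.last_disk_stamp) <= DISK_STAMP_TIMEOUT_SECS
    if net_ok and disk_ok:
        return "healthy"
    if disk_ok:
        # Timestamps still arrive on disk, but the network channel went quiet.
        return "network-lost"
    if net_ok:
        # The case asked about: network still works, disk writes have stopped.
        return "disk-stalled"
    return "dead"
```

In this model the case from the third question is the "disk-stalled" branch:
the peer still answers on the network but its timestamps on the clusterfs
have stopped changing. What the real cluster logs in that situation is
exactly what is being asked here.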

RCR







