[Ocfs2-users] How to force node [a] to consider node [b] dead?

Sunil Mushran sunil.mushran at oracle.com
Mon Jan 26 09:52:28 PST 2009


You are running a 3-year-old version of the fs. Please upgrade
to something more current, such as sles9 sp4 or sles10 sp1, which
bundle ocfs2 1.2.9, or sles10 sp2, which ships ocfs2 1.4.1.
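
For reference, here is a rough sketch of how O2CB_HEARTBEAT_THRESHOLD maps to a timeout. It assumes the relation given in the OCFS2 FAQ, threshold = (timeout in seconds / 2) + 1, with a 2-second disk heartbeat interval. The function and constant names are made up for illustration and are not part of any o2cb tool.

```python
# Assumption: the o2cb disk heartbeat fires every 2 seconds, and a node is
# declared dead after (threshold - 1) missed iterations (per the OCFS2 FAQ).
HEARTBEAT_INTERVAL_SECS = 2


def heartbeat_timeout_secs(threshold: int) -> int:
    """Return the approximate dead-node timeout for a given threshold."""
    return (threshold - 1) * HEARTBEAT_INTERVAL_SECS


if __name__ == "__main__":
    # The two values mentioned in the quoted message: 31 and 601.
    for threshold in (31, 601):
        secs = heartbeat_timeout_secs(threshold)
        print(f"O2CB_HEARTBEAT_THRESHOLD={threshold}: "
              f"{secs} s ({secs / 60:.0f} min)")
```

Under that assumption, 31 works out to about 60 seconds and 601 to about 1200 seconds, or 20 minutes. If the relation holds for this version, a restart that finishes in around 15 minutes would still fall inside the window in which the surviving node has not yet declared the other one dead.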

Karim Alkhayer wrote:
>
> Hi All,
>
> We have O2CB_HEARTBEAT_THRESHOLD set to 601 because the SAN sometimes 
> gets overloaded, which causes the nodes to panic.
>
> This value has proven to be more stable than 31. However, there are 
> times when one of the nodes, for instance node [b], crashes for 
> whatever reason. When the troublesome node starts up again, the 
> automatic mount is attempted but doesn’t succeed; “Transport endpoint 
> is not connected” is usually displayed.
>
> My opinion is this: the mount doesn’t succeed because node [a] still 
> thinks that node [b] is alive.
>
> We’re talking about a restart that can take around 15 minutes, so 
> basically the threshold has passed by then.
>
> I was wondering if there is a workaround to kick node [b] out of the 
> cluster so that it can join again. The incident has happened only once, 
> a month ago, and what I did then was restart the cluster services on 
> both machines. This was a very expensive solution, as all database 
> instances had to go down.
>
> OCFS2 1.2.1, SLES9 SP3 2.6.5-7.257-default, RAC 10.1.0.5, 5 DBs
>
> Thanks
>
> Karim



