[Ocfs2-users] How to force node [a] to consider node [b] dead?
Karim Alkhayer
kkhayer at gmail.com
Mon Jan 26 09:42:26 PST 2009
Hi All,
We have O2CB_HEARTBEAT_THRESHOLD set to 601 because the SAN sometimes gets
overloaded, which was causing the nodes to panic.
This value has proven to be more stable than 31. However, there are
times when one of the nodes, for instance node [b], crashes for
whatever reason. When the troublesome node starts up again, the automatic
mount is attempted but does not succeed; "Transport endpoint is not
connected" is usually displayed.
My guess is that the mount doesn't succeed because node [a] still thinks
that node [b] is alive.
We're talking about a restart that can take around 15 minutes, so basically
the threshold has passed by then.
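For reference, the OCFS2 FAQ describes the disk heartbeat timeout as
(O2CB_HEARTBEAT_THRESHOLD - 1) * 2 seconds. A minimal sketch of that
arithmetic, assuming the formula applies to this OCFS2 release (the function
and constant names are illustrative and not part of any OCFS2 tool):

```python
# Assumption: the disk heartbeat runs every 2 seconds and a node is declared
# dead after (threshold - 1) missed beats, as described in the OCFS2 FAQ.
HEARTBEAT_INTERVAL_SECS = 2


def heartbeat_timeout_secs(threshold: int) -> int:
    """Return the timeout, in seconds, implied by O2CB_HEARTBEAT_THRESHOLD."""
    return (threshold - 1) * HEARTBEAT_INTERVAL_SECS


if __name__ == "__main__":
    # Compare the two thresholds mentioned in this message.
    for threshold in (31, 601):
        secs = heartbeat_timeout_secs(threshold)
        print(f"O2CB_HEARTBEAT_THRESHOLD={threshold}: {secs} s (~{secs / 60:.0f} min)")
```

If that formula holds here, 31 corresponds to 60 seconds and 601 to 1200
seconds, or about 20 minutes, so a 15-minute restart may still fall inside
the window in which node [a] treats node [b] as alive.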
I was wondering if there is a workaround to kick node [b] out of the cluster
so that it can rejoin. What I've done so far (the incident has happened
once, a month ago) is to restart the cluster services on both machines.
This was a very expensive solution, as all database instances had to go down.
Environment: OCFS2 1.2.1, SLES9 SP3 (kernel 2.6.5-7.257-default), RAC 10.1.0.5, 5 DBs
Thanks
Karim