[Ocfs2-users] 2 node OCFS2 clusters

Joel Becker Joel.Becker at oracle.com
Tue Nov 17 12:33:34 PST 2009


On Mon, Nov 16, 2009 at 12:57:26PM -0800, Luis Freitas wrote:
>    For CRS there is no need for a dedicated switch, only a need to use switches instead of cross cables. Although it is not recommended, you can use the same switch for the public and private networks, using different VLANs. The network status can be checked by the link status, which is what CRS does, and also by pinging the router. This information could be used as part of the heuristics to decide which node should survive. Of course it doesn't cover all network topologies, but it is surely better than having node 0 always survive when the network is down.

	How does node1 know the link status of node0?  That's the
fundamental problem of self-fencing.  You have to assume the other guy
is going to make a predictable decision.  node1 has no way of knowing
that node0 is going to reboot.
	What if the switch chip between node0 and node1 is down?  Both
nodes still see their links as up.
	Which link status do you check?  Do you consider your link
status down if the interconnect link alone is down, or only if all links
are down?  What if you have separate public and private networks, node1
has lost its public link and node0 has lost its private link?  In the
current scheme, node1 resets and node0 continues talking on the public
network.  The web service is working.  In your scheme node0 resets and
node1 can't talk to the public network.  The web service is down.
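	To make that scenario concrete, here is a rough sketch in
Python.  It is only a toy model, not how o2cb or CRS are implemented.
The Node class, the decide_* functions and the "reset only if my own
interconnect link is down" rule are all assumptions made up for
illustration.  Each decision function sees only the local node's state,
since that is all a self-fencing node really has.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    number: int
    private_up: bool  # this node's own view of its interconnect link
    public_up: bool   # this node's own view of its public link


def decide_lowest_number(me: Node, lowest_number: int) -> str:
    # Fixed rule discussed in the thread: on a two-node split,
    # the lowest node number survives and the other node resets.
    return "survive" if me.number == lowest_number else "reset"


def decide_by_link_status(me: Node) -> str:
    # Hypothetical heuristic: reset only if my own interconnect link is down.
    return "survive" if me.private_up else "reset"


def service_up(nodes, decisions) -> bool:
    # The web service is reachable if a surviving node still has a public link.
    return any(d == "survive" and n.public_up for n, d in zip(nodes, decisions))


# The scenario above: node0 has lost private, node1 has lost public.
node0 = Node(number=0, private_up=False, public_up=True)
node1 = Node(number=1, private_up=True, public_up=False)
nodes = [node0, node1]

current = [decide_lowest_number(n, lowest_number=0) for n in nodes]
proposed = [decide_by_link_status(n) for n in nodes]

print("current scheme: ", current, "service up:", service_up(nodes, current))
print("link heuristic: ", proposed, "service up:", service_up(nodes, proposed))
```

	Under these assumptions the fixed rule keeps node0, which can
still reach the public network, while the link-status rule keeps node1,
which cannot.  A different failure would favor the other rule, which is
the point: no local heuristic wins in every topology.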
	Self-fencing is hard and never perfect.  The two-node case is
the worst because there is no difference between a majority of nodes and
all nodes.  The easiest way to alleviate it is to add a third node.  Now
a majority (two of three) can exist without every node being present,
and the decisions get much easier.
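
	The majority arithmetic can be sketched in a few lines of
Python.  This is a simplified strict-majority model with made-up
function names.  It deliberately leaves out any tie-breaker, and it is
not the actual o2cb quorum code.

```python
def smallest_majority(total: int) -> int:
    # Fewest nodes that make up strictly more than half of the cluster.
    return total // 2 + 1


def has_majority(visible: int, total: int) -> bool:
    # 'visible' counts this node itself plus every peer it can still reach.
    return 2 * visible > total


def surviving_partitions(partition_sizes):
    # Given the sizes of the groups a cluster has split into,
    # return the sizes of the groups that hold a strict majority.
    total = sum(partition_sizes)
    return [size for size in partition_sizes if has_majority(size, total)]


for total in (2, 3):
    print(total, "nodes: smallest majority is", smallest_majority(total))

print("2 nodes split 1/1:", surviving_partitions([1, 1]))
print("3 nodes split 2/1:", surviving_partitions([2, 1]))
```

	With two nodes the smallest majority is both nodes, so a 1/1
split leaves nobody with a majority and some tie-breaker has to pick the
survivor.  With three nodes a 2/1 split leaves exactly one side with a
majority, and the isolated node knows it should fence itself.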

>    I see this as a problem in a RAC implementation: there are two different cluster stacks running (O2CB and CRS), they are not integrated, and they make decisions with different heuristics. For me it would make more sense if they were integrated and one of the cluster stacks was in control, in the same way that happens when you use RAC with Veritas/HP ServiceGuard/Sun Cluster Suite, or OCFS2 with heartbeat2, for example.

	The standard install documentation makes sure that o2cb and crs
behave well together.  crs won't make a decision before o2cb does, thus
giving o2cb precedence.

Joel

-- 

 Joel's First Law:

	Nature abhors a GUI.

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127


