[Ocfs2-users] What does "shutting it down" mean
Alexei_Roudnev
Alexei_Roudnev at exigengroup.com
Tue Jul 24 13:38:27 PDT 2007
Kendall, you must understand that THERE IS NO WAY to build a 100% RELIABLE
multi-node cluster without direct fencing (where the other nodes can forcibly
take a failed node down).
o2cb uses a self-fencing technique, and some other ways, to make sure that it
can get rid of a failed node, but there is still some chance of overall
cluster failure if that node did not die completely.
If you need extremely high reliability, think about using the heartbeat2 +
ocfs2 version (it works in SLES10, for example, but I can't confirm its
quality) or even a commercial version. Ideally (as with heartbeat*), if you
have nodes 1, 2, 3 and nodes 1 and 2 decide that node 3 is dead, they run an
external program (STONITH) which resets node 3 by hardware (most likely by a
power reset) or isolates it from the disk system.
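To make the idea concrete, here is a rough sketch of that decision in Python.
It is only an illustration, not how heartbeat or o2cb is implemented: the
names has_quorum, fence_unreachable and the fence hook are all hypothetical,
and the hook stands in for whatever external STONITH program actually resets
the node or cuts it off from the disks.

```python
from typing import Callable, Dict, Set


def has_quorum(members: Set[str], reachable: Set[str]) -> bool:
    """Return True when the reachable members form a strict majority."""
    return len(members & reachable) > len(members) / 2


def fence_unreachable(
    members: Set[str],
    reachable: Set[str],
    fence: Callable[[str], bool],
) -> Dict[str, bool]:
    """Fence every member we cannot reach, but only if our side has quorum.

    `reachable` must include the local node. `fence` is a hypothetical hook
    standing in for the external STONITH program (power reset or disk
    isolation); it returns True once the target is confirmed down.
    """
    if not has_quorum(members, reachable):
        # Minority partition: this side must not shoot anyone.
        return {}
    return {node: fence(node) for node in sorted(members - reachable)}


# Example: nodes 1 and 2 still see each other, node 3 has gone silent.
members = {"node1", "node2", "node3"}
fenced_log = []


def fake_fence(node: str) -> bool:
    fenced_log.append(node)  # a real hook would reset the machine here
    return True


result = fence_unreachable(members, {"node1", "node2"}, fake_fence)
```

The important point is that the surviving majority does not resume work on
the shared disk until the hook reports that node 3 is really down, and that a
node which finds itself in the minority never fences anyone.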
Without such a technique, there is always some chance (even if a very small
one) that node 3 is not really dead and can do something bad (through its
delayed IO) before its own o2cb detects the loss of quorum and restarts it
internally.
----- Original Message -----
From: "Kendall, Kim" <Kim_Kendall at inter-tel.com>
To: <ocfs2-users at oss.oracle.com>
Sent: Tuesday, July 24, 2007 1:05 PM
Subject: [Ocfs2-users] What does "shutting it down" mean
When one node in the cluster can't be seen by the others, the remaining
nodes chop it off at the knees. What are the remaining nodes in the
cluster actually doing when they say "shutting it down"?
From the logs:
Jul 23 18:33:42 appsdb3 kernel: o2net: connection to node appsdb4 (num
3) at 192.168.202.4:7777 has been idle for 10 seconds, shutting it down.
Jul 23 18:33:42 appsdb3 kernel: o2net: no longer connected to node
appsdb4 (num 3) at 192.168.202.4:7777