[Ocfs2-users] What does "shutting it down" mean

Alexei_Roudnev Alexei_Roudnev at exigengroup.com
Tue Jul 24 13:38:27 PDT 2007


Kendall, you must understand that THERE IS NO WAY to build a 100% RELIABLE 
multi-node cluster without direct fencing (where the other nodes can forcibly 
take a failed node down).

o2cb uses a self-fencing technique and some other methods to make sure that 
it can get rid of a failed node, but there is still some chance of overall 
cluster failure if that node did not die completely.
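
The self-fencing idea can be sketched roughly as follows. This is a 
simplified illustration in Python, not o2cb's actual logic: the function 
name should_self_fence and the plain-majority rule are assumptions made up 
for this example.

```python
def should_self_fence(reachable_peers: int, cluster_size: int) -> bool:
    """Return True if this node should reset itself.

    Hypothetical rule: the node counts itself plus the peers it can still
    reach. If that group is not a strict majority of the cluster, the node
    assumes the other side owns the cluster and fences itself.
    """
    if cluster_size < 1 or not 0 <= reachable_peers < cluster_size:
        raise ValueError("invalid cluster description")
    visible = reachable_peers + 1  # this node plus the peers it can reach
    return visible * 2 <= cluster_size
```

Note that the decision runs on the suspect node itself, so it only helps if 
that node is still healthy enough to run it in time.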

If you need extremely high reliability, think about using the heartbeat2 + 
ocfs2 version (it works in SLES10, for example, but I can't confirm its 
quality) or even a commercial version. Ideally (as with heartbeat*), if you 
have nodes 1, 2, 3 and nodes 1 and 2 decide that node 3 is dead, they run an 
external program (STONITH) which resets node 3 in hardware (most likely by a 
power reset) or isolates it from the disk system.

Without such a technique, there is always some chance (even if a very small 
one) that node 3 is not really dead and can do something bad (through its 
delayed I/O) before its own o2cb detects the loss of quorum and restarts the 
node itself.
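
As a rough sketch of the external-fencing flow described above (the 
surviving nodes agree, then an external program resets the suspect), 
something like the following Python could illustrate the idea. The names 
fence_if_agreed, votes and fence are hypothetical; fence stands in for 
whatever STONITH program is configured, and this is not heartbeat's real 
interface.

```python
from typing import Callable, Dict


def fence_if_agreed(suspect: str,
                    votes: Dict[str, bool],
                    cluster_size: int,
                    fence: Callable[[str], bool]) -> bool:
    """Fence `suspect` only if a strict majority of the cluster says it is dead.

    votes maps a node name to True if that node considers the suspect dead.
    fence is a hypothetical stand-in for the external STONITH program
    (power reset or cutting disk access); it returns True on success.
    """
    # The suspect's own opinion never counts toward the majority.
    dead_votes = sum(1 for node, dead in votes.items()
                     if dead and node != suspect)
    if dead_votes * 2 <= cluster_size:
        return False  # no majority: leave the node alone
    return fence(suspect)
```

For example, with three nodes, fence_if_agreed("node3", {"node1": True, 
"node2": True}, 3, fence) calls fence("node3"), while a single vote does 
not. The point is that the reset is done from outside, so it does not depend 
on node 3 cooperating.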



----- Original Message ----- 
From: "Kendall, Kim" <Kim_Kendall at inter-tel.com>
To: <ocfs2-users at oss.oracle.com>
Sent: Tuesday, July 24, 2007 1:05 PM
Subject: [Ocfs2-users] What does "shutting it down" mean


When one node in the cluster can't be seen by the others, the remaining
nodes chop it off at the knees. What are the remaining nodes in the
cluster actually doing when they say "shutting it down"?

From the logs:

Jul 23 18:33:42 appsdb3 kernel: o2net: connection to node appsdb4 (num 3) at 192.168.202.4:7777 has been idle for 10 seconds, shutting it down.

Jul 23 18:33:42 appsdb3 kernel: o2net: no longer connected to node appsdb4 (num 3) at 192.168.202.4:7777

