[Ocfs2-users] Catatonic nodes under SLES10
Alexei_Roudnev
Alexei_Roudnev at exigengroup.com
Tue Apr 10 12:52:11 PDT 2007
Moreover.
(If we have hijacked this thread already, can we proceed for one more day on it?)
A) The node is well behaved, it has just lost its connection to part of the
storage - no fencing is needed. The node simply stops issuing new requests and
remembers its new state, and the other nodes can recognize that
recovery/replay is needed. Nobody (especially not the fencing node) can
know which part of the last IO transaction reached the device (or
not) anyway. That is why IO transactions must make atomic
changes and preserve integrity. For exactly that reason there is also no need
to dequeue the last IO request - it must not be harmful anyway.
In such a case the node can ask other nodes to do IO on its behalf, and that can help in MANY cases (at least it can help to close objects if necessary).
No matter how you fence, the risk that _pending IO passes through_ is significant in such a case (the problem is that if the storage wakes up suddenly, it can complete old
writes which were sent long ago). Fencing may be necessary if there were (or are) pending writes to the file structures - heartbeat writes should be treated differently, and writes to plain file blocks are safe from the FS point of view.
But if you know for sure that all IO passed through (and was confirmed), then the risk is almost zero and you can avoid fencing.
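A minimal sketch of the fencing decision described above, under my own assumptions (the names `PendingIO` and `should_fence` are illustrative, not OCFS2 code): a node that lost part of its storage paths only needs fencing if an unacknowledged write to file-structure (metadata) blocks might still reach the device later.

```python
# Hypothetical sketch, not OCFS2 code: fence only when an unconfirmed
# metadata write could surface if the storage wakes up again.
from dataclasses import dataclass

@dataclass
class PendingIO:
    target: str        # "metadata" (file structures) or "data" (file blocks)
    acknowledged: bool # True once the storage confirmed the write

def should_fence(pending: list[PendingIO]) -> bool:
    """Writes to plain file blocks are safe from the FS point of view;
    only unacknowledged metadata writes make fencing necessary."""
    return any(io.target == "metadata" and not io.acknowledged
               for io in pending)

# All IO passed through and was confirmed -> risk is almost zero.
print(should_fence([PendingIO("metadata", True)]))   # False
# An in-flight metadata write may still reach the device later.
print(should_fence([PendingIO("metadata", False)]))  # True
```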
B) The node is totally broken (memory overwrite, interrupt deadlocks,
etc). In that case the node might not notice the failure, or it might
not be able to self-fence. This is the typical case for STONITH.
Interesting question - does handcheck_timer help in such cases?
C) The node is fine, only the heartbeat is delayed; or all nodes
are fine and the cache is in sync, and only a single disk storage path fails.
These cases should never lead to fencing (larger timeouts are only part of the
solution; a smarter quorum algorithm, like the one provided by heartbeat or
other cluster managers, is needed). This case used to happen quite often.
I guess increased timeouts make it a bit better, but it can only be reliably
solved by a cluster framework that considers more than the state
of a single network connection.
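To illustrate what "considers more than the state of a single network connection" could mean, here is a hedged sketch in the spirit of the heartbeat framework mentioned above (the function and its tie-break rule are my own assumption, not an existing algorithm): a node only gives up when it is clearly on the minority side across several independent links, instead of panicking on one lost connection.

```python
# Illustrative quorum sketch, not heartbeat/OCFS2 code.
def in_quorum(reachable_nodes: int, total_nodes: int,
              disk_heartbeat_ok: bool) -> bool:
    """Stay up with a strict network majority, or on a tie if the
    shared-disk heartbeat still works (disk breaks the tie)."""
    seen = reachable_nodes + 1          # count ourselves
    if seen * 2 > total_nodes:
        return True                     # strict majority
    if seen * 2 == total_nodes and disk_heartbeat_ok:
        return True                     # tie, but disk path is alive
    return False

# Two-node cluster, network link lost but disk heartbeat fine:
# condition C - no reason to panic.
print(in_quorum(reachable_nodes=0, total_nodes=2, disk_heartbeat_ok=True))
```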
One of the problems OCFSv2 has is that condition C happens too
often and that condition A is not recognizable at all. And of course
all these conditions are handled in the simplest way, with a panic, which
IMHO is not really helpful, for the reasons mentioned above.
I'd rather add 2 more conditions:
D) All nodes lose their connection to the storage disk. They should learn this from each other and wait
until at least one node restores the connection (in most cases all nodes will get the connection back at once).
E) A node loses all connections, to both the other nodes and the disks. It can avoid fencing ONLY if it has had no pending IO for a long time.
The problem is that A) and D) are not recognized at all, and C) happens too often.
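The conditions above can be sketched as one cluster-wide decision, assuming each node shares its own storage state over the network (all names here are hypothetical, this is not how OCFS2 is structured): if nobody can reach the storage, everyone waits (condition D); if only some nodes lost it, IO is redirected to a healthy node (condition A) instead of fencing.

```python
# Hedged sketch of the D)/A) distinction, not real cluster-manager code.
def cluster_action(storage_ok: dict[str, bool]) -> str:
    """Decide the cluster-wide action from each node's view of
    its own storage connection."""
    if all(storage_ok.values()):
        return "normal"
    if not any(storage_ok.values()):
        return "wait-for-storage"        # condition D: nobody fences
    return "recover-on-healthy-node"     # condition A: redirect IO / replay

# All nodes lost the storage at once -> just wait for it to come back.
print(cluster_action({"n1": False, "n2": False, "n3": False}))
# -> wait-for-storage
```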
Regards
Bernd
SEEBURGER AG
_______________________________________________
Ocfs2-users mailing list
Ocfs2-users at oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users