[Ocfs-users] Hard system restart when DRBD connection fails while in use

Sunil Mushran sunil.mushran at oracle.com
Sun Sep 7 17:53:07 PDT 2008


The fencing mechanism is meant to prevent disk corruption. If you extend the
disk heartbeat timeout to 2 days and a node then dies, the cluster will hang
for 2 days.

The timeout is configurable. Details are in the 1.2 FAQ and the 1.4 user's
guide.
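
To make the trade-off concrete, here is a minimal sketch in Python. It
assumes the relation given in the OCFS2 FAQ for the O2CB disk heartbeat,
O2CB_HEARTBEAT_THRESHOLD = (timeout in seconds / 2) + 1, with one heartbeat
every 2 seconds. The function names are hypothetical and not part of
ocfs2-tools, and the file where the value is actually set (commonly
/etc/sysconfig/o2cb or /etc/default/o2cb) should be checked against your
distribution.

```python
# Sketch only: assumes the O2CB disk heartbeat relation from the OCFS2 FAQ,
#   O2CB_HEARTBEAT_THRESHOLD = (timeout_in_seconds / 2) + 1
# The function names below are hypothetical, not part of ocfs2-tools.

HEARTBEAT_INTERVAL_SECS = 2  # assumed interval between disk heartbeats


def threshold_for_timeout(timeout_secs: int) -> int:
    """Return the O2CB_HEARTBEAT_THRESHOLD that gives roughly this timeout."""
    if timeout_secs < HEARTBEAT_INTERVAL_SECS:
        raise ValueError("timeout must be at least one heartbeat interval")
    return timeout_secs // HEARTBEAT_INTERVAL_SECS + 1


def timeout_for_threshold(threshold: int) -> int:
    """Return the heartbeat timeout in seconds implied by a threshold."""
    if threshold < 2:
        raise ValueError("threshold must be at least 2")
    return (threshold - 1) * HEARTBEAT_INTERVAL_SECS


if __name__ == "__main__":
    # Compare a one-minute timeout with the proposed two-day timeout.
    for label, secs in [("60 seconds", 60), ("2 days", 2 * 24 * 60 * 60)]:
        print(f"{label}: O2CB_HEARTBEAT_THRESHOLD={threshold_for_timeout(secs)}")
```

Under these assumptions a 2-day timeout corresponds to a threshold of 86401,
and the surviving nodes would wait that long before treating a dead node as
gone, which is the hang described above.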

Henri Cook wrote:
> Dear Sunil,
>
> It is OCFS2. I found the code: it's the self-fencing mechanism that
> simply reboots the node. If I alter the OCFS2 timeout, the reboot is
> delayed by that many seconds. It's a real shame; I'm going to have to
> try to work with it, probably by extending the node timeout to 2 days
> or something. With DRBD I don't see the need for OCFS2 to be rebooting
> or anything, really, as DRBD takes care of block device synchronisation.
> I just wish this behaviour was configurable!
>
> Henri
>
> Sunil Mushran wrote:
>
>> Repeat the test. This time run the following on Node A
>> after you have killed Node B.
>>
>> $ ps -e -o pid,stat,comm,wchan=WIDE-WCHAN-COLUMN
>>
>> If we are lucky we'll get to see where that process is waiting.
>>
>> Henri Cook wrote:
>>
>>> Hi all,
>>>
>>> I have two nodes (A+B) running a DRBD file system (using OCFS2) on
>>> /shared.
>>>
>>> If I start, say, an FTP file transfer to my DRBD /shared directory on
>>> node A, then reboot node B (the other machine in a Primary-Primary
>>> DRBD configuration) while the transfer is in progress, node A
>>> hard-restarts at about the same time that DRBD notices the connection
>>> with node B has been lost, crippling both machines for the time it
>>> takes to reboot. If the drive is inactive (i.e. nothing is being
>>> written to it), this does not occur.
>>>
>>> My question, then: could the OCFS2 tools be the source of these
>>> reboots? Is any such default action configured? If so, how would I
>>> go about investigating or altering it? There are no log entries
>>> about the reboot to speak of.
>>>
>>> The OS is Ubuntu Hardy (Server) 8.04 with ocfs2-tools 1.3.9-0ubuntu1.
>>>
>>> Thanks in advance,
>>>
>>> Henri
>>>
>>>
>>> _______________________________________________
>>> Ocfs-users mailing list
>>> Ocfs-users at oss.oracle.com
>>> http://oss.oracle.com/mailman/listinfo/ocfs-users