[Ocfs-users] Hard system restart when DRBD connection fails while in use

Sunil Mushran sunil.mushran at oracle.com
Sun Sep 7 17:43:49 PDT 2008


Repeat the test. This time run the following on Node A
after you have killed Node B.

$ ps -e -o pid,stat,comm,wchan=WIDE-WCHAN-COLUMN

If we are lucky, the WCHAN column will show where the stuck process is
waiting in the kernel.
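Since Node A may go down before the output can be read by hand, one option is to sample the command in a loop and append it to a local file. The following is only a sketch under assumptions: the log path /var/tmp/wchan.log and the helper names filter_dstate and snapshot are hypothetical, and the filter assumes the interesting process will be in uninterruptible sleep (STAT beginning with D).

```shell
#!/bin/sh
# Hypothetical log location; keep it off /shared so that writing the log
# cannot itself block on the OCFS2 volume.
LOG=${LOG:-/var/tmp/wchan.log}

# Keep the header line plus any process whose STAT field starts with D
# (uninterruptible sleep), which is where a blocked writer would likely be.
filter_dstate() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Append one timestamped snapshot to the log.
snapshot() {
    {
        date
        ps -e -o pid,stat,comm,wchan=WIDE-WCHAN-COLUMN | filter_dstate
    } >> "$LOG"
}

# Usage: uncomment to sample once per second until interrupted.
# while :; do snapshot; sleep 1; done
```

If the machine resets hard, the last few snapshots may not have reached the disk, so running the loop in an ssh session from a third machine and watching the output there may be more reliable.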

Henri Cook wrote:
> Hi all,
>
> I have two nodes (A+B) running a DRBD file system (using OCFS2) on /shared.
>
> If I start, say, an FTP file transfer to my DRBD /shared directory on node A, and then reboot node B (the other machine in a Primary-Primary DRBD configuration) while the transfer is in progress, node A stops at about the same time that DRBD notices the connection with node B has been lost. This cripples both machines for the time it takes to reboot. If the drive is inactive (i.e. nothing is being written to it), this does not occur.
>
> My question, then, is: could the OCFS2 tools be the source of these reboots? Is any such default action configured? If so, how would I go about investigating or altering it? There are no log entries about the reboot to speak of.
>
> OS is Ubuntu Hardy (Server) 8.04 and ocfs2-tools 1.3.9-0ubuntu1
>
> Thanks in advance,
>
> Henri
>
>
> _______________________________________________
> Ocfs-users mailing list
> Ocfs-users at oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs-users
>   



