[Ocfs2-users] 6 node cluster with unexplained reboots

Wed Aug 15 16:49:00 PDT 2007

On Mon, Aug 13, 2007 at 08:46:51AM -0700, Ulf Zimmermann wrote:
> Index 22: took 10003 ms to do waiting for write completion
> *** ocfs2 is very sorry to be fencing this system by restarting ***
> 
> There were no SCSI errors on the console or logs around the time of this
> reboot.

It looks like the heartbeat write took too long. As a first step, you might
want to raise the disk heartbeat timeouts on those systems. Run:

$ /etc/init.d/o2cb configure

on each node to do that. That won't hide any hardware problems, but if the
problem is just latency in getting the write to disk, it should help tune it
away.
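
For picking a value to enter at the configure prompt, here is a rough sketch
of the arithmetic. It assumes the relation given in the OCFS2 documentation,
where the disk heartbeat timeout in seconds is (O2CB_HEARTBEAT_THRESHOLD - 1) * 2.
The helper names hb_threshold and hb_timeout are made up for illustration and
are not part of the o2cb tools; check the relation against the docs for your
release before relying on it.

```shell
# Hypothetical helpers, assuming: timeout_seconds = (threshold - 1) * 2

# Convert a desired timeout in seconds to a heartbeat threshold.
# Odd values are rounded up so the timeout is never shorter than requested.
hb_threshold() {
    echo $(( ($1 + 1) / 2 + 1 ))
}

# Convert a heartbeat threshold back to the timeout in seconds.
hb_timeout() {
    echo $(( ($1 - 1) * 2 ))
}

hb_threshold 60   # prints 31
hb_timeout 31     # prints 60
```

Whatever value you settle on should be the same on every node in the cluster.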
	--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
mark.fasheh at oracle.com