[Ocfs2-users] Any ideas about ocfs2 fence scenario, thanks

Thu Jul 5 01:40:17 PDT 2012

On Wed, Jul 04, 2012 at 09:40:26AM +0000, Guozhonghua wrote:
> Hi, Everyone,
> 
> I have a question about a scenario using OCFS2.
> 
> Suppose I configure an OCFS2 cluster of three hosts that share two OCFS2 disks over iSCSI. The two disks come from two different iSCSI targets, and the targets are on different networks.
> 
> Host1
>   |--- /dev/sdb --- ocfs2 --- iSCSI target1, IP address 192.168.10.1
>   |--- /dev/sdc --- ocfs2 --- iSCSI target2, IP address 192.168.20.1
> 
> Host2
>   |--- /dev/sdb --- ocfs2 --- iSCSI target1, IP address 192.168.10.1
>   |--- /dev/sdc --- ocfs2 --- iSCSI target2, IP address 192.168.20.1
> 
> Host3
>   |--- /dev/sdb --- ocfs2 --- iSCSI target1, IP address 192.168.10.1
>   |--- /dev/sdc --- ocfs2 --- iSCSI target2, IP address 192.168.20.1
> 
> So host1, host2, and host3 are the nodes of the OCFS2 cluster, each using both OCFS2 disks over iSCSI, and together they form one cluster for HA.
> 
> Suppose Host1 fences itself, either because it loses its network connection to the other two nodes (host2 and host3) or because its heartbeat timestamp write to the iSCSI target behind sdb times out. Jobs that use sdb, such as virtual machines whose data files live on sdb, will be restarted because of the fence of Host1.
> At the same time, other jobs on Host1 that use sdc, such as a MySQL database, will also be restarted. Is there any way to avoid this?
> I mean that jobs running on a different disk should not be affected when OCFS2 fences the host. For example, the MySQL database on sdc should not restart because of a fence triggered by sdb.
> Also, is there any way to unmount or discard the sdb disk without restarting the host, so that the services using sdc are not affected?

The short answer is 'no'.  When sdb goes missing, the host has unclean
state.  If it is allowed to continue, it might later write its
out-of-date data to sdb when the network is healthy again.  This causes
data corruption.  Instead, we reboot the node.
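
As a rough illustration of the stale-write problem, here is a toy Python
model.  It is only a sketch: the class names, the block number, and the
version counter are all hypothetical and do not reflect OCFS2 internals.
It just shows that a node holding unflushed data can overwrite newer data
once it regains access, while a fenced (rebooted) node cannot.

```python
# Toy model only: every name here is hypothetical, not an OCFS2 API.

class SharedDisk:
    def __init__(self):
        self.blocks = {}  # block number -> (version, data)

    def write(self, block, version, data):
        self.blocks[block] = (version, data)


class Node:
    def __init__(self, name):
        self.name = name
        self.dirty = {}  # buffered writes not yet flushed to the shared disk

    def buffer_write(self, block, version, data):
        self.dirty[block] = (version, data)

    def flush(self, disk):
        for block, (version, data) in self.dirty.items():
            disk.write(block, version, data)
        self.dirty.clear()

    def fence(self):
        # A reboot discards all in-memory state, including dirty blocks.
        self.dirty.clear()


def run(fence_host1):
    disk = SharedDisk()
    host1, host2 = Node("host1"), Node("host2")

    # host1 holds unflushed data, then loses access to the disk.
    host1.buffer_write(7, 1, "host1-old")

    # The rest of the cluster carries on and writes newer data.
    host2.buffer_write(7, 2, "host2-new")
    host2.flush(disk)

    if fence_host1:
        host1.fence()

    # The network is healthy again; host1 flushes whatever it still holds.
    host1.flush(disk)
    return disk.blocks[7]
```

With run(True) the disk keeps host2's newer block.  With run(False) the
older block from host1 replaces it, which is the corruption described above.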

Joel

-- 

"But all my words come back to me
 In shades of mediocrity.
 Like emptiness in harmony
 I need someone to comfort me."

			http://www.jlbec.org/
			jlbec at evilplan.org