[Ocfs2-users] ESX 3.5 DRS and OCFS2 1.4.1-1

David Murphy evilangel81 at gmail.com
Thu Dec 4 19:19:12 PST 2008


We are getting:

Dec  4 17:19:41 web2 kernel: [9724159.177875] EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
Dec  4 17:19:41 web2 kernel: [9724159.463691] VMware hgfs: HGFS is disabled in the host
Dec  4 17:19:41 web2 kernel: [9724160.965637] OCFS2 Node Manager 1.3.3
Dec  4 17:19:41 web2 kernel: [9724161.033122] OCFS2 DLM 1.3.3
Dec  4 17:19:41 web2 kernel: [9724161.037686] OCFS2 DLMFS 1.3.3
Dec  4 17:19:41 web2 kernel: [9724161.038842] OCFS2 User DLM kernel interface loaded
Dec  4 17:19:41 web2 kernel: [9724171.616652] o2net: accepted connection from node rgapp1 (num 4) at 192.168.102.11:7777
Dec  4 17:19:41 web2 kernel: [9724171.722162] OCFS2 1.3.3
Dec  4 17:19:41 web2 kernel: [9724171.782112] ocfs2_dlm: Nodes in domain ("7D876A4B2EE14D0C8E1181E8DCF4237B"): 2
Dec  4 17:19:41 web2 kernel: [9724171.782345] ocfs2_dlm: Node 4 joins domain 7D876A4B2EE14D0C8E1181E8DCF4237B
Dec  4 17:19:41 web2 kernel: [9724171.782348] ocfs2_dlm: Nodes in domain ("7D876A4B2EE14D0C8E1181E8DCF4237B"): 2 4
Dec  4 17:19:41 web2 kernel: [9724171.782758] (4262,0):ocfs2_find_slot:268 slot 2 is already allocated to this node!
Dec  4 17:19:41 web2 kernel: [9724171.841264] (4262,0):ocfs2_check_volume:1662 File system was not unmounted cleanly, recovering volume.
Dec  4 17:19:41 web2 kernel: [9724171.841830] kjournald starting.  Commit interval 5 seconds
Dec  4 17:19:41 web2 kernel: [9724171.880229] ocfs2: Mounting device (8,17) on (node 2, slot 2) with ordered data mode.
Dec  4 17:19:43 web2 kernel: [9724175.991919] o2net: accepted connection from node app1 (num 6) at 192.168.102.10:7777
Dec  4 17:19:45 web2 kernel: [9724178.086781] VMware memory control driver initialized
Dec  4 17:19:46 web2 kernel: [9724178.235647] o2net: accepted connection from node deploy (num 5) at 192.168.102.12:7777
Dec  4 17:19:50 web2 kernel: [9724182.319762] ocfs2_dlm: Node 6 joins domain 7D876A4B2EE14D0C8E1181E8DCF4237B
Dec  4 17:19:50 web2 kernel: [9724182.319773] ocfs2_dlm: Nodes in domain ("7D876A4B2EE14D0C8E1181E8DCF4237B"): 2 4 6
Dec  4 17:19:50 web2 kernel: [9724182.598848] ocfs2_dlm: Node 5 joins domain 7D876A4B2EE14D0C8E1181E8DCF4237B
Dec  4 17:19:50 web2 kernel: [9724182.598853] ocfs2_dlm: Nodes in domain ("7D876A4B2EE14D0C8E1181E8DCF4237B"): 2 4 5 6
Dec  4 17:21:32 web2 syslogd 1.5.0#1ubuntu1: restart.

This completely froze the entire cluster. It happened when ESX tried to
VMotion 3 of the 6 nodes to a new host.

Does Oracle recommend not enabling DRS on virtual machines that are part of
the cluster, or is there a configuration we can use to keep crashes like this
from happening all the time?

I have seen several posts suggesting that disabling DRS would be a way to
work around this issue, but that is not really good practice, as you would
lose a lot of your HA abilities.

Also, is there a way to have OCFS2 drop a node from the cluster if a new node
comes online with the same node ID?

David Murphy
