[Ocfs2-users] sanity check - Xen+iSCSI+LVM+OCFS2 at dom0/domU

Alok Dhir adhir at symplicity.com
Thu Feb 7 10:30:27 PST 2008


Ah - thanks for the clarification.

I'm left with one perplexing problem - on one of the hosts, 'devxen0',  
o2cb refuses to start.  The box is configured identically to at least  
two other cluster hosts, and all were imaged the exact same way, except  
that devxen0 has 32GB RAM where the others have 16GB or less.

Any clues where to look?

--
[root@devxen0:~] service o2cb enable
Writing O2CB configuration: OK
Starting O2CB cluster ocfs2: Failed
Cluster ocfs2 created
Node beast added
o2cb_ctl: Internal logic failure while adding node devxen0

Stopping O2CB cluster ocfs2: OK
--

This is in syslog when this happens:

Feb  7 13:26:50 devxen0 kernel: (17194,6):o2net_open_listening_sock:1867 ERROR: unable to bind socket at 196.168.1.72:7777, ret=-99
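
(For reference: ret=-99 on Linux is EADDRNOTAVAIL, i.e. the kernel  
refused the bind because 196.168.1.72 is not an address assigned to  
any interface on this box.  A quick way to compare what cluster.conf  
asks for against what the box actually has -- illustrative commands,  
using the stock config path:

[root@devxen0:~] grep -B3 'name = devxen0' /etc/ocfs2/cluster.conf
[root@devxen0:~] ip addr show | grep 'inet '

The ip_address from the first command should show up in the output of  
the second.)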

--

Box config:

[root@devxen0:~] uname -a
Linux devxen0.symplicity.com 2.6.18-53.1.6.el5xen #1 SMP Wed Jan 23  
11:59:21 EST 2008 x86_64 x86_64 x86_64 GNU/Linux

--

Here is cluster.conf:

---
node:
	ip_port = 7777
	ip_address = 192.168.1.62
	number = 0
	name = beast
	cluster = ocfs2

node:
	ip_port = 7777
	ip_address = 196.168.1.72
	number = 1
	name = devxen0
	cluster = ocfs2

node:
	ip_port = 7777
	ip_address = 192.168.1.73
	number = 2
	name = devxen1
	cluster = ocfs2

node:
	ip_port = 7777
	ip_address = 192.168.1.74
	number = 3
	name = devxen2
	cluster = ocfs2

node:
	ip_port = 7777
	ip_address = 192.168.1.70
	number = 4
	name = fs1
	cluster = ocfs2

node:
	ip_port = 7777
	ip_address = 192.168.1.71
	number = 5
	name = fs2
	cluster = ocfs2

node:
	ip_port = 7777
	ip_address = 192.168.1.80
	number = 6
	name = vdb1
	cluster = ocfs2

cluster:
	node_count = 7
	name = ocfs2
---
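
(Side note: the same cluster.conf has to be present on every node, and  
each node's name is expected to match that host's hostname -- the short  
name, without the domain -- so o2cb can find its local stanza.  A rough  
consistency sweep, just a sketch assuming ssh access to each node:

for h in beast devxen0 devxen1 devxen2 fs1 fs2 vdb1; do
    ssh $h 'hostname; md5sum /etc/ocfs2/cluster.conf'
done

Every md5sum should come back identical.)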



On Feb 7, 2008, at 1:23 PM, Sunil Mushran wrote:

> Yes, but those were backported into what will be released as ocfs2  
> 1.4, which is not out yet.
> You are on ocfs2 1.2.
>
> Alok Dhir wrote:
>> I've seen that -- I was under the impression that some of those  
>> were being backported into the release kernels.
>>
>> Thanks,
>>
>> Alok
>>
>> On Feb 7, 2008, at 1:15 PM, Sunil Mushran wrote:
>>
>>> http://oss.oracle.com/projects/ocfs2/dist/documentation/ocfs2-new-features.html
>>>
>>> Alok Dhir wrote:
>>>> We were indeed using a self-built module due to the lack of an  
>>>> OSS one for the latest kernel.  Thanks for your response, I will  
>>>> test with the new version.
>>>>
>>>> What are we leaving on the table by not using the latest mainline  
>>>> kernel?
>>>>
>>>> On Feb 7, 2008, at 12:56 PM, Sunil Mushran wrote:
>>>>
>>>>> Are you building ocfs2 with this kernel, or are you using the ones
>>>>> we provide for RHEL5?
>>>>>
>>>>> I am assuming you have built it yourself as we did not release
>>>>> packages for the latest 2.6.18-53.1.6 kernel till last night.
>>>>>
>>>>> If you are using your own, then use the one from oss.
>>>>>
>>>>> If you are using the one from oss, then file a bugzilla with the
>>>>> full oops trace.
>>>>>
>>>>> Thanks
>>>>> Sunil
>>>>>
>>>>> Alok K. Dhir wrote:
>>>>>> Hello all - we're evaluating OCFS2 in our development  
>>>>>> environment to see if it meets our needs.
>>>>>>
>>>>>> We're testing it with an iSCSI storage array (Dell MD3000i) and  
>>>>>> 5 servers running CentOS 5.1 (2.6.18-53.1.6.el5xen).
>>>>>>
>>>>>> 1) Each of the 5 servers is running the CentOS 5.1 open-iscsi  
>>>>>> initiator, and sees the volumes exposed by the array just  
>>>>>> fine.  So far so good.
>>>>>>
>>>>>> 2) Created a volume group using the exposed iSCSI volumes and  
>>>>>> created a few LVM2 logical volumes.
>>>>>>
>>>>>> 3) vgscan; vgchange -a y; on all the cluster members.  All see  
>>>>>> the "md3000vg" volume group.  Looking good.  (We have no  
>>>>>> intention of changing the LVM2 configuration much, if at all,  
>>>>>> and can make sure all such changes are done while the volumes  
>>>>>> are offline on all cluster members, so theoretically this  
>>>>>> should not be a problem.)
>>>>>>
>>>>>> 4) mkfs.ocfs2 /dev/md3000vg/testvol0 -- works great
>>>>>>
>>>>>> 5) mount on all Xen dom0 boxes in the cluster, works great.
>>>>>>
>>>>>> 6) Create a VM on one of the cluster members, set up iSCSI, run  
>>>>>> vgscan, and md3000vg shows up -- looking good.
>>>>>>
>>>>>> 7) Install ocfs2, run 'service o2cb enable' -- starts up fine.   
>>>>>> mount /dev/md3000vg/testvol0 works fine.
>>>>>>
>>>>>> ** Thanks for making it this far -- this is where it gets  
>>>>>> interesting
>>>>>>
>>>>>> 8) Run 'iozone' in the domU against the ocfs2 share - BANG -  
>>>>>> immediate kernel panic, repeatable all day long.
>>>>>>
>>>>>>  "kernel BUG at fs/inode.c"
>>>>>>
>>>>>> So my questions:
>>>>>>
>>>>>> 1) should this work?
>>>>>>
>>>>>> 2) if not, what should we do differently?
>>>>>>
>>>>>> 3) Currently we're tracking the latest RHEL/CentOS 5.1 kernels  
>>>>>> -- would we have better luck using the latest mainline kernel?
>>>>>>
>>>>>> Thanks for any assistance.
>>>>>>
>>>>>> Alok Dhir
>>>>>>
>>>>>>



