[Ocfs2-users] New node..new problems

Dante Garro dante at bigbyte.com.ar
Fri Oct 10 07:31:01 PDT 2008


Thanks Tao, Luis... Soon I'll change the Debian node so I'll be happy
again.

  _____  

From: Luis Freitas [mailto:lfreitas34 at yahoo.com] 
Sent: Friday, October 10, 2008 10:34
To: Dante Garro
CC: 'ocfs2-users at oss.oracle.com'
Subject: Re: [Ocfs2-users] New node..new problems


Dante,

   Your old Debian node is running OCFS2 1.4 and your new CentOS node is
running OCFS2 1.2, right?

   If you are running CentOS 5.0, you should be able to install OCFS2 1.4. 

   If not, you will need to unmount the volume on the Debian node before
mounting it on the CentOS node. Beware that OCFS2 1.4 has features that are
not available in 1.2, which might impact your applications.

   Also, I am not sure the disk layout is fully compatible if certain OCFS2
1.4 filesystem options were enabled on your old cluster. The best option
would be to upgrade to OCFS2 1.4 on the CentOS cluster.

Regards,
Luis

--- On Fri, 10/10/08, Dante Garro <dante at bigbyte.com.ar> wrote:



From: Dante Garro <dante at bigbyte.com.ar>
Subject: Re: [Ocfs2-users] New node..new problems
To: "'Tao Ma'" <tao.ma at oracle.com>
Cc: "'ocfs2-users at oss.oracle.com'" <ocfs2-users at oss.oracle.com>
Date: Friday, October 10, 2008, 9:05 AM


Thanks Tao, I've set up the same value on both nodes and the cluster comes
online.
Now, when I try to mount, the following errors appear on node 1 (new
CentOS):
(2512,1):o2net_connect_expired:1585 ERROR: no connection established with
node 0 after 30.0 seconds, giving up and returning errors.
(3022,1):dlm_request_join:901 ERROR: status = -107
(3022,1):dlm_try_to_join_domain:1049 ERROR: status = -107
(3022,1):dlm_join_domain:1321 ERROR: status = -107
(3022,1):dlm_register_domain:1514 ERROR: status = -107
(3022,1):ocfs2_dlm_init:2024 ERROR: status = -107
(3022,1):ocfs2_mount_volume:1133 ERROR: status = -107
ocfs2: Unmounting device (147,0) on (node 1)

And the following on node 0 (old Debian):

 (2228,0):o2net_check_handshake:1093 node nodo2 (num 1) at
192.168.168.2:7777 advertised net protocol version 103 but 2 is required,
disconnecting

I believe the Debian message is clear: protocol version incompatibility.

Is there a way to resolve it?

Thanks

Dante
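
For anyone scanning similar logs, here is a minimal Python sketch that pulls the two useful facts out of the messages above: the advertised and required o2net protocol versions, and the meaning of "status = -107". It assumes the status values are negated Linux errno codes (107 is ENOTCONN on Linux). The function names and the small errno table are illustrative only and are not part of any OCFS2 tool.

```python
import re

# Assumption: kernel status values are negated Linux errno codes.
# 107 is ENOTCONN on Linux; extend the table as needed.
LINUX_ERRNO = {107: "ENOTCONN (Transport endpoint is not connected)"}

HANDSHAKE_RE = re.compile(
    r"advertised net protocol version (\d+) but (\d+) is required"
)
STATUS_RE = re.compile(r"ERROR: status = -(\d+)")


def parse_handshake(message):
    """Return (advertised, required) protocol versions, or None if absent."""
    # Mail clients wrap long log lines, so collapse all whitespace first.
    flat = " ".join(message.split())
    match = HANDSHAKE_RE.search(flat)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def describe_status(line):
    """Return a readable name for a 'status = -N' log line, or None."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    return LINUX_ERRNO.get(code, "errno %d" % code)


if __name__ == "__main__":
    node0_log = (
        "(2228,0):o2net_check_handshake:1093 node nodo2 (num 1) at\n"
        "192.168.168.2:7777 advertised net protocol version 103 but 2 is "
        "required,\ndisconnecting"
    )
    print(parse_handshake(node0_log))  # (103, 2)
    print(describe_status("(3022,1):dlm_request_join:901 ERROR: status = -107"))
```

Read this way, the chain of -107 errors on node 1 is a consequence of node 0 dropping the connection during the handshake, not a separate problem.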


-----Original Message-----
From: Tao Ma [mailto:tao.ma at oracle.com] 
Sent: Friday, October 10, 2008 1:25
To: Dante Garro
CC: 'Sunil Mushran'; 'ocfs2-users at oss.oracle.com'
Subject: Re: [Ocfs2-users] New node..new problems


Hi,
Dante Garro wrote:
> Sunil, I now realize the messages refer to node 0, but the new node is
> node 1, and regardless of the value I set, it always says 14000 ms.
> Does this change your diagnosis?
Node1 starts the connection with node0, so you see the messages related to
node0 on node1. It looks like your configuration on node1 is wrong.
Please make sure that the value of O2CB_HEARTBEAT_THRESHOLD in
/etc/sysconfig/o2cb on node1 is the same as that on node0.

Regards,
Tao
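
The check Tao describes can be sketched in Python as below. This is only an illustration under one stated assumption: that the "dead count" printed by o2hb_check_slot equals the threshold multiplied by a 2000 ms heartbeat interval. Under that assumption, 14000 ms corresponds to a threshold of 7 and 130000000 ms to a threshold of 65000. The function names and the sample config text are hypothetical; on a real system, read the actual config file from each node.

```python
import re

# Assumption: one heartbeat every 2000 ms, and the logged dead count is
# threshold * 2000 ms. Verify against your OCFS2 version before relying on it.
HEARTBEAT_INTERVAL_MS = 2000

THRESHOLD_RE = re.compile(r"^O2CB_HEARTBEAT_THRESHOLD=[\"']?(\d+)[\"']?")


def parse_threshold(config_text):
    """Return O2CB_HEARTBEAT_THRESHOLD from o2cb config text, or None."""
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if line.startswith("#"):
            continue  # skip commented-out settings
        match = THRESHOLD_RE.match(line)
        if match:
            return int(match.group(1))
    return None


def threshold_from_dead_count(dead_ms):
    """Convert a logged dead count in ms back to a threshold value."""
    return dead_ms // HEARTBEAT_INTERVAL_MS


def thresholds_match(config_a, config_b):
    """True only if both configs set the same explicit threshold."""
    a = parse_threshold(config_a)
    b = parse_threshold(config_b)
    return a is not None and a == b


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only.
    node0_cfg = "O2CB_ENABLED=true\nO2CB_HEARTBEAT_THRESHOLD=7\n"
    node1_cfg = "O2CB_ENABLED=true\nO2CB_HEARTBEAT_THRESHOLD=65000\n"
    print(thresholds_match(node0_cfg, node1_cfg))
    print(threshold_from_dead_count(14000))
    print(threshold_from_dead_count(130000000))
```

Note that editing the file is not enough on its own: as Sunil says further down, the cluster has to be restarted on the node for a new value to take effect.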

> 
> 
> -----Original Message-----
> From: Sunil Mushran [mailto:sunil.mushran at oracle.com]
> Sent: Thursday, October 9, 2008 06:02 p.m.
> To: Dante Garro
> CC: 'ocfs2-users at oss.oracle.com'
> Subject: Re: [Ocfs2-users] New node..new problems
> 
> Yeah the cluster timeouts are not consistent. Update and restart the 
> cluster on the new node (or all nodes as the case might be).
> 
> Hint: cat /sys/kernel/config/cluster/<clustername>/idle_timeout_ms
> to see the active heartbeat threshold.
> 
> Dante Garro wrote:
>> Hi all, because of problems with the ocfs2 release in the Debian
>> distribution, I decided to rebuild my cluster, replacing it with a
>> CentOS-based installation.
>> I started by replacing one of the nodes while keeping the other working.
>>
>> On this recently created node the following errors appear:
>>
>> drbd0: Writing meta data super block now.
>> (2558,1):o2hb_check_slot:881 ERROR: Node 0 on device drbd0 has a dead 
>> count of 14000 ms, but our count is 130000000 ms.
>> Please double check your configuration values for
>> 'O2CB_HEARTBEAT_THRESHOLD'
>> OCFS2 1.2.9 Wed Sep 24 19:26:41 PDT 2008 (build
>> a693806cb619dd7f225004092b675ede)
>> (2520,1):o2net_connect_expired:1585 ERROR: no connection established 
>> with node 0 after 30.0 seconds, giving up and returning errors.
>> (2556,1):dlm_request_join:901 ERROR: status = -107
>> (2556,1):dlm_try_to_join_domain:1049 ERROR: status = -107
>> (2556,1):dlm_join_domain:1321 ERROR: status = -107
>> (2556,1):dlm_register_domain:1514 ERROR: status = -107
>> (2556,1):ocfs2_dlm_init:2024 ERROR: status = -107
>> (2556,1):ocfs2_mount_volume:1133 ERROR: status = -107
>> ocfs2: Unmounting device (147,0) on (node 1)
>> (2591,1):o2hb_check_slot:881 ERROR: Node 0 on device drbd0 has a dead 
>> count of 14000 ms, but our count is 130000000 ms.
>> Please double check your configuration values for
>> 'O2CB_HEARTBEAT_THRESHOLD'
>> (2520,1):o2net_connect_expired:1585 ERROR: no connection established 
>> with node 0 after 30.0 seconds, giving up and returning errors.
>> (2589,1):dlm_request_join:901 ERROR: status = -107
>> (2589,1):dlm_try_to_join_domain:1049 ERROR: status = -107
>> (2589,1):dlm_join_domain:1321 ERROR: status = -107
>> (2589,1):dlm_register_domain:1514 ERROR: status = -107
>> (2589,1):ocfs2_dlm_init:2024 ERROR: status = -107
>> (2589,1):ocfs2_mount_volume:1133 ERROR: status = -107
>> ocfs2: Unmounting device (147,0) on (node 1)
>>
>> I've changed the O2CB_HEARTBEAT_THRESHOLD parameter as the O2CB message
>> advised, but it didn't resolve the issue.
>>
>> I hope someone can give me a clue.
>>
>> Thanks in advance.
>>
>> Dante
>>
>>
>> _______________________________________________
>> Ocfs2-users mailing list
>> Ocfs2-users at oss.oracle.com
>> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>>   
> 

