[Ocfs2-users] Replication not works

Randy Ramsdell rramsdell at livedatagroup.com
Mon Aug 13 14:35:48 PDT 2007


Yohan wrote:
> Randy Ramsdell wrote:
>> Yohan wrote:
>>  
>>> Randy Ramsdell wrote:
>>>    
>>>> Yohan wrote:
>>>>  
>>>>      
>>>>> Hi,
>>>>>
>>>>>    I have 2 nodes.
>>>>> I do:
>>>>>
>>>>> /etc/init.d/ocfs2 load
>>>>> /etc/init.d/ocfs2 online ocfs2
>>>>>
>>>>> I enabled all the debug options via debugfs.ocfs2,
>>>>> and:
>>>>>
>>>>> - t1 never tries to connect to t2...
>>>>> - when I telnet to t2 from t1, t2 says:
>>>>> "
>>>>> (5679,2):o2hb_check_node_heartbeating_from_callback:1837 node (6) does
>>>>> not have heartbeating enabled.
>>>>> (5679,2):o2net_accept_one:1720 attempt to connect from node 't1' at
>>>>> 172.20.245.231:33032 but it isn't heartbeating
>>>>>             
>>>> I think you forgot /etc/init.d/o2cb start.  "/etc/init.d/ocfs2" only
>>>> handles mounting the cluster volume.
>>>> You may have to do: "/etc/init.d/o2cb configure" then start.
>>>>
>>>> rcr
>>>>         
>>> That was just a mistake in my first mail.
>>>
>>> We do:
>>> /etc/init.d/o2cb load
>>> /etc/init.d/o2cb online ocfs2
>>> & mount.ocfs2 /dev/sda3 /mnt/sda3/
>> Well actually I made a mistake. No doubt you cannot mount without first
>> dealing with o2cb.
>>
>> I haven't seen this issue before, but maybe provide the output for o2cb
>> status and ocfs2 status. After that, maybe Sunil would help.
>>   
> After load / online / mount on the 2 nodes:
>
> root@t1:~# /etc/init.d/o2cb status
> Module "configfs": Loaded
> Filesystem "configfs": Mounted
> Module "ocfs2_nodemanager": Loaded
> Module "ocfs2_dlm": Loaded
> Module "ocfs2_dlmfs": Loaded
> Filesystem "ocfs2_dlmfs": Mounted
> Checking O2CB cluster ocfs2: Online
> Heartbeat dead threshold = 7
>  Network idle timeout: 10000
>  Network keepalive delay: 5000
>  Network reconnect delay: 2000
> Checking O2CB heartbeat: Active
>
> root@t1:~# /etc/init.d/ocfs2 status
> Active OCFS2 mountpoints:  /mnt/sda3
>
> root@t2:~# /etc/init.d/o2cb status
> Module "configfs": Loaded
> Filesystem "configfs": Mounted
> Module "ocfs2_nodemanager": Loaded
> Module "ocfs2_dlm": Loaded
> Module "ocfs2_dlmfs": Loaded
> Filesystem "ocfs2_dlmfs": Mounted
> Checking O2CB cluster ocfs2: Online
> Heartbeat dead threshold = 7
>  Network idle timeout: 10000
>  Network keepalive delay: 5000
>  Network reconnect delay: 2000
> Checking O2CB heartbeat: Active
>
> root@t2:~# /etc/init.d/ocfs2 status
> Active OCFS2 mountpoints:  /mnt/sda3
>
> I tried many things, like:
>
> telnet t1 7777 from t2 => error, as expected (
> (3091,0):o2net_accept_one:1709 unexpected connect attempted from a
> lower numbered node 't2' at 172.20.245.232:57991 with num 3 )
>
This looks normal, but I do not understand why the "node number" is
3. I used node numbers (0,1,2,3,4,5) in order.

> telnet t2 7777 from t1 => error in the debug logs:
> (2968,3):o2hb_check_node_heartbeating_from_callback:1837 node (6) does
> not have heartbeating enabled.
> (2968,3):o2net_accept_one:1720 attempt to connect from node 't1' at
> 172.20.245.231:42944 but it isn't heartbeating
>
> I think that is the problem, but I can't understand why.
This is the kicker, the real issue.

A stab in the dark: change cluster.conf so that the nodes are numbered
in order, starting from zero. So your nodes will be (0,1). After that I
really don't know. The devs really need to come into this.
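
For illustration, a two-node cluster.conf numbered that way might look
like the sketch below. The node names, IP addresses, port and cluster
name are taken from the output quoted above. The stanza layout, with
each key indented by a tab, is the usual /etc/ocfs2/cluster.conf format
as far as I know. This is untested on your setup, so treat it as a
starting point only.

```
cluster:
	node_count = 2
	name = ocfs2

node:
	ip_port = 7777
	ip_address = 172.20.245.231
	number = 0
	name = t1
	cluster = ocfs2

node:
	ip_port = 7777
	ip_address = 172.20.245.232
	number = 1
	name = t2
	cluster = ocfs2
```

The file should be identical on both nodes, and the cluster most likely
has to be taken offline and brought online again before the new numbers
take effect.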



