[Ocfs2-users] problem stopping o2cb service on one of nodes

Nikola Ciprich extmaillist at linuxbox.cz
Fri Apr 3 13:33:30 PDT 2009


Hi Sunil,
thanks for the reply.
I don't observe any segfaults.
Regarding the info you asked for: as I wrote, umount doesn't decrease the refcount:

[root@vbox4 ~]# ocfs2_hb_ctl -I -d /dev/vgshared/lvs
2A5D351D0A934061BBC6B5392A30187E: 1 refs
[root@vbox4 ~]# umount /home/LVS
[root@vbox4 ~]# ocfs2_hb_ctl -I -d /dev/vgshared/lvs
2A5D351D0A934061BBC6B5392A30187E: 1 refs
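
For anyone who wants to repeat this check, here is a minimal sketch that wraps the two commands above. The function names (hb_refs_from_line, hb_refs, check_umount_refs) are made up for illustration and are not part of ocfs2-tools. The parsing assumes the "<UUID>: N refs" line format seen in the output above; what ocfs2_hb_ctl prints once the count reaches zero is not shown in this thread, so an empty result is simply treated as 0.

```shell
#!/bin/sh
# Extract the reference count from a line such as
# "2A5D351D0A934061BBC6B5392A30187E: 1 refs".
# Prints nothing if the line does not match that (assumed) format.
hb_refs_from_line() {
    printf '%s\n' "$1" | sed -n 's/^[0-9A-Fa-f]\{1,\}: \([0-9]\{1,\}\) refs$/\1/p'
}

# Query the heartbeat reference count for a device (hypothetical helper).
hb_refs() {
    hb_refs_from_line "$(ocfs2_hb_ctl -I -d "$1")"
}

# Unmount and report whether the reference count dropped.
# $1 = device, $2 = mount point.
check_umount_refs() {
    dev=$1
    mnt=$2
    before=$(hb_refs "$dev")
    umount "$mnt" || return 1
    after=$(hb_refs "$dev")
    echo "refs before umount: ${before:-0}, after: ${after:-0}"
    # Succeeds only if the count went down (empty output counts as 0).
    [ "${after:-0}" -lt "${before:-0}" ]
}
```

With the device and mount point from above, this would be called as: check_umount_refs /dev/vgshared/lvs /home/LVS. A non-zero exit status means the count did not drop, which is what I see on this node.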

nik

On Fri, Apr 03, 2009 at 10:21:33AM -0700, Sunil Mushran wrote:
> umount is supposed to stop the heartbeat. In bz1053, ocfs2_hb_ctl was  
> segfaulting.
> Are you seeing any segfaults or any other errors during umount?
>
> Also, run the following before and after umount:
> $ ocfs2_hb_ctl -I -d /dev/sdX o2cb
>
> Email me the output.
>
> Nikola Ciprich wrote:
>> Hello Tao,
>> and thanks a lot for the reply!
>> It doesn't seem to be the same bug; at least, applying the patch didn't help.
>> Stopping the heartbeat with the -K parameter really helps, but why doesn't this happen automatically
>> on umount?
>> It always happens on the second node.
>> I don't see any errors in the logs.
>> The reference count always increases on mount and doesn't decrease on umount on this node.
>>
>>
>> On Fri, Apr 03, 2009 at 10:58:18AM +0800, Tao Ma wrote:
>>   
>>> Hi Nikola,
>>>
>>> Nikola Ciprich wrote:
>>>     
>>>> Hi,
>>>> I'm trying ocfs2 on a RHEL5 distro with a 2.6.29 kernel and ocfstools-1.4.1. I'm using DRBD in primary/primary mode
>>>> as shared storage.
>>>>
>>>> I've configured the service according to the quickstart document, and everything works,
>>>> but after I umount the fs on both nodes, stopping the o2cb service on one of the nodes always
>>>> fails with:
>>>>
>>>> [root@vbox4 sysconfig]# /etc/rc.d/init.d/o2cb stop
>>>> Stopping O2CB cluster vb34: Failed
>>>> Unable to stop cluster as heartbeat region still active
>>>>       
>>> It looks like your disk heartbeat is still there. I don't know the
>>> specific reason, maybe
>>> http://oss.oracle.com/bugzilla/show_bug.cgi?id=1053 ?
>>>
>>> But you can stop it manually.
>>> 1.  ocfs2_hb_ctl -I -d <device>
>>>   or  ocfs2_hb_ctl -I -u <uuid>
>>> This will tell you the reference count for the heartbeat.
>>> 2.  ocfs2_hb_ctl -K -d <device> <service>
>>>   or  ocfs2_hb_ctl -K -u <uuid> <service>
>>> This will kill the heartbeat manually.
>>> <service> is the stack you use, and it should be "o2cb" in your case.
>>>
>>> btw, you can try ocfs2_hb_ctl -K -u <uuid> <service> to see whether it
>>> is the same problem as bug 1053.
>>>
>>> Regards,
>>> Tao
>>>     
>

-- 
-------------------------------------
Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava

tel.:   +420 596 603 142
fax:    +420 596 621 273
mobil:  +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: servis at linuxbox.cz
-------------------------------------


