[Ocfs2-users] problem stopping o2cb service on one of nodes

Sat Apr 4 01:24:18 PDT 2009

Hi,
it says:
/sbin/ocfs2_hb_ctl
on both nodes, which is correct - the binary is there...
n.
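
For anyone repeating this check, here is a minimal sketch that reads the helper path from hb_ctl_path and confirms the binary is executable. The function name check_hb_ctl_path is hypothetical; only the proc path comes from this thread. It takes the proc file as an optional argument so it can be tried against a scratch file.

```shell
#!/bin/sh
# Sketch only: check_hb_ctl_path is a hypothetical helper, not part of ocfs2-tools.
# It reads the helper path from the given file (default: the proc entry
# mentioned in this thread) and checks that the path is an executable.
check_hb_ctl_path() {
    proc_file="${1:-/proc/sys/fs/ocfs2/nm/hb_ctl_path}"
    if [ ! -r "$proc_file" ]; then
        echo "cannot read $proc_file" >&2
        return 2
    fi
    helper="$(cat "$proc_file")"
    if [ -x "$helper" ]; then
        echo "ok: $helper"
        return 0
    fi
    echo "missing or not executable: $helper" >&2
    return 1
}
```

Running check_hb_ctl_path with no argument on each node should print "ok:" followed by the helper path if the setup matches what is described above.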

On Fri, Apr 03, 2009 at 02:27:34PM -0700, Sunil Mushran wrote:
> Do:
> $ cat /proc/sys/fs/ocfs2/nm/hb_ctl_path
>
>
> Nikola Ciprich wrote:
>> Hi Sunil,
>> thanks for the reply.
>> I don't observe any segfaults.
>> Regarding the info you want: as I wrote, umount doesn't decrease the refcount:
>>
>> [root@vbox4 ~]# ocfs2_hb_ctl -I -d /dev/vgshared/lvs
>> 2A5D351D0A934061BBC6B5392A30187E: 1 refs
>> [root@vbox4 ~]# umount /home/LVS
>> [root@vbox4 ~]# ocfs2_hb_ctl -I -d /dev/vgshared/lvs
>> 2A5D351D0A934061BBC6B5392A30187E: 1 refs
>>
>> nik
>>
>> On Fri, Apr 03, 2009 at 10:21:33AM -0700, Sunil Mushran wrote:
>>   
>>> umount is supposed to stop the heartbeat. In bz1053, ocfs2_hb_ctl was
>>> segfaulting.
>>> Are you seeing any segfaults or any other errors during umount?
>>>
>>> Also, run the following before and after umount:
>>> $ ocfs2_hb_ctl -I -d /dev/sdX o2cb
>>>
>>> Email me the output.
>>>
>>> Nikola Ciprich wrote:
>>>     
>>>> Hello Tao,
>>>> and thanks a lot for the reply!
>>>> It doesn't seem to be the same bug; at least, applying the patch didn't help.
>>>> Stopping the heartbeat with the -K parameter really helps, but why doesn't this
>>>> happen automatically on umount?
>>>> The problem always occurs on the second node.
>>>> I don't see any errors in the logs.
>>>> But the reference count always increases on mount and doesn't decrease on umount on this node.
>>>>
>>>>
>>>> On Fri, Apr 03, 2009 at 10:58:18AM +0800, Tao Ma wrote:
>>>>         
>>>>> Hi Nikola,
>>>>>
>>>>> Nikola Ciprich wrote:
>>>>>             
>>>>>> Hi,
>>>>>> I'm trying ocfs2 on a RHEL5 distro with a 2.6.29 kernel and ocfs2-tools-1.4.1.
>>>>>> I'm using DRBD in primary/primary mode as shared storage.
>>>>>>
>>>>>> I've configured the service according to the quickstart document, and everything works,
>>>>>> but when I umount the fs on both nodes, stopping the o2cb service on one of the nodes
>>>>>> always fails with:
>>>>>>
>>>>>> [root@vbox4 sysconfig]# /etc/rc.d/init.d/o2cb stop
>>>>>> Stopping O2CB cluster vb34: Failed
>>>>>> Unable to stop cluster as heartbeat region still active
>>>>>>                 
>>>>> It looks like your disk heartbeat is still there. I don't know the
>>>>> specific reason; maybe
>>>>> http://oss.oracle.com/bugzilla/show_bug.cgi?id=1053 ?
>>>>>
>>>>> But you can stop it manually.
>>>>> 1.  ocfs2_hb_ctl -I -d <device>
>>>>>   or  ocfs2_hb_ctl -I -u <uuid>
>>>>> This will tell you the reference count for the heartbeat.
>>>>> 2.  ocfs2_hb_ctl -K -d <device> <service>
>>>>>   or  ocfs2_hb_ctl -K -u <uuid> <service>
>>>>> This will kill the heartbeat manually.
>>>>> <service> is the stack you use; it should be "o2cb" in your case.
>>>>>
>>>>> Btw, you can try ocfs2_hb_ctl -K -u <uuid> <service> to see
>>>>> whether it is the same problem as bug 1053.
>>>>>
>>>>> Regards,
>>>>> Tao
>>>>>             
>>
>>   
>
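
The before/after refcount comparison quoted above can be scripted roughly as follows. This is only a sketch: hb_refs and refs_dropped are hypothetical helper names, and the parsing assumes the "UUID: N refs" output format shown in this thread, which may differ in other ocfs2-tools versions.

```shell
#!/bin/sh
# Sketch only: both functions are hypothetical helpers, not part of ocfs2-tools.

# hb_refs: read "ocfs2_hb_ctl -I" output on stdin and print the reference
# count, assuming a line of the form "<uuid>: <N> refs" as seen in the thread.
hb_refs() {
    sed -n 's/^[^:]*: *\([0-9][0-9]*\) refs.*$/\1/p'
}

# refs_dropped BEFORE AFTER: succeed only if the count decreased.
refs_dropped() {
    [ "$2" -lt "$1" ]
}

# Example use on a node (device and mount point taken from the thread):
#   before="$(ocfs2_hb_ctl -I -d /dev/vgshared/lvs | hb_refs)"
#   umount /home/LVS
#   after="$(ocfs2_hb_ctl -I -d /dev/vgshared/lvs | hb_refs)"
#   refs_dropped "$before" "$after" || echo "heartbeat still referenced: $after refs"
```

If refs_dropped fails after umount, the heartbeat region is presumably still held, which would match the "heartbeat region still active" failure when stopping o2cb.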

-- 
-------------------------------------
Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava

tel.:   +420 596 603 142
fax:    +420 596 621 273
mobil:  +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: servis at linuxbox.cz
-------------------------------------