[Ocfs2-users] Heartbeat stays active & stops o2cb shutdown

Sunil Mushran sunil.mushran at oracle.com
Tue Jun 28 19:41:09 PDT 2011


Manually umounting /dlm is not a good idea. Let the o2cb script handle that.
It'll be easier to diagnose if you follow the steps I listed.
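
The diagnostic steps referred to here (quoted in full further down) can be collected into a small shell sketch. It only lists the heartbeat regions and reports their reference counts; it does not stop anything. The variables CONFIGFS_ROOT and HB_CTL and all function names are hypothetical, introduced for illustration. The configfs path and the "ocfs2_hb_ctl -I -u" call are the ones from the thread, and the parser assumes the "<UUID>: N refs" output format shown there.

```shell
#!/bin/sh
# Sketch only. CONFIGFS_ROOT and HB_CTL are hypothetical override knobs;
# their defaults are the path and tool named in the thread.
CONFIGFS_ROOT="${CONFIGFS_ROOT:-/sys/kernel/config}"
HB_CTL="${HB_CTL:-ocfs2_hb_ctl}"

# Print the UUID of every heartbeat region directory of cluster $1.
list_hb_regions() {
    for dir in "$CONFIGFS_ROOT/cluster/$1/heartbeat"/*/; do
        [ -d "$dir" ] || continue    # the glob matched nothing
        basename "$dir"
    done
}

# Extract N from a line of the assumed form "<UUID>: N refs".
parse_refs() {
    printf '%s\n' "$1" | sed -n 's/^[^:]*: *\([0-9][0-9]*\) refs.*$/\1/p'
}

# For each region of cluster $1, print "<UUID> refs=<N>".
report_hb_refs() {
    list_hb_regions "$1" | while read -r uuid; do
        out=$("$HB_CTL" -I -u "$uuid") || {
            echo "$uuid: query failed" >&2
            continue
        }
        refs=$(parse_refs "$out")
        echo "$uuid refs=${refs:-unknown}"
    done
}
```

Run as root after sourcing the functions, for example "report_hb_refs clustername". A region that still reports more than 0 refs with nothing mounted is the case for which "ocfs2_hb_ctl -K -u <UUID>" was suggested.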

On 06/28/2011 06:41 PM, Shave, Chris wrote:
> Thanks for the info on how to shut the heartbeat down.
> I haven't had a chance to test it yet.
> I had previously found info about /dlm being mounted, so I unmounted it on
> both nodes, but the heartbeat still stayed active.
> Also, after a reboot, offlining the cluster no longer returned an error,
> but a status check still indicated the heartbeat was active, and an
> attempt to unload then threw the "heartbeat active" error again.
>
> *Christopher Shave*, Global UNIX/Linux Projects Team
> *Marsh & McLennan Companies*
> Global Technology Infrastructure (MGTI) | Centralised Operations
> 555 Lonsdale Street, Level 5, Melbourne, VIC 3000, Australia
> +61 3 9623 5488 | Mobile +61 0402 885 057 | chris.shave at mercer.com
> www.mmc.com
> Working Hours:
> Mon-Fri: 8:00am-4:00pm AEST
>
>
> ------------------------------------------------------------------------
> *From:* Sunil Mushran [mailto:sunil.mushran at oracle.com]
> *Sent:* Tuesday, 28 June 2011 3:16 AM
> *To:* Shave, Chris
> *Cc:* ocfs2-users at oss.oracle.com
> *Subject:* Re: [Ocfs2-users] Heartbeat stays active & stops o2cb shutdown
>
> So by default, the hb is supposed to stop on umount.
>
> Do:
> # find /sys/kernel/config/cluster/<CLUSTERNAME>/heartbeat/* -type d | xargs -n1 basename
> 77D95EF51C0149D2823674FCC162CF8B
>
> This will list the active heartbeats.
>
> For each hb, do:
> # ocfs2_hb_ctl -I -u 77D95EF51C0149D2823674FCC162CF8B
> 77D95EF51C0149D2823674FCC162CF8B: 1 refs
>
> Note the reference count. Anything above 0 means the heartbeat is active.
>
> If you are sure there are no mounts and "ls /dlm" also shows no entries,
> then the hb failed to stop for some reason.
>
> To stop, do:
> # ocfs2_hb_ctl -K -u 77D95EF51C0149D2823674FCC162CF8B
>
> It could be that this is failing. What do you see?
>
> I remember we had a problem with this in tools 1.4.1, but that was
> fixed in 1.4.2.
>
> Sunil
>
> On 06/25/2011 06:03 PM, Shave, Chris wrote:
>> Hi,
>> I have an issue with shutting down o2cb and offlining the cluster: the
>> heartbeat stays active and blocks any attempt to shut it down, despite
>> there being zero ocfs2 filesystems mounted.
>> This is what I see; it happens even with the force-offline option:
>> [root]# /etc/init.d/o2cb force-offline clustername
>> Stopping O2CB cluster clustername Failed
>> Unable to stop cluster as heartbeat region still active
>> I have no ocfs2 filesystems currently mounted on either node (2-node
>> cluster):
>> [root]# mount | grep ocfs
>> [root]#
>> Versions of ocfs2 as below:
>> [root]# rpm -qa | grep ocfs
>> ocfs2-tools-1.4.4-1.el5.x86_64
>> ocfs2-tools-devel-1.4.4-1.el5.x86_64
>> ocfs2console-1.4.4-1.el5.x86_64
>> ocfs2-2.6.18-128.el5-1.4.4-1.el5.x86_64
>> ocfs2-tools-debuginfo-1.4.4-1.el5.x86_64
>> Red Hat Linux kernel version: 2.6.18-128.el5
>> A colleague of mine said that he usually disables everything ocfs2-related
>> in the startup scripts, comments out the filesystems in /etc/fstab, and
>> reboots. Is there another option to get the heartbeat offline, or is
>> this an ocfs2 or Linux bug I am encountering here?
>> Cheers,
