[Ocfs2-users] Heartbeat stays active & stops o2cb shutdown

Shave, Chris Chris.Shave at mercer.com
Tue Jun 28 18:41:42 PDT 2011


Thanks for the info on how to shut the heartbeat down.

I haven't had a chance to test it as of yet.

I had previously found info on /dlm being mounted; I unmounted it on both nodes, and the heartbeat still stayed active.

Also, after a reboot, offlining the cluster no longer returned an error, but a status check still showed the heartbeat as active, and an attempt to unload then failed with the heartbeat-active error.



Christopher Shave, Global UNIX/Linux Projects Team
Marsh & McLennan Companies
Global Technology Infrastructure (MGTI) | Centralised Operations
555 Lonsdale Street, Level 5, Melbourne, VIC 3000, Australia
+61 3 9623 5488 | Mobile +61 0402 885 057 | chris.shave at mercer.com
www.mmc.com
Working Hours:
Mon-Fri: 8:00am-4:00pm AEST



________________________________
From: Sunil Mushran [mailto:sunil.mushran at oracle.com]
Sent: Tuesday, 28 June 2011 3:16 AM
To: Shave, Chris
Cc: ocfs2-users at oss.oracle.com
Subject: Re: [Ocfs2-users] Heartbeat stays active & stops o2cb shutdown

So by default, the hb is supposed to stop on umount.

Do:
# find /sys/kernel/config/cluster/<CLUSTERNAME>/heartbeat/* -type d | xargs -n1 basename
77D95EF51C0149D2823674FCC162CF8B

This will list the active heartbeats.
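
If there may be more than one region, the same listing can be wrapped in a small helper. This is only a sketch: the function name list_hb_regions and the CONFIGFS_ROOT override are assumptions made here for illustration (the override exists so the function can be pointed at a test directory); on a live node configfs is normally at /sys/kernel/config.

```shell
# Print the heartbeat region UUIDs registered for one o2cb cluster.
# CONFIGFS_ROOT is a hypothetical override for testing; it defaults to
# the usual configfs mount point.
list_hb_regions() {
    cluster="$1"
    root="${CONFIGFS_ROOT:-/sys/kernel/config}"
    hb_dir="$root/cluster/$cluster/heartbeat"
    [ -d "$hb_dir" ] || return 1
    # The trailing slash makes the glob match directories only, so plain
    # attribute files in the heartbeat directory are skipped.
    for region in "$hb_dir"/*/; do
        [ -d "$region" ] || continue
        basename "$region"
    done
}
```

Calling list_hb_regions with the cluster name prints one UUID per line, which can then be fed to ocfs2_hb_ctl.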

For each hb, do:
# ocfs2_hb_ctl -I -u 77D95EF51C0149D2823674FCC162CF8B
77D95EF51C0149D2823674FCC162CF8B: 1 refs

Note the reference count. Anything greater than 0 means the heartbeat is active.
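
To check the count from a script rather than by eye, something like the following might do. It is a rough sketch that assumes the output always has the "UUID: N refs" shape shown above; the function names hb_refs_from_line and hb_is_active are made up for this example.

```shell
# Extract the reference count from a line such as
#   77D95EF51C0149D2823674FCC162CF8B: 1 refs
# Prints nothing if the line does not have that shape.
hb_refs_from_line() {
    printf '%s\n' "$1" | sed -n 's/^[^:]*: *\([0-9][0-9]*\) refs.*$/\1/p'
}

# Succeeds (exit 0) when the reported reference count is greater than 0.
hb_is_active() {
    refs=$(hb_refs_from_line "$1")
    [ -n "$refs" ] && [ "$refs" -gt 0 ]
}

# Example use on a live node (not run here):
#   hb_is_active "$(ocfs2_hb_ctl -I -u "$uuid")" && echo "$uuid is still active"
```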

If you are sure there are no mounts and "ls /dlm" also shows no entries,
then hb failed to stop for some reason.
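
Both conditions can be checked in a script before touching the heartbeat. The sketch below assumes the mounts table is /proc/mounts and that dlmfs lives at /dlm as in this thread; the function names and the optional path arguments are illustrative only (the arguments exist so the checks can be tried against test files).

```shell
# Succeeds (exit 0) when no ocfs2 filesystem appears in the mounts table.
# The optional argument defaults to /proc/mounts.
no_ocfs2_mounts() {
    mounts_file="${1:-/proc/mounts}"
    # Field 3 of each line is the filesystem type; awk exits 0 if an
    # ocfs2 entry was found, and the leading "!" inverts that.
    ! awk '$3 == "ocfs2" { found = 1 } END { exit !found }' "$mounts_file"
}

# Succeeds when the dlmfs directory has no entries.
# A missing directory is treated as empty.
dlm_dir_empty() {
    dlm_dir="${1:-/dlm}"
    [ -z "$(ls -A "$dlm_dir" 2>/dev/null)" ]
}

# Example use on a live node (not run here):
#   no_ocfs2_mounts && dlm_dir_empty && echo "nothing should be holding the heartbeat"
```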

To stop, do:
# ocfs2_hb_ctl -K -u 77D95EF51C0149D2823674FCC162CF8B

It could be that this is failing. What do you see?
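
If there are several regions, the stop command can be looped and any failure reported per UUID. This is a sketch, not a tested procedure: stop_hb_regions is a made-up name, and HB_CTL is a hypothetical override so the loop can be dry-run (for example with "true") before pointing it at the real ocfs2_hb_ctl.

```shell
# Try to stop every heartbeat region given as an argument.
# HB_CTL is a hypothetical override for dry runs; it defaults to the
# real ocfs2_hb_ctl binary.
stop_hb_regions() {
    hb_ctl="${HB_CTL:-ocfs2_hb_ctl}"
    rc=0
    for uuid in "$@"; do
        if "$hb_ctl" -K -u "$uuid"; then
            echo "stopped $uuid"
        else
            echo "failed to stop $uuid" >&2
            rc=1
        fi
    done
    # Non-zero if any region could not be stopped.
    return "$rc"
}
```

The exit status and any error text from a failing region are the useful part to post back to the list.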

I remember we had a problem with this in tools 1.4.1, but that was
fixed in 1.4.2.

Sunil

On 06/25/2011 06:03 PM, Shave, Chris wrote:
Hi,

I have an issue shutting down o2cb and offlining the cluster: the heartbeat stays active and blocks any attempt to shut it down, despite there being zero ocfs2 filesystems mounted.

This is what I see; it happens even with the force-offline option:

[root]# /etc/init.d/o2cb force-offline clustername
Stopping O2CB cluster clustername Failed
Unable to stop cluster as heartbeat region still active

I have no ocfs2 filesystems currently mounted on either node (2-node cluster):

[root]# mount | grep ocfs
[root]#

Versions of ocfs2 as below:

[root]# rpm -qa | grep ocfs
ocfs2-tools-1.4.4-1.el5.x86_64
ocfs2-tools-devel-1.4.4-1.el5.x86_64
ocfs2console-1.4.4-1.el5.x86_64
ocfs2-2.6.18-128.el5-1.4.4-1.el5.x86_64
ocfs2-tools-debuginfo-1.4.4-1.el5.x86_64

Red Hat Linux kernel version: 2.6.18-128.el5

A colleague of mine said that he usually disables all ocfs2 in the startup scripts, comments out the filesystems in /etc/fstab, and reboots. Is there another option to get the heartbeat offline, or is this an ocfs2 or Linux bug I am encountering here?


Cheers,


Christopher Shave, Global UNIX/Linux Projects Team

