[Ocfs2-users] Unable to stop cluster as heartbeat region still active
Laurentiu Gosu
lg at easic.ro
Sun Dec 11 08:14:10 PST 2011
Hi Sunil,
Maybe you remember the thread below. In short, the problem was that the heartbeat
region was still active after unmounting the ocfs2 volume (I use the latest UEK
+ ocfs2-tools).
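A quick way to check for the leftover reference after umount might look like this. It is only a sketch: hb_refs is a made-up helper name (not part of ocfs2-tools), and it assumes the "UUID: N refs" output format quoted further down in this thread.

```shell
# hb_refs: read ocfs2_hb_ctl -I style output on stdin and print the
# reference count from every "UUID: N refs" line. Other lines (for
# example error messages) are ignored.
hb_refs() {
    sed -n 's/^[0-9A-Fa-f]\{32\}: \([0-9][0-9]*\) refs$/\1/p'
}

# On a real node, after umount (device path is from my setup):
#   ocfs2_hb_ctl -I -d /dev/mapper/volgr1-lvol0 | hb_refs
# A non-zero count with nothing mounted is the symptom described here.
```

The helper only parses text, so it can be tried on a saved copy of the output without touching the cluster.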
Based on this link
http://markmail.org/message/7h7r32avuitqdhzr#query:+page:1+mid:lq7arecz2dui6b3v+state:results
I manually created a /dev/dm-2 symlink pointing to my SAN device
[/dev/mapper/volgr1-lvol0], and the heartbeat was then stopped normally. Maybe
this helps you find the real issue. As I understand it, that symlink should be
created automatically, but it seems the problem is still there in
ocfs2-tools-1.6.3-2.el5.
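The workaround can be sketched roughly as follows. The device names are the ones from my setup (dm-2 is the name the heartbeat region's dev file showed earlier in this thread), ensure_link is a hypothetical helper name, and this is a stopgap rather than a fix.

```shell
# ensure_link TARGET LINK: create the symlink LINK -> TARGET unless
# something already exists at LINK.
ensure_link() {
    target=$1
    link=$2
    # Leave an existing node or symlink alone so that a real device
    # entry is never overwritten.
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "skip: $link already exists"
        return 0
    fi
    ln -s "$target" "$link" && echo "created: $link -> $target"
}

# On the affected node, as root (paths from my setup):
#   ensure_link /dev/mapper/volgr1-lvol0 /dev/dm-2
```

The existence check is there because /dev/dm-N names belong to the device mapper; the link should only be added when that name is missing.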
br,
laurentiu.
On 10/24/2011 23:54, Sunil Mushran wrote:
> Well, I wouldn't advise you to go into prod with this problem.
> To figure out the issue, we'll need to provide a debug version of
> ocfs2_hb_ctl.
>
> If you have support, ping oracle support and ask for assistance.
>
> If not, download the source and run ocfs2_hb_ctl in gdb. The problem
> is in the code path that begins in the function lookup_dev().
>
> On 10/23/2011 01:30 PM, Laurentiu Gosu wrote:
>> #rpm -qa |grep ocfs2
>> ocfs2console-1.6.3-2.el5
>> ocfs2-tools-1.6.3-2.el5
>>
>> Just let me know if I can give more details to find the problem. I
>> will move ocfs2 into production in the next weeks.
>>
>>
>> On 10/23/2011 22:49, Sunil Mushran wrote:
>>> Are you sure you have ocfs2-tools-1.6.3? I remember we had an
>>> issue with this with an earlier release... 1.6.1/.2.
>>>
>>> On 10/23/2011 10:43 AM, Laurentiu Gosu wrote:
>>>> hmm..
>>>> #ocfs2_hb_ctl -I -u 0C4AB55FE9314FA5A9F81652FDB9B22D
>>>> 0C4AB55FE9314FA5A9F81652FDB9B22D: 1 refs
>>>> *BUT:*
>>>> #ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D ocfs2
>>>> ocfs2_hb_ctl: File not found by ocfs2_lookup while stopping heartbeat
>>>> I can still kill the ref using device name (-d).
>>>>
>>>> On 10/23/2011 17:57, Sunil Mushran wrote:
>>>>> I think it stops by uuid. So try doing this the next time.
>>>>> You are encountering some issue that we have not seen before.
>>>>> ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D ocfs2
>>>>>
>>>>> On 10/23/2011 05:32 AM, Laurentiu Gosu wrote:
>>>>>> Hi Sunil,
>>>>>> Sorry for my late reply; I only had time today to start from
>>>>>> scratch and test.
>>>>>> I rebuilt my environment (2 nodes connected to a SAN via
>>>>>> iSCSI+multipath). I still have the issue that the heartbeat is
>>>>>> active after I umount my ocfs2 volume.
>>>>>> /etc/init.d/o2cb stop
>>>>>> Stopping O2CB cluster CLUST: Failed
>>>>>> Unable to stop cluster as heartbeat region still active
>>>>>>
>>>>>> ocfs2_hb_ctl -I -d /dev/mapper/volgr1-lvol0
>>>>>> 0C4AB55FE9314FA5A9F81652FDB9B22D: 1 refs
>>>>>>
>>>>>> After I manually kill the ref (ocfs2_hb_ctl -K -d
>>>>>> /dev/mapper/volgr1-lvol0 ocfs2), I can stop o2cb successfully. I
>>>>>> can live with that, but why doesn't it stop automatically? As I
>>>>>> understand it, the heartbeat should be started and stopped when the
>>>>>> volume gets mounted/unmounted.
>>>>>>
>>>>>> br,
>>>>>> Laurentiu.
>>>>>>
>>>>>> On 10/19/2011 02:28, Sunil Mushran wrote:
>>>>>>> Manual delete will only work if there are no references. In your
>>>>>>> case
>>>>>>> there are references.
>>>>>>>
>>>>>>> You may want to start both nodes from scratch. Do not start/stop
>>>>>>> heartbeat manually. Also, do not force-format.
>>>>>>>
>>>>>>> On 10/18/2011 03:54 PM, Laurentiu Gosu wrote:
>>>>>>>> OK, I rebooted one of the nodes (both had similar issues). But
>>>>>>>> something is still fishy.
>>>>>>>> - I mounted the device: mount -t ocfs2 /dev/volgr1/lvol0 /mnt/tmp/
>>>>>>>> - I unmounted it: umount /mnt/tmp/
>>>>>>>> - tried to stop o2cb: /etc/init.d/o2cb stop
>>>>>>>> Stopping O2CB cluster CLUSTER: Failed
>>>>>>>> Unable to stop cluster as heartbeat region still active
>>>>>>>> - ocfs2_hb_ctl -I -u 0C4AB55FE9314FA5A9F81652FDB9B22D
>>>>>>>> 0C4AB55FE9314FA5A9F81652FDB9B22D: 1 refs
>>>>>>>> - ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D
>>>>>>>> ocfs2_hb_ctl: File not found by ocfs2_lookup while stopping
>>>>>>>> heartbeat
>>>>>>>> - ls -Rl /sys/kernel/config/cluster/CLUSTER/heartbeat/
>>>>>>>> /sys/kernel/config/cluster/CLUSTER/heartbeat/:
>>>>>>>> total 0
>>>>>>>> drwxr-xr-x 2 root root 0 Oct 19 01:50
>>>>>>>> 0C4AB55FE9314FA5A9F81652FDB9B22D
>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 01:40 dead_threshold
>>>>>>>>
>>>>>>>> /sys/kernel/config/cluster/CLUSTER/heartbeat/0C4AB55FE9314FA5A9F81652FDB9B22D:
>>>>>>>> total 0
>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 01:50 block_bytes
>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 01:50 blocks
>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 01:50 dev
>>>>>>>> -r--r--r-- 1 root root 4096 Oct 19 01:50 pid
>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 01:50 start_block
>>>>>>>>
>>>>>>>> - I cannot manually delete
>>>>>>>> /sys/kernel/config/cluster/CLUSTER/heartbeat/0C4AB55FE9314FA5A9F81652FDB9B22D/
>>>>>>>>
>>>>>>>> PS: I'm going to sleep now, I have to be up in a few hours. We
>>>>>>>> can continue tomorrow if it's ok with you.
>>>>>>>> Thank you for your help.
>>>>>>>>
>>>>>>>> Laurentiu.
>>>>>>>>
>>>>>>>> On 10/19/2011 01:33, Sunil Mushran wrote:
>>>>>>>>> One way this can happen is if one starts the hb manually and then
>>>>>>>>> force-formats that volume. The format will generate a new uuid.
>>>>>>>>> Once that happens, the hb tool cannot map the region to the device
>>>>>>>>> and thus fails to stop it. Right now the easiest option on this box
>>>>>>>>> is resetting it.
>>>>>>>>>
>>>>>>>>> On 10/18/2011 03:24 PM, Laurentiu Gosu wrote:
>>>>>>>>>> Yes, I did reformat it (even more than once I think, last
>>>>>>>>>> week). This is a pre-production system and I'm trying various
>>>>>>>>>> options before moving into real life.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 10/19/2011 01:19, Sunil Mushran wrote:
>>>>>>>>>>> Did you reformat the volume recently? or, when did you
>>>>>>>>>>> format last?
>>>>>>>>>>>
>>>>>>>>>>> On 10/18/2011 03:13 PM, Laurentiu Gosu wrote:
>>>>>>>>>>>> well..this is weird
>>>>>>>>>>>> ls /sys/kernel/config/cluster/CLUSTER/heartbeat/
>>>>>>>>>>>> *918673F06F8F4ED188DDCE14F39945F6* dead_threshold
>>>>>>>>>>>>
>>>>>>>>>>>> looks like we have different UUIDs. Where is this coming from??
>>>>>>>>>>>>
>>>>>>>>>>>> ocfs2_hb_ctl -I -u 918673F06F8F4ED188DDCE14F39945F6
>>>>>>>>>>>> 918673F06F8F4ED188DDCE14F39945F6: 1 refs
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 10/19/2011 01:04, Sunil Mushran wrote:
>>>>>>>>>>>>> Let's do it by hand.
>>>>>>>>>>>>> rm -rf
>>>>>>>>>>>>> /sys/kernel/config/cluster/.../heartbeat/*0C4AB55FE9314FA5A9F81652FDB9B22D
>>>>>>>>>>>>> *
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 10/18/2011 02:52 PM, Laurentiu Gosu wrote:
>>>>>>>>>>>>>> ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D
>>>>>>>>>>>>>> ocfs2_hb_ctl: File not found by ocfs2_lookup while
>>>>>>>>>>>>>> stopping heartbeat
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> No improvement :(
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 10/19/2011 00:50, Sunil Mushran wrote:
>>>>>>>>>>>>>>> See if this cleans it up.
>>>>>>>>>>>>>>> ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 10/18/2011 02:44 PM, Laurentiu Gosu wrote:
>>>>>>>>>>>>>>>> ocfs2_hb_ctl -I -u 0C4AB55FE9314FA5A9F81652FDB9B22D
>>>>>>>>>>>>>>>> 0C4AB55FE9314FA5A9F81652FDB9B22D: 0 refs
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 10/19/2011 00:43, Sunil Mushran wrote:
>>>>>>>>>>>>>>>>> ocfs2_hb_ctl -I -u 0C4AB55FE9314FA5A9F81652FDB9B22D
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 10/18/2011 02:40 PM, Laurentiu Gosu wrote:
>>>>>>>>>>>>>>>>>> mounted.ocfs2 -d
>>>>>>>>>>>>>>>>>> Device FS Stack
>>>>>>>>>>>>>>>>>> UUID Label
>>>>>>>>>>>>>>>>>> /dev/mapper/volgr1-lvol0 ocfs2 o2cb
>>>>>>>>>>>>>>>>>> 0C4AB55FE9314FA5A9F81652FDB9B22D ocfs2
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> mounted.ocfs2 -f
>>>>>>>>>>>>>>>>>> Device FS Nodes
>>>>>>>>>>>>>>>>>> /dev/mapper/volgr1-lvol0 ocfs2 ro02xsrv001
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ro02xsrv001 = the other node in the cluster.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> By the way, there is no /dev/dm-2
>>>>>>>>>>>>>>>>>> ls /dev/dm-*
>>>>>>>>>>>>>>>>>> /dev/dm-0 /dev/dm-1
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 10/19/2011 00:37, Sunil Mushran wrote:
>>>>>>>>>>>>>>>>>>> So it is not mounted. But we still have a hb thread because
>>>>>>>>>>>>>>>>>>> hb could not be stopped during umount. The reason for that
>>>>>>>>>>>>>>>>>>> could be the same that causes ocfs2_hb_ctl to fail.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Do:
>>>>>>>>>>>>>>>>>>> mounted.ocfs2 -d
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 10/18/2011 02:32 PM, Laurentiu Gosu wrote:
>>>>>>>>>>>>>>>>>>>> ls -lR /sys/kernel/debug/ocfs2
>>>>>>>>>>>>>>>>>>>> /sys/kernel/debug/ocfs2:
>>>>>>>>>>>>>>>>>>>> total 0
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> ls -lR /sys/kernel/debug/o2dlm
>>>>>>>>>>>>>>>>>>>> /sys/kernel/debug/o2dlm:
>>>>>>>>>>>>>>>>>>>> total 0
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> ocfs2_hb_ctl -I -d /dev/dm-2
>>>>>>>>>>>>>>>>>>>> ocfs2_hb_ctl: Device name specified was not found
>>>>>>>>>>>>>>>>>>>> while reading uuid
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> There is no /dev/dm-2 mounted.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 10/19/2011 00:27, Sunil Mushran wrote:
>>>>>>>>>>>>>>>>>>>>> mount -t debugfs debugfs /sys/kernel/debug
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Then list that dir.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Also, do:
>>>>>>>>>>>>>>>>>>>>> ocfs2_hb_ctl -I -d /dev/dm-2
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Be careful before killing. We want to be sure that
>>>>>>>>>>>>>>>>>>>>> dev is not mounted.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On 10/18/2011 02:23 PM, Laurentiu Gosu wrote:
>>>>>>>>>>>>>>>>>>>>>> Again the outputs:
>>>>>>>>>>>>>>>>>>>>>> cat
>>>>>>>>>>>>>>>>>>>>>> /sys/kernel/config/cluster/CLUSTER/heartbeat/918673F06F8F4ED188DDCE14F39945F6/dev
>>>>>>>>>>>>>>>>>>>>>> dm-2
>>>>>>>>>>>>>>>>>>>>>> --->here should be volgr1-lvol0 i guess?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> ls -lR /sys/kernel/debug/ocfs2
>>>>>>>>>>>>>>>>>>>>>> ls: /sys/kernel/debug/ocfs2: No such file or
>>>>>>>>>>>>>>>>>>>>>> directory
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> ls -lR /sys/kernel/debug/o2dlm
>>>>>>>>>>>>>>>>>>>>>> ls: /sys/kernel/debug/o2dlm: No such file or
>>>>>>>>>>>>>>>>>>>>>> directory
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I think i have to enable debug first somehow..?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Laurentiu.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On 10/19/2011 00:17, Sunil Mushran wrote:
>>>>>>>>>>>>>>>>>>>>>>> What does this return?
>>>>>>>>>>>>>>>>>>>>>>> cat
>>>>>>>>>>>>>>>>>>>>>>> /sys/kernel/config/cluster/CLUSTER/heartbeat/918673F06F8F4ED188DDCE14F39945F6/dev
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Also, do:
>>>>>>>>>>>>>>>>>>>>>>> ls -lR /sys/kernel/debug/ocfs2
>>>>>>>>>>>>>>>>>>>>>>> ls -lR /sys/kernel/debug/o2dlm
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On 10/18/2011 02:14 PM, Laurentiu Gosu wrote:
>>>>>>>>>>>>>>>>>>>>>>>> Here is the output:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> ls -lR /sys/kernel/config/cluster
>>>>>>>>>>>>>>>>>>>>>>>> /sys/kernel/config/cluster:
>>>>>>>>>>>>>>>>>>>>>>>> total 0
>>>>>>>>>>>>>>>>>>>>>>>> drwxr-xr-x 4 root root 0 Oct 19 00:12 CLUSTER
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> /sys/kernel/config/cluster/CLUSTER:
>>>>>>>>>>>>>>>>>>>>>>>> total 0
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12
>>>>>>>>>>>>>>>>>>>>>>>> fence_method
>>>>>>>>>>>>>>>>>>>>>>>> drwxr-xr-x 3 root root 0 Oct 19 00:12 heartbeat
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12
>>>>>>>>>>>>>>>>>>>>>>>> idle_timeout_ms
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12
>>>>>>>>>>>>>>>>>>>>>>>> keepalive_delay_ms
>>>>>>>>>>>>>>>>>>>>>>>> drwxr-xr-x 4 root root 0 Oct 11 20:23 node
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12
>>>>>>>>>>>>>>>>>>>>>>>> reconnect_delay_ms
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> /sys/kernel/config/cluster/CLUSTER/heartbeat:
>>>>>>>>>>>>>>>>>>>>>>>> total 0
>>>>>>>>>>>>>>>>>>>>>>>> drwxr-xr-x 2 root root 0 Oct 19 00:12
>>>>>>>>>>>>>>>>>>>>>>>> 918673F06F8F4ED188DDCE14F39945F6
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12
>>>>>>>>>>>>>>>>>>>>>>>> dead_threshold
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> /sys/kernel/config/cluster/CLUSTER/heartbeat/*918673F06F8F4ED188DDCE14F39945F6*:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> total 0
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12
>>>>>>>>>>>>>>>>>>>>>>>> block_bytes
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 blocks
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 dev
>>>>>>>>>>>>>>>>>>>>>>>> -r--r--r-- 1 root root 4096 Oct 19 00:12 pid
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12
>>>>>>>>>>>>>>>>>>>>>>>> start_block
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> /sys/kernel/config/cluster/CLUSTER/node:
>>>>>>>>>>>>>>>>>>>>>>>> total 0
>>>>>>>>>>>>>>>>>>>>>>>> drwxr-xr-x 2 root root 0 Oct 19 00:12 ro02xsrv001
>>>>>>>>>>>>>>>>>>>>>>>> drwxr-xr-x 2 root root 0 Oct 19 00:12 ro02xsrv002
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> /sys/kernel/config/cluster/CLUSTER/node/ro02xsrv001:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> total 0
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12
>>>>>>>>>>>>>>>>>>>>>>>> ipv4_address
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 ipv4_port
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 local
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 num
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> /sys/kernel/config/cluster/CLUSTER/node/ro02xsrv002:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> total 0
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12
>>>>>>>>>>>>>>>>>>>>>>>> ipv4_address
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 ipv4_port
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 local
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 num
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On 10/19/2011 00:12, Sunil Mushran wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> ls -lR /sys/kernel/config/cluster
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> What does this return?
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On 10/18/2011 02:05 PM, Laurentiu Gosu wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>>>> I have a 2-node ocfs2 cluster running UEK
>>>>>>>>>>>>>>>>>>>>>>>>>> 2.6.32-100.0.19.el5,
>>>>>>>>>>>>>>>>>>>>>>>>>> ocfs2console-1.6.3-2.el5,
>>>>>>>>>>>>>>>>>>>>>>>>>> ocfs2-tools-1.6.3-2.el5.
>>>>>>>>>>>>>>>>>>>>>>>>>> My problem is that every time I try to
>>>>>>>>>>>>>>>>>>>>>>>>>> run /etc/init.d/o2cb stop
>>>>>>>>>>>>>>>>>>>>>>>>>> it fails with this error:
>>>>>>>>>>>>>>>>>>>>>>>>>> Stopping O2CB cluster CLUSTER: Failed
>>>>>>>>>>>>>>>>>>>>>>>>>> Unable to stop cluster as heartbeat
>>>>>>>>>>>>>>>>>>>>>>>>>> region still active
>>>>>>>>>>>>>>>>>>>>>>>>>> There is no active mount point. I tried to
>>>>>>>>>>>>>>>>>>>>>>>>>> manually stop the heartbeat
>>>>>>>>>>>>>>>>>>>>>>>>>> with "ocfs2_hb_ctl -K -d
>>>>>>>>>>>>>>>>>>>>>>>>>> /dev/mapper/volgr1-lvol0 ocfs2" (after finding
>>>>>>>>>>>>>>>>>>>>>>>>>> the refs number with "ocfs2_hb_ctl -I -d
>>>>>>>>>>>>>>>>>>>>>>>>>> /dev/mapper/volgr1-lvol0").
>>>>>>>>>>>>>>>>>>>>>>>>>> But even if the refs number is zero, the
>>>>>>>>>>>>>>>>>>>>>>>>>> "heartbeat region still
>>>>>>>>>>>>>>>>>>>>>>>>>> active" error occurs.
>>>>>>>>>>>>>>>>>>>>>>>>>> How can I fix this?
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Thank you in advance.
>>>>>>>>>>>>>>>>>>>>>>>>>> Laurentiu.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>>>>>>>>>> Ocfs2-users mailing list
>>>>>>>>>>>>>>>>>>>>>>>>>> Ocfs2-users at oss.oracle.com
>>>>>>>>>>>>>>>>>>>>>>>>>> http://oss.oracle.com/mailman/listinfo/ocfs2-users