[Ocfs2-users] Unable to stop cluster as heartbeat region still active

Laurentiu Gosu lg at easic.ro
Sun Dec 11 08:14:10 PST 2011


Hi Sunil,
Maybe you remember the thread below. In short, the problem was that the 
heartbeat region was still active after unmounting the ocfs2 volume (I use 
the latest UEK + ocfs2-tools).
Based on this link 
http://markmail.org/message/7h7r32avuitqdhzr#query:+page:1+mid:lq7arecz2dui6b3v+state:results 
I manually created a /dev/dm-2 symlink pointing to my SAN device 
[/dev/mapper/volgr1-lvol0], and the heartbeat then stopped normally. Maybe 
it helps you find the real issue. As I understand it, that symlink should 
be created automatically, but it seems the problem is still there in 
ocfs2-tools-1.6.3-2.el5.
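
A rough sketch of how this condition can be spotted, assuming (as in the 
thread below) that each heartbeat region's configfs "dev" file holds a bare 
device name such as dm-2, and that "ocfs2_hb_ctl -I" prints lines of the 
form "<UUID>: N refs". The function names and the CONFIGFS_ROOT / DEV_ROOT 
variables are made up for illustration; they are not part of ocfs2-tools:

```shell
#!/bin/bash
# Illustrative helpers only; names and variables are assumptions, not ocfs2-tools API.

# Locations are overridable so the helpers can be tried against a scratch tree.
CONFIGFS_ROOT="${CONFIGFS_ROOT:-/sys/kernel/config/cluster}"
DEV_ROOT="${DEV_ROOT:-/dev}"

# Read lines on stdin and print the reference count from any line that looks
# like "0C4AB55FE9314FA5A9F81652FDB9B22D: 1 refs" (the output format seen in
# this thread for "ocfs2_hb_ctl -I").
hb_refs() {
    sed -n 's/^[0-9A-Fa-f]\{32\}: \([0-9][0-9]*\) refs$/\1/p'
}

# Print "<uuid> <dev>" for every heartbeat region of the given cluster,
# using the "dev" file inside each region directory.
hb_region_devs() {
    local cluster="$1" region
    for region in "$CONFIGFS_ROOT/$cluster"/heartbeat/*/; do
        # Skip the unexpanded glob (no regions) and anything without a dev file.
        [ -f "${region}dev" ] || continue
        printf '%s %s\n' "$(basename "$region")" "$(cat "${region}dev")"
    done
}

# Print only those regions whose recorded device has no node under DEV_ROOT,
# which is the situation where stopping the heartbeat by UUID failed here.
hb_missing_dev_nodes() {
    local cluster="$1" uuid dev
    hb_region_devs "$cluster" | while read -r uuid dev; do
        [ -e "$DEV_ROOT/$dev" ] || printf '%s %s\n' "$uuid" "$dev"
    done
}
```

For example, "ocfs2_hb_ctl -I -d /dev/mapper/volgr1-lvol0 | hb_refs" would 
print the refs number, and "hb_missing_dev_nodes CLUSTER" would list a 
region whose device node is absent. In my case the fix was a symlink made 
with "ln -s /dev/mapper/volgr1-lvol0 /dev/dm-2".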

br,
laurentiu.

On 10/24/2011 23:54, Sunil Mushran wrote:
> Well, I wouldn't advise you to go into prod with this problem.
> To figure out the issue, we'll need to provide a debug version of
> ocfs2_hb_ctl.
>
> If you have support, ping oracle support and ask for assistance.
>
> If not, download the source and run ocfs2_hb_ctl in gdb. The problem
> is in the code path that begins in the function lookup_dev().
>
> On 10/23/2011 01:30 PM, Laurentiu Gosu wrote:
>> #rpm -qa |grep ocfs2
>> ocfs2console-1.6.3-2.el5
>> ocfs2-tools-1.6.3-2.el5
>>
>> Just let me know if I can give more details to find the problem. I 
>> will move ocfs2 into production in the next weeks.
>>
>>
>> On 10/23/2011 22:49, Sunil Mushran wrote:
>>> Are you sure you have ocfs2-tools-1.6.3? I remember we had an
>>> issue with this with an earlier release... 1.6.1/.2.
>>>
>>> On 10/23/2011 10:43 AM, Laurentiu Gosu wrote:
>>>> hmm..
>>>> #ocfs2_hb_ctl -I -u 0C4AB55FE9314FA5A9F81652FDB9B22D
>>>> 0C4AB55FE9314FA5A9F81652FDB9B22D: 1 refs
>>>> *BUT:*
>>>> #ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D ocfs2
>>>> ocfs2_hb_ctl: File not found by ocfs2_lookup while stopping heartbeat
>>>> I can still kill the ref using device name (-d).
>>>>
>>>> On 10/23/2011 17:57, Sunil Mushran wrote:
>>>>> I think it stops by uuid. So try doing this the next time.
>>>>> You are encountering some issue that we have not seen before.
>>>>> ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D ocfs2
>>>>>
>>>>> On 10/23/2011 05:32 AM, Laurentiu Gosu wrote:
>>>>>> Hi Sunil,
>>>>>> Sorry for my late reply; I only had time today to start from 
>>>>>> scratch and test.
>>>>>> I rebuilt my environment (2 nodes connected to a SAN via 
>>>>>> iSCSI+multipath). I still have the issue that the heartbeat is 
>>>>>> active after I umount my ocfs2 volume.
>>>>>> /etc/init.d/o2cb stop
>>>>>> Stopping O2CB cluster CLUST: Failed
>>>>>> Unable to stop cluster as heartbeat region still active
>>>>>>
>>>>>> ocfs2_hb_ctl -I -d /dev/mapper/volgr1-lvol0
>>>>>> 0C4AB55FE9314FA5A9F81652FDB9B22D: 1 refs
>>>>>>
>>>>>> After I manually kill the ref (ocfs2_hb_ctl -K -d 
>>>>>> /dev/mapper/volgr1-lvol0 ocfs2), I can stop o2cb successfully. I 
>>>>>> can live with that, but why doesn't it stop automatically? As I 
>>>>>> understand it, the heartbeat should be started and stopped when the 
>>>>>> volume gets mounted/umounted.
>>>>>>
>>>>>> br,
>>>>>> Laurentiu.
>>>>>>
>>>>>> On 10/19/2011 02:28, Sunil Mushran wrote:
>>>>>>> Manual delete will only work if there are no references. In your 
>>>>>>> case
>>>>>>> there are references.
>>>>>>>
>>>>>>> You may want to start both nodes from scratch. Do not start/stop
>>>>>>> heartbeat manually. Also, do not force-format.
>>>>>>>
>>>>>>> On 10/18/2011 03:54 PM, Laurentiu Gosu wrote:
>>>>>>>> OK, I rebooted one of the nodes (both had similar issues). But 
>>>>>>>> something is still fishy.
>>>>>>>> - I mounted the device: mount -t ocfs2 /dev/volgr1/lvol0 /mnt/tmp/
>>>>>>>> - I unmounted it: umount /mnt/tmp/
>>>>>>>> - tried to stop o2cb:  /etc/init.d/o2cb stop
>>>>>>>> Stopping O2CB cluster CLUSTER: Failed
>>>>>>>> Unable to stop cluster as heartbeat region still active
>>>>>>>> - ocfs2_hb_ctl -I -u 0C4AB55FE9314FA5A9F81652FDB9B22D
>>>>>>>> 0C4AB55FE9314FA5A9F81652FDB9B22D: 1 refs
>>>>>>>> -  ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D
>>>>>>>> ocfs2_hb_ctl: File not found by ocfs2_lookup while stopping heartbeat
>>>>>>>> - ls -Rl /sys/kernel/config/cluster/CLUSTER/heartbeat/
>>>>>>>> /sys/kernel/config/cluster/CLUSTER/heartbeat/:
>>>>>>>> total 0
>>>>>>>> drwxr-xr-x 2 root root    0 Oct 19 01:50 0C4AB55FE9314FA5A9F81652FDB9B22D
>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 01:40 dead_threshold
>>>>>>>>
>>>>>>>> /sys/kernel/config/cluster/CLUSTER/heartbeat/0C4AB55FE9314FA5A9F81652FDB9B22D:
>>>>>>>> total 0
>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 01:50 block_bytes
>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 01:50 blocks
>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 01:50 dev
>>>>>>>> -r--r--r-- 1 root root 4096 Oct 19 01:50 pid
>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 01:50 start_block
>>>>>>>>
>>>>>>>> - I cannot manually delete 
>>>>>>>> /sys/kernel/config/cluster/CLUSTER/heartbeat/0C4AB55FE9314FA5A9F81652FDB9B22D/
>>>>>>>>
>>>>>>>> PS: i'm going to sleep now, i have to be up in a few hours. We 
>>>>>>>> can continue tomorrow if it's ok with you.
>>>>>>>> Thank you for your help.
>>>>>>>>
>>>>>>>> Laurentiu.
>>>>>>>>
>>>>>>>> On 10/19/2011 01:33, Sunil Mushran wrote:
>>>>>>>>> One way this can happen is if one starts the hb manually and 
>>>>>>>>> then force
>>>>>>>>> formats on that volume. The format will generate a new uuid. 
>>>>>>>>> Once that
>>>>>>>>> happens, the hb tool cannot map the region to the device and 
>>>>>>>>> thus fail
>>>>>>>>> to stop it. Right now the easiest option on this box is 
>>>>>>>>> resetting it.
>>>>>>>>>
>>>>>>>>> On 10/18/2011 03:24 PM, Laurentiu Gosu wrote:
>>>>>>>>>> Yes, I did reformat it (more than once, I think, last 
>>>>>>>>>> week). This is a pre-production system and I'm trying various 
>>>>>>>>>> options before moving into real life.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 10/19/2011 01:19, Sunil Mushran wrote:
>>>>>>>>>>> Did you reformat the volume recently? or, when did you 
>>>>>>>>>>> format last?
>>>>>>>>>>>
>>>>>>>>>>> On 10/18/2011 03:13 PM, Laurentiu Gosu wrote:
>>>>>>>>>>>> Well, this is weird:
>>>>>>>>>>>> ls /sys/kernel/config/cluster/CLUSTER/heartbeat/
>>>>>>>>>>>> 918673F06F8F4ED188DDCE14F39945F6  dead_threshold
>>>>>>>>>>>>
>>>>>>>>>>>> It looks like we have different UUIDs. Where is this one coming from?
>>>>>>>>>>>>
>>>>>>>>>>>> ocfs2_hb_ctl -I -u 918673F06F8F4ED188DDCE14F39945F6
>>>>>>>>>>>> 918673F06F8F4ED188DDCE14F39945F6: 1 refs
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 10/19/2011 01:04, Sunil Mushran wrote:
>>>>>>>>>>>>> Let's do it by hand.
>>>>>>>>>>>>> rm -rf /sys/kernel/config/cluster/.../heartbeat/0C4AB55FE9314FA5A9F81652FDB9B22D
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 10/18/2011 02:52 PM, Laurentiu Gosu wrote:
>>>>>>>>>>>>>>  ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D
>>>>>>>>>>>>>> ocfs2_hb_ctl: File not found by ocfs2_lookup while 
>>>>>>>>>>>>>> stopping heartbeat
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> No improvement :(
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 10/19/2011 00:50, Sunil Mushran wrote:
>>>>>>>>>>>>>>> See if this cleans it up.
>>>>>>>>>>>>>>> ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 10/18/2011 02:44 PM, Laurentiu Gosu wrote:
>>>>>>>>>>>>>>>> ocfs2_hb_ctl -I -u 0C4AB55FE9314FA5A9F81652FDB9B22D
>>>>>>>>>>>>>>>> 0C4AB55FE9314FA5A9F81652FDB9B22D: 0 refs
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 10/19/2011 00:43, Sunil Mushran wrote:
>>>>>>>>>>>>>>>>> ocfs2_hb_ctl -I -u 0C4AB55FE9314FA5A9F81652FDB9B22D
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 10/18/2011 02:40 PM, Laurentiu Gosu wrote:
>>>>>>>>>>>>>>>>>> mounted.ocfs2 -d
>>>>>>>>>>>>>>>>>> Device                    FS     Stack  UUID                              Label
>>>>>>>>>>>>>>>>>> /dev/mapper/volgr1-lvol0  ocfs2  o2cb   0C4AB55FE9314FA5A9F81652FDB9B22D  ocfs2
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> mounted.ocfs2 -f
>>>>>>>>>>>>>>>>>> Device                    FS     Nodes
>>>>>>>>>>>>>>>>>> /dev/mapper/volgr1-lvol0  ocfs2  ro02xsrv001
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ro02xsrv001 = the other node in the cluster.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> By the way, there is no /dev/dm-2:
>>>>>>>>>>>>>>>>>>  ls /dev/dm-*
>>>>>>>>>>>>>>>>>> /dev/dm-0  /dev/dm-1
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 10/19/2011 00:37, Sunil Mushran wrote:
>>>>>>>>>>>>>>>>>>> So it is not mounted. But we still have a hb thread 
>>>>>>>>>>>>>>>>>>> because
>>>>>>>>>>>>>>>>>>> hb could not be stopped during umount. The reason 
>>>>>>>>>>>>>>>>>>> for that
>>>>>>>>>>>>>>>>>>> could be the same that causes ocfs2_hb_ctl to fail.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Do:
>>>>>>>>>>>>>>>>>>> mounted.ocfs2 -d
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 10/18/2011 02:32 PM, Laurentiu Gosu wrote:
>>>>>>>>>>>>>>>>>>>> ls -lR /sys/kernel/debug/ocfs2
>>>>>>>>>>>>>>>>>>>> /sys/kernel/debug/ocfs2:
>>>>>>>>>>>>>>>>>>>> total 0
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> ls -lR /sys/kernel/debug/o2dlm
>>>>>>>>>>>>>>>>>>>> /sys/kernel/debug/o2dlm:
>>>>>>>>>>>>>>>>>>>> total 0
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> ocfs2_hb_ctl -I -d /dev/dm-2
>>>>>>>>>>>>>>>>>>>> ocfs2_hb_ctl: Device name specified was not found 
>>>>>>>>>>>>>>>>>>>> while reading uuid
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> There is no /dev/dm-2 mounted.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 10/19/2011 00:27, Sunil Mushran wrote:
>>>>>>>>>>>>>>>>>>>>> mount -t debugfs debugfs /sys/kernel/debug
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Then list that dir.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Also, do:
>>>>>>>>>>>>>>>>>>>>> ocfs2_hb_ctl -I -d /dev/dm-2
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Be careful before killing. We want to be sure that 
>>>>>>>>>>>>>>>>>>>>> dev is not mounted.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On 10/18/2011 02:23 PM, Laurentiu Gosu wrote:
>>>>>>>>>>>>>>>>>>>>>> Again, the outputs:
>>>>>>>>>>>>>>>>>>>>>> cat /sys/kernel/config/cluster/CLUSTER/heartbeat/918673F06F8F4ED188DDCE14F39945F6/dev
>>>>>>>>>>>>>>>>>>>>>> dm-2
>>>>>>>>>>>>>>>>>>>>>> ---> this should be volgr1-lvol0, I guess?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> ls -lR /sys/kernel/debug/ocfs2
>>>>>>>>>>>>>>>>>>>>>> ls: /sys/kernel/debug/ocfs2: No such file or 
>>>>>>>>>>>>>>>>>>>>>> directory
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> ls -lR /sys/kernel/debug/o2dlm
>>>>>>>>>>>>>>>>>>>>>> ls: /sys/kernel/debug/o2dlm: No such file or 
>>>>>>>>>>>>>>>>>>>>>> directory
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I think I have to enable debug first somehow?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Laurentiu.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On 10/19/2011 00:17, Sunil Mushran wrote:
>>>>>>>>>>>>>>>>>>>>>>> What does this return?
>>>>>>>>>>>>>>>>>>>>>>> cat 
>>>>>>>>>>>>>>>>>>>>>>> /sys/kernel/config/cluster/CLUSTER/heartbeat/918673F06F8F4ED188DDCE14F39945F6/dev
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Also, do:
>>>>>>>>>>>>>>>>>>>>>>> ls -lR /sys/kernel/debug/ocfs2
>>>>>>>>>>>>>>>>>>>>>>> ls -lR /sys/kernel/debug/o2dlm
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On 10/18/2011 02:14 PM, Laurentiu Gosu wrote:
>>>>>>>>>>>>>>>>>>>>>>>> Here is the output:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> ls -lR /sys/kernel/config/cluster
>>>>>>>>>>>>>>>>>>>>>>>> /sys/kernel/config/cluster:
>>>>>>>>>>>>>>>>>>>>>>>> total 0
>>>>>>>>>>>>>>>>>>>>>>>> drwxr-xr-x 4 root root 0 Oct 19 00:12 CLUSTER
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> /sys/kernel/config/cluster/CLUSTER:
>>>>>>>>>>>>>>>>>>>>>>>> total 0
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 fence_method
>>>>>>>>>>>>>>>>>>>>>>>> drwxr-xr-x 3 root root    0 Oct 19 00:12 heartbeat
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 idle_timeout_ms
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 keepalive_delay_ms
>>>>>>>>>>>>>>>>>>>>>>>> drwxr-xr-x 4 root root    0 Oct 11 20:23 node
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 reconnect_delay_ms
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> /sys/kernel/config/cluster/CLUSTER/heartbeat:
>>>>>>>>>>>>>>>>>>>>>>>> total 0
>>>>>>>>>>>>>>>>>>>>>>>> drwxr-xr-x 2 root root    0 Oct 19 00:12 918673F06F8F4ED188DDCE14F39945F6
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 dead_threshold
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> /sys/kernel/config/cluster/CLUSTER/heartbeat/918673F06F8F4ED188DDCE14F39945F6:
>>>>>>>>>>>>>>>>>>>>>>>> total 0
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 block_bytes
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 blocks
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 dev
>>>>>>>>>>>>>>>>>>>>>>>> -r--r--r-- 1 root root 4096 Oct 19 00:12 pid
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 start_block
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> /sys/kernel/config/cluster/CLUSTER/node:
>>>>>>>>>>>>>>>>>>>>>>>> total 0
>>>>>>>>>>>>>>>>>>>>>>>> drwxr-xr-x 2 root root 0 Oct 19 00:12 ro02xsrv001
>>>>>>>>>>>>>>>>>>>>>>>> drwxr-xr-x 2 root root 0 Oct 19 00:12 ro02xsrv002
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> /sys/kernel/config/cluster/CLUSTER/node/ro02xsrv001:
>>>>>>>>>>>>>>>>>>>>>>>> total 0
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 ipv4_address
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 ipv4_port
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 local
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 num
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> /sys/kernel/config/cluster/CLUSTER/node/ro02xsrv002:
>>>>>>>>>>>>>>>>>>>>>>>> total 0
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 ipv4_address
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 ipv4_port
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 local
>>>>>>>>>>>>>>>>>>>>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 num
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On 10/19/2011 00:12, Sunil Mushran wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> ls -lR /sys/kernel/config/cluster
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> What does this return?
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On 10/18/2011 02:05 PM, Laurentiu Gosu wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>>>> I have a 2-node ocfs2 cluster running UEK 2.6.32-100.0.19.el5,
>>>>>>>>>>>>>>>>>>>>>>>>>> ocfs2console-1.6.3-2.el5, ocfs2-tools-1.6.3-2.el5.
>>>>>>>>>>>>>>>>>>>>>>>>>> My problem is that every time I try to run /etc/init.d/o2cb stop
>>>>>>>>>>>>>>>>>>>>>>>>>> it fails with this error:
>>>>>>>>>>>>>>>>>>>>>>>>>>       Stopping O2CB cluster CLUSTER: Failed
>>>>>>>>>>>>>>>>>>>>>>>>>>       Unable to stop cluster as heartbeat region still active
>>>>>>>>>>>>>>>>>>>>>>>>>> There is no active mount point. I tried to manually stop the heartbeat
>>>>>>>>>>>>>>>>>>>>>>>>>> with "ocfs2_hb_ctl -K -d /dev/mapper/volgr1-lvol0 ocfs2" (after finding
>>>>>>>>>>>>>>>>>>>>>>>>>> the refs number with "ocfs2_hb_ctl -I -d /dev/mapper/volgr1-lvol0").
>>>>>>>>>>>>>>>>>>>>>>>>>> But even when the refs number is zero, the "heartbeat region still
>>>>>>>>>>>>>>>>>>>>>>>>>> active" error occurs.
>>>>>>>>>>>>>>>>>>>>>>>>>> How can I fix this?
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Thank you in advance.
>>>>>>>>>>>>>>>>>>>>>>>>>> Laurentiu.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>>>>>>>>>> Ocfs2-users mailing list
>>>>>>>>>>>>>>>>>>>>>>>>>> Ocfs2-users at oss.oracle.com
>>>>>>>>>>>>>>>>>>>>>>>>>> http://oss.oracle.com/mailman/listinfo/ocfs2-users 
>>>>>>>>>>>>>>>>>>>>>>>>>>
