[Ocfs2-users] Unable to stop cluster as heartbeat region still active

Sunil Mushran sunil.mushran at oracle.com
Tue Oct 18 14:37:15 PDT 2011


So it is not mounted, but we still have a hb thread because
hb could not be stopped during umount. The cause could be
the same one that makes ocfs2_hb_ctl fail.

Do:
mounted.ocfs2 -d
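
The configfs checks from this thread can be gathered into one small
helper. This is only a sketch: the function name list_hb_regions is
hypothetical (it is not part of ocfs2-tools), and it assumes the "pid"
attribute shown in the listing below holds the pid of the region's
heartbeat thread. It only reads files; it does not stop or kill anything.

```shell
# Hypothetical helper, not part of ocfs2-tools.
# Prints one line per heartbeat region: <cluster> <region-uuid> <dev> <pid>
# ROOT defaults to the configfs path seen in this thread.
list_hb_regions() {
    root=${1:-/sys/kernel/config/cluster}
    for region in "$root"/*/heartbeat/*/; do
        # The trailing slash in the glob skips plain files such as
        # dead_threshold; an unmatched glob stays literal and fails -d.
        [ -d "$region" ] || continue
        region=${region%/}
        cluster=${region%/heartbeat/*}
        cluster=${cluster##*/}
        # Missing or unreadable attributes are reported as "?".
        dev=$(cat "$region/dev" 2>/dev/null)
        pid=$(cat "$region/pid" 2>/dev/null)
        printf '%s %s %s %s\n' "$cluster" "${region##*/}" "${dev:-?}" "${pid:-?}"
    done
}
```

Run as root with no argument to read the live configfs tree, then
compare the reported dev with the output of mounted.ocfs2 -d. As seen
in this thread, dev may show a name such as dm-2 rather than the
/dev/mapper name.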

On 10/18/2011 02:32 PM, Laurentiu Gosu wrote:
> ls -lR /sys/kernel/debug/ocfs2
> /sys/kernel/debug/ocfs2:
> total 0
>
> ls -lR /sys/kernel/debug/o2dlm
> /sys/kernel/debug/o2dlm:
> total 0
>
> ocfs2_hb_ctl -I -d /dev/dm-2
> ocfs2_hb_ctl: Device name specified was not found while reading uuid
>
> There is no /dev/dm-2 mounted.
>
>
> On 10/19/2011 00:27, Sunil Mushran wrote:
>> mount -t debugfs debugfs /sys/kernel/debug
>>
>> Then list that dir.
>>
>> Also, do:
>> ocfs2_hb_ctl -I -d /dev/dm-2
>>
>> Be careful before killing. We want to be sure that dev is not mounted.
>>
>> On 10/18/2011 02:23 PM, Laurentiu Gosu wrote:
>>> Again, the outputs:
>>>  cat /sys/kernel/config/cluster/CLUSTER/heartbeat/918673F06F8F4ED188DDCE14F39945F6/dev
>>> dm-2
>>> ---> this should be volgr1-lvol0, I guess?
>>>
>>> ls -lR /sys/kernel/debug/ocfs2
>>> ls: /sys/kernel/debug/ocfs2: No such file or directory
>>>
>>> ls -lR /sys/kernel/debug/o2dlm
>>> ls: /sys/kernel/debug/o2dlm: No such file or directory
>>>
>>> I think I have to enable debug first somehow?
>>>
>>> Laurentiu.
>>>
>>> On 10/19/2011 00:17, Sunil Mushran wrote:
>>>> What does this return?
>>>> cat /sys/kernel/config/cluster/CLUSTER/heartbeat/918673F06F8F4ED188DDCE14F39945F6/dev
>>>>
>>>> Also, do:
>>>> ls -lR /sys/kernel/debug/ocfs2
>>>> ls -lR /sys/kernel/debug/o2dlm
>>>>
>>>> On 10/18/2011 02:14 PM, Laurentiu Gosu wrote:
>>>>> Here is the output:
>>>>>
>>>>> ls -lR /sys/kernel/config/cluster
>>>>> /sys/kernel/config/cluster:
>>>>> total 0
>>>>> drwxr-xr-x 4 root root 0 Oct 19 00:12 CLUSTER
>>>>>
>>>>> /sys/kernel/config/cluster/CLUSTER:
>>>>> total 0
>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 fence_method
>>>>> drwxr-xr-x 3 root root    0 Oct 19 00:12 heartbeat
>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 idle_timeout_ms
>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 keepalive_delay_ms
>>>>> drwxr-xr-x 4 root root    0 Oct 11 20:23 node
>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 reconnect_delay_ms
>>>>>
>>>>> /sys/kernel/config/cluster/CLUSTER/heartbeat:
>>>>> total 0
>>>>> drwxr-xr-x 2 root root    0 Oct 19 00:12 918673F06F8F4ED188DDCE14F39945F6
>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 dead_threshold
>>>>>
>>>>> /sys/kernel/config/cluster/CLUSTER/heartbeat/918673F06F8F4ED188DDCE14F39945F6:
>>>>> total 0
>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 block_bytes
>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 blocks
>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 dev
>>>>> -r--r--r-- 1 root root 4096 Oct 19 00:12 pid
>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 start_block
>>>>>
>>>>> /sys/kernel/config/cluster/CLUSTER/node:
>>>>> total 0
>>>>> drwxr-xr-x 2 root root 0 Oct 19 00:12 ro02xsrv001
>>>>> drwxr-xr-x 2 root root 0 Oct 19 00:12 ro02xsrv002
>>>>>
>>>>> /sys/kernel/config/cluster/CLUSTER/node/ro02xsrv001:
>>>>> total 0
>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 ipv4_address
>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 ipv4_port
>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 local
>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 num
>>>>>
>>>>> /sys/kernel/config/cluster/CLUSTER/node/ro02xsrv002:
>>>>> total 0
>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 ipv4_address
>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 ipv4_port
>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 local
>>>>> -rw-r--r-- 1 root root 4096 Oct 19 00:12 num
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 10/19/2011 00:12, Sunil Mushran wrote:
>>>>>> ls -lR /sys/kernel/config/cluster
>>>>>>
>>>>>> What does this return?
>>>>>>
>>>>>> On 10/18/2011 02:05 PM, Laurentiu Gosu wrote:
>>>>>>> Hi,
>>>>>>> I have a 2-node ocfs2 cluster running UEK 2.6.32-100.0.19.el5,
>>>>>>> ocfs2console-1.6.3-2.el5, ocfs2-tools-1.6.3-2.el5.
>>>>>>> My problem is that every time I try to run /etc/init.d/o2cb stop
>>>>>>> it fails with this error:
>>>>>>>       Stopping O2CB cluster CLUSTER: Failed
>>>>>>>       Unable to stop cluster as heartbeat region still active
>>>>>>> There is no active mount point. I tried to manually stop the heartbeat
>>>>>>> with "ocfs2_hb_ctl -K -d /dev/mapper/volgr1-lvol0 ocfs2" (after finding
>>>>>>> the refs number with "ocfs2_hb_ctl -I -d /dev/mapper/volgr1-lvol0").
>>>>>>> But even when the refs number is zero, the "heartbeat region still
>>>>>>> active" error occurs.
>>>>>>> How can I fix this?
>>>>>>>
>>>>>>> Thank you in advance.
>>>>>>> Laurentiu.
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Ocfs2-users mailing list
>>>>>>> Ocfs2-users at oss.oracle.com
>>>>>>> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>>>>>>
>>>>>
>>>>
>>>
>>
>



