<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<pre style="color: rgb(0, 0, 0); font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; ">I accidentally reformatted the volume.
Is there any way to get rid of this problem without rebooting?
# mounted.ocfs2 -d
Device FS Stack UUID Label
/dev/sdb ocfs2 o2cb 12963EAF4E16484DB81ECB0251177C26 ocfs2_drbd1
/dev/drbd1 ocfs2 o2cb 12963EAF4E16484DB81ECB0251177C26 ocfs2_drbd1
# ls -l /sys/kernel/config/cluster/cpc/heartbeat/
drwxr-xr-x 2 root root 0 Dec 24 22:53 72EF09EA3D0D4F51BDC00B47432B1EB2
# ocfs2_hb_ctl -I -u 72EF09EA3D0D4F51BDC00B47432B1EB2
72EF09EA3D0D4F51BDC00B47432B1EB2: 7 refs
# ocfs2_hb_ctl -K -u 72EF09EA3D0D4F51BDC00B47432B1EB2
ocfs2_hb_ctl: File not found by ocfs2_lookup while stopping heartbeat
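
(A rough sketch for anyone hitting the same thing, not an official tool. As
explained in the quoted reply below, a reformat gives the volume a new UUID
while the running heartbeat region keeps the old one. The helper name
find_orphan_regions is made up for illustration; it only lists region
directories under the configfs heartbeat path whose UUID is not among the
UUIDs passed in. It does not stop or remove anything.)

```shell
# Print heartbeat region UUIDs that none of the given volume UUIDs match.
# $1: configfs heartbeat directory (the cluster name is site-specific).
# remaining arguments: UUIDs as reported by "mounted.ocfs2 -d".
find_orphan_regions() {
    hb_dir=$1
    shift
    known=" $* "
    for region in "$hb_dir"/*; do
        # Regions are directories; skip plain files such as dead_threshold.
        [ -d "$region" ] || continue
        uuid=${region##*/}
        case "$known" in
            *" $uuid "*) ;;       # a device still carries this UUID
            *) echo "$uuid" ;;    # no listed device carries this UUID
        esac
    done
}

# Example call (cluster name "cpc" taken from the listing above; the awk
# column assumes the "Device FS Stack UUID Label" layout shown there):
#   find_orphan_regions /sys/kernel/config/cluster/cpc/heartbeat \
#       $(mounted.ocfs2 -d | awk 'NR > 1 { print $4 }')
```

With the outputs above, this should report 72EF09EA3D0D4F51BDC00B47432B1EB2,
because mounted.ocfs2 -d only knows 12963EAF4E16484DB81ECB0251177C26.
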
On 10/19/2011 01:33, Sunil Mushran wrote:
><i> One way this can happen is if one starts the hb manually and then force
</i>><i> formats on that volume. The format will generate a new uuid. Once that
</i>><i> happens, the hb tool cannot map the region to the device and thus fail
</i>><i> to stop it. Right now the easiest option on this box is resetting it.
</i>><i>
</i>><i> On 10/18/2011 03:24 PM, Laurentiu Gosu wrote:
</i>>><i> Yes, I did reformat it (even more than once, I think, last week). This
</i>>><i> is a pre-production system and I'm trying various options before
</i>>><i> moving into real life.
</i>>><i>
</i>>><i>
</i>>><i> On 10/19/2011 01:19, Sunil Mushran wrote:
</i>>>><i> Did you reformat the volume recently? or, when did you format last?
</i>>>><i>
</i>>>><i> On 10/18/2011 03:13 PM, Laurentiu Gosu wrote:
</i>>>>><i> well..this is weird
</i>>>>><i> ls /sys/kernel/config/cluster/CLUSTER/heartbeat/
</i>>>>><i> *918673F06F8F4ED188DDCE14F39945F6* dead_threshold
</i>>>>><i>
</i>>>>><i> looks like we have different UUIDs. Where is this coming from??
</i>>>>><i>
</i>>>>><i> ocfs2_hb_ctl -I -u 918673F06F8F4ED188DDCE14F39945F6
</i>>>>><i> 918673F06F8F4ED188DDCE14F39945F6: 1 refs
</i>>>>><i>
</i>>>>><i>
</i>>>>><i> On 10/19/2011 01:04, Sunil Mushran wrote:
</i>>>>>><i> Let's do it by hand.
</i>>>>>><i> rm -rf
</i>>>>>><i> /sys/kernel/config/cluster/.../heartbeat/*0C4AB55FE9314FA5A9F81652FDB9B22D
</i>>>>>><i> *
</i>>>>>><i>
</i>>>>>><i> On 10/18/2011 02:52 PM, Laurentiu Gosu wrote:
</i>>>>>>><i> ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D
</i>>>>>>><i> ocfs2_hb_ctl: File not found by ocfs2_lookup while stopping
</i>>>>>>><i> heartbeat
</i>>>>>>><i>
</i>>>>>>><i> No improvment :(
</i>>>>>>><i>
</i>>>>>>><i>
</i>>>>>>><i> On 10/19/2011 00:50, Sunil Mushran wrote:
</i>>>>>>>><i> See if this cleans it up.
</i>>>>>>>><i> ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D
</i>>>>>>>><i>
</i>>>>>>>><i> On 10/18/2011 02:44 PM, Laurentiu Gosu wrote:
</i>>>>>>>>><i> ocfs2_hb_ctl -I -u 0C4AB55FE9314FA5A9F81652FDB9B22D
</i>>>>>>>>><i> 0C4AB55FE9314FA5A9F81652FDB9B22D: 0 refs
</i>>>>>>>>><i>
</i>>>>>>>>><i>
</i>>>>>>>>><i> On 10/19/2011 00:43, Sunil Mushran wrote:
</i>>>>>>>>>><i> ocfs2_hb_ctl -l -u 0C4AB55FE9314FA5A9F81652FDB9B22D
</i>>>>>>>>>><i>
</i>>>>>>>>>><i> On 10/18/2011 02:40 PM, Laurentiu Gosu wrote:
</i>>>>>>>>>>><i> mounted.ocfs2 -d
</i>>>>>>>>>>><i> Device FS Stack
</i>>>>>>>>>>><i> UUID Label
</i>>>>>>>>>>><i> /dev/mapper/volgr1-lvol0 ocfs2 o2cb
</i>>>>>>>>>>><i> 0C4AB55FE9314FA5A9F81652FDB9B22D ocfs2
</i>>>>>>>>>>><i>
</i>>>>>>>>>>><i> mounted.ocfs2 -f
</i>>>>>>>>>>><i> Device FS Nodes
</i>>>>>>>>>>><i> /dev/mapper/volgr1-lvol0 ocfs2 ro02xsrv001
</i>>>>>>>>>>><i>
</i>>>>>>>>>>><i> ro02xsrv001 = the other node in the cluster.
</i>>>>>>>>>>><i>
</i>>>>>>>>>>><i> By the way, there is no /dev/md-2
</i>>>>>>>>>>><i> ls /dev/dm-*
</i>>>>>>>>>>><i> /dev/dm-0 /dev/dm-1
</i>>>>>>>>>>><i>
</i>>>>>>>>>>><i>
</i>>>>>>>>>>><i> On 10/19/2011 00:37, Sunil Mushran wrote:
</i>>>>>>>>>>>><i> So it is not mounted. But we still have a hb thread because
</i>>>>>>>>>>>><i> hb could not be stopped during umount. The reason for that
</i>>>>>>>>>>>><i> could be the same that causes ocfs2_hb_ctl to fail.
</i>>>>>>>>>>>><i>
</i>>>>>>>>>>>><i> Do:
</i>>>>>>>>>>>><i> mounted.ocfs2 -d
</i>>>>>>>>>>>><i>
</i>>>>>>>>>>>><i> On 10/18/2011 02:32 PM, Laurentiu Gosu wrote:
</i>>>>>>>>>>>>><i> ls -lR /sys/kernel/debug/ocfs2
</i>>>>>>>>>>>>><i> /sys/kernel/debug/ocfs2:
</i>>>>>>>>>>>>><i> total 0
</i>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>><i> ls -lR /sys/kernel/debug/o2dlm
</i>>>>>>>>>>>>><i> /sys/kernel/debug/o2dlm:
</i>>>>>>>>>>>>><i> total 0
</i>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>><i> ocfs2_hb_ctl -I -d /dev/dm-2
</i>>>>>>>>>>>>><i> ocfs2_hb_ctl: Device name specified was not found while
</i>>>>>>>>>>>>><i> reading uuid
</i>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>><i> There is no /dev/dm-2 mounted.
</i>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>><i> On 10/19/2011 00:27, Sunil Mushran wrote:
</i>>>>>>>>>>>>>><i> mount -t debugfs debugfs /sys/kernel/debug
</i>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>><i> Then list that dir.
</i>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>><i> Also, do:
</i>>>>>>>>>>>>>><i> ocfs2_hb_ctl -l -d /dev/dm-2
</i>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>><i> Be careful before killing. We want to be sure that dev is
</i>>>>>>>>>>>>>><i> not mounted.
</i>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>><i> On 10/18/2011 02:23 PM, Laurentiu Gosu wrote:
</i>>>>>>>>>>>>>>><i> Again the outputs:
</i>>>>>>>>>>>>>>><i> cat
</i>>>>>>>>>>>>>>><i> /sys/kernel/config/cluster/CLUSTER/heartbeat/918673F06F8F4ED188DDCE14F39945F6/dev
</i>>>>>>>>>>>>>>><i> dm-2
</i>>>>>>>>>>>>>>><i> --->here should be volgr1-lvol0 i guess?
</i>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>><i> ls -lR /sys/kernel/debug/ocfs2
</i>>>>>>>>>>>>>>><i> ls: /sys/kernel/debug/ocfs2: No such file or directory
</i>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>><i> ls -lR /sys/kernel/debug/o2dlm
</i>>>>>>>>>>>>>>><i> ls: /sys/kernel/debug/o2dlm: No such file or directory
</i>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>><i> I think i have to enable debug first somehow..?
</i>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>><i> Laurentiu.
</i>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>><i> On 10/19/2011 00:17, Sunil Mushran wrote:
</i>>>>>>>>>>>>>>>><i> What does this return?
</i>>>>>>>>>>>>>>>><i> cat
</i>>>>>>>>>>>>>>>><i> /sys/kernel/config/cluster/CLUSTER/heartbeat/918673F06F8F4ED188DDCE14F39945F6/dev
</i>>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>>><i> Also, do:
</i>>>>>>>>>>>>>>>><i> ls -lR /sys/kernel/debug/ocfs2
</i>>>>>>>>>>>>>>>><i> ls -lR /sys/kernel/debug/o2dlm
</i>>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>>><i> On 10/18/2011 02:14 PM, Laurentiu Gosu wrote:
</i>>>>>>>>>>>>>>>>><i> Here is the output:
</i>>>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>>>><i> ls -lR /sys/kernel/config/cluster
</i>>>>>>>>>>>>>>>>><i> /sys/kernel/config/cluster:
</i>>>>>>>>>>>>>>>>><i> total 0
</i>>>>>>>>>>>>>>>>><i> drwxr-xr-x 4 root root 0 Oct 19 00:12 CLUSTER
</i>>>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>>>><i> /sys/kernel/config/cluster/CLUSTER:
</i>>>>>>>>>>>>>>>>><i> total 0
</i>>>>>>>>>>>>>>>>><i> -rw-r--r-- 1 root root 4096 Oct 19 00:12 fence_method
</i>>>>>>>>>>>>>>>>><i> drwxr-xr-x 3 root root 0 Oct 19 00:12 heartbeat
</i>>>>>>>>>>>>>>>>><i> -rw-r--r-- 1 root root 4096 Oct 19 00:12 idle_timeout_ms
</i>>>>>>>>>>>>>>>>><i> -rw-r--r-- 1 root root 4096 Oct 19 00:12
</i>>>>>>>>>>>>>>>>><i> keepalive_delay_ms
</i>>>>>>>>>>>>>>>>><i> drwxr-xr-x 4 root root 0 Oct 11 20:23 node
</i>>>>>>>>>>>>>>>>><i> -rw-r--r-- 1 root root 4096 Oct 19 00:12
</i>>>>>>>>>>>>>>>>><i> reconnect_delay_ms
</i>>>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>>>><i> /sys/kernel/config/cluster/CLUSTER/heartbeat:
</i>>>>>>>>>>>>>>>>><i> total 0
</i>>>>>>>>>>>>>>>>><i> drwxr-xr-x 2 root root 0 Oct 19 00:12
</i>>>>>>>>>>>>>>>>><i> 918673F06F8F4ED188DDCE14F39945F6
</i>>>>>>>>>>>>>>>>><i> -rw-r--r-- 1 root root 4096 Oct 19 00:12 dead_threshold
</i>>>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>>>><i> /sys/kernel/config/cluster/CLUSTER/heartbeat/*918673F06F8F4ED188DDCE14F39945F6*:
</i>>>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>>>><i> total 0
</i>>>>>>>>>>>>>>>>><i> -rw-r--r-- 1 root root 4096 Oct 19 00:12 block_bytes
</i>>>>>>>>>>>>>>>>><i> -rw-r--r-- 1 root root 4096 Oct 19 00:12 blocks
</i>>>>>>>>>>>>>>>>><i> -rw-r--r-- 1 root root 4096 Oct 19 00:12 dev
</i>>>>>>>>>>>>>>>>><i> -r--r--r-- 1 root root 4096 Oct 19 00:12 pid
</i>>>>>>>>>>>>>>>>><i> -rw-r--r-- 1 root root 4096 Oct 19 00:12 start_block
</i>>>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>>>><i> /sys/kernel/config/cluster/CLUSTER/node:
</i>>>>>>>>>>>>>>>>><i> total 0
</i>>>>>>>>>>>>>>>>><i> drwxr-xr-x 2 root root 0 Oct 19 00:12 ro02xsrv001
</i>>>>>>>>>>>>>>>>><i> drwxr-xr-x 2 root root 0 Oct 19 00:12 ro02xsrv002
</i>>>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>>>><i> /sys/kernel/config/cluster/CLUSTER/node/ro02xsrv001:
</i>>>>>>>>>>>>>>>>><i> total 0
</i>>>>>>>>>>>>>>>>><i> -rw-r--r-- 1 root root 4096 Oct 19 00:12 ipv4_address
</i>>>>>>>>>>>>>>>>><i> -rw-r--r-- 1 root root 4096 Oct 19 00:12 ipv4_port
</i>>>>>>>>>>>>>>>>><i> -rw-r--r-- 1 root root 4096 Oct 19 00:12 local
</i>>>>>>>>>>>>>>>>><i> -rw-r--r-- 1 root root 4096 Oct 19 00:12 num
</i>>>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>>>><i> /sys/kernel/config/cluster/CLUSTER/node/ro02xsrv002:
</i>>>>>>>>>>>>>>>>><i> total 0
</i>>>>>>>>>>>>>>>>><i> -rw-r--r-- 1 root root 4096 Oct 19 00:12 ipv4_address
</i>>>>>>>>>>>>>>>>><i> -rw-r--r-- 1 root root 4096 Oct 19 00:12 ipv4_port
</i>>>>>>>>>>>>>>>>><i> -rw-r--r-- 1 root root 4096 Oct 19 00:12 local
</i>>>>>>>>>>>>>>>>><i> -rw-r--r-- 1 root root 4096 Oct 19 00:12 num
</i>>>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>>>><i> On 10/19/2011 00:12, Sunil Mushran wrote:
</i>>>>>>>>>>>>>>>>>><i> ls -lR /sys/kernel/config/cluster
</i>>>>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>>>>><i> What does this return?
</i>>>>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>>>>><i> On 10/18/2011 02:05 PM, Laurentiu Gosu wrote:
</i>>>>>>>>>>>>>>>>>>><i> Hi,
</i>>>>>>>>>>>>>>>>>>><i> I have a 2 nodes ocfs2 cluster running UEK
</i>>>>>>>>>>>>>>>>>>><i> 2.6.32-100.0.19.el5,
</i>>>>>>>>>>>>>>>>>>><i> ocfs2console-1.6.3-2.el5, ocfs2-tools-1.6.3-2.el5.
</i>>>>>>>>>>>>>>>>>>><i> My problem is that all the time when i try to run
</i>>>>>>>>>>>>>>>>>>><i> /etc/init.d/o2cb stop
</i>>>>>>>>>>>>>>>>>>><i> it fails with this error:
</i>>>>>>>>>>>>>>>>>>><i> Stopping O2CB cluster CLUSTER: Failed
</i>>>>>>>>>>>>>>>>>>><i> Unable to stop cluster as heartbeat region
</i>>>>>>>>>>>>>>>>>>><i> still active
</i>>>>>>>>>>>>>>>>>>><i> There is no active mount point. I tried to manually
</i>>>>>>>>>>>>>>>>>>><i> stop the heartdbeat
</i>>>>>>>>>>>>>>>>>>><i> with "ocfs2_hb_ctl -K -d /dev/mapper/volgr1-lvol0
</i>>>>>>>>>>>>>>>>>>><i> ocfs2" (after finding
</i>>>>>>>>>>>>>>>>>>><i> the refs number with "ocfs2_hb_ctl -I -d
</i>>>>>>>>>>>>>>>>>>><i> /dev/mapper/volgr1-lvol0 ").
</i>>>>>>>>>>>>>>>>>>><i> But even if refs number is set to zero the "heartbeat
</i>>>>>>>>>>>>>>>>>>><i> region still
</i>>>>>>>>>>>>>>>>>>><i> active" occurs.
</i>>>>>>>>>>>>>>>>>>><i> How can i fix this?
</i>>>>>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>>>>>><i> Thank you in advance.
</i>>>>>>>>>>>>>>>>>>><i> Laurentiu.
</i>>>>>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>>>>>><i> _______________________________________________
</i>>>>>>>>>>>>>>>>>>><i> Ocfs2-users mailing list
</i>>>>>>>>>>>>>>>>>>><i> <a href="http://oss.oracle.com/mailman/listinfo/ocfs2-users">Ocfs2-users at oss.oracle.com</a>
</i>>>>>>>>>>>>>>>>>>><i> <a href="http://oss.oracle.com/mailman/listinfo/ocfs2-users">http://oss.oracle.com/mailman/listinfo/ocfs2-users</a>
</i>>>>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>>><i>
</i>>>>>>>>>>>>><i>
</i>>>>>>>>>>>><i>
</i>>>>>>>>>>><i>
</i>>>>>>>>>><i>
</i>>>>>>>>><i>
</i>>>>>>>><i>
</i>>>>>>><i>
</i>>>>>><i>
</i>>>>><i>
</i>>>><i>
</i>>><i>
</i>></pre>
</body>
</html>