<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    #rpm -qa |grep ocfs2<br>
    ocfs2console-1.6.3-2.el5<br>
    ocfs2-tools-1.6.3-2.el5<br>
    <br>
    Just let me know if I can provide more details to help find the
    problem. I will move ocfs2 into production in the next few weeks.<br>
    <br>
    <br>
    On 10/23/2011 22:49, Sunil Mushran wrote:
    <blockquote cite="mid:4EA46FC0.3090505@oracle.com" type="cite">
      Are you sure you have ocfs2-tools-1.6.3? I remember we had an<br>
      issue with this with an earlier release... 1.6.1/.2.<br>
      <br>
      On 10/23/2011 10:43 AM, Laurentiu Gosu wrote:
      <blockquote cite="mid:4EA45258.2030309@easic.ro" type="cite">
        hmm..<br>
        #ocfs2_hb_ctl -I -u 0C4AB55FE9314FA5A9F81652FDB9B22D<br>
        0C4AB55FE9314FA5A9F81652FDB9B22D: 1 refs<br>
        <b>BUT:</b><br>
        #ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D ocfs2<br>
        ocfs2_hb_ctl: File not found by ocfs2_lookup while stopping
        heartbeat<br>
        I can still kill the ref using device name (-d).<br>
        <br>
        On 10/23/2011 17:57, Sunil Mushran wrote:
        <blockquote cite="mid:4EA42B41.9070607@oracle.com" type="cite">
          I think it stops by uuid. So try doing this the next time.<br>
          You are encountering some issue that we have not seen before.<br>
          ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D ocfs2<br>
          <br>
          On 10/23/2011 05:32 AM, Laurentiu Gosu wrote:
          <blockquote cite="mid:4EA40944.8080106@easic.ro" type="cite">
            Hi Sunil,<br>
            Sorry for my late reply, I only had time today to start from
            scratch and test.<br>
            I rebuilt my environment (2 nodes connected to a SAN via
            iSCSI+multipath). I still have the issue that the heartbeat
            is active after I umount my ocfs2 volume. <br>
            /etc/init.d/o2cb stop<br>
            Stopping O2CB cluster CLUST: Failed<br>
            Unable to stop cluster as heartbeat region still active<br>
            <br>
            ocfs2_hb_ctl -I -d /dev/mapper/volgr1-lvol0<br>
            0C4AB55FE9314FA5A9F81652FDB9B22D: 1 refs<br>
            <br>
            After I manually kill the ref (ocfs2_hb_ctl -K -d
            /dev/mapper/volgr1-lvol0 ocfs2), I can stop o2cb
            successfully. I can live with that, but why doesn't it stop
            automatically? As I understand it, the heartbeat should be
            started and stopped when the volume gets mounted/unmounted.<br>
            <br>
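            For what it's worth, the manual workaround could be scripted
            roughly as below. This is only a sketch: the function names
            hb_refs and stop_o2cb_clean are made up for illustration, and it
            assumes "ocfs2_hb_ctl -I" prints a single "UUID: N refs" line,
            as in the output shown above.<br>
<pre>
```shell
# Hypothetical helper: extract the reference count from one line of
# "ocfs2_hb_ctl -I" output, e.g. "0C4AB55FE9314FA5A9F81652FDB9B22D: 1 refs".
hb_refs() {
    line=$1
    count=${line##*: }      # drop everything up to the last ": "
    count=${count%% refs*}  # drop the trailing " refs"
    echo "$count"
}

# Hypothetical wrapper: kill a leftover heartbeat reference, then stop o2cb.
# The device path is passed in by the caller; nothing here is specific to
# one cluster. Only run this once the volume is confirmed to be unmounted.
stop_o2cb_clean() {
    dev=$1
    info=$(ocfs2_hb_ctl -I -d "$dev") || return 1
    if [ "$(hb_refs "$info")" -gt 0 ]; then
        ocfs2_hb_ctl -K -d "$dev" ocfs2 || return 1
    fi
    /etc/init.d/o2cb stop
}
```
</pre>
            Called as "stop_o2cb_clean /dev/mapper/volgr1-lvol0", it kills
            the reference only when the count is non-zero, so a clean
            unmount is left alone.<br>
            <br>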
            br,<br>
            Laurentiu.<br>
            <br>
            On 10/19/2011 02:28, Sunil Mushran wrote:
            <blockquote cite="mid:4E9E0B92.8060104@oracle.com"
              type="cite">
              Manual delete will only work if there are no references.
              In your case<br>
              there are references.<br>
              <br>
              You may want to start both nodes from scratch. Do not
              start/stop<br>
              heartbeat manually. Also, do not force-format.<br>
              <br>
              On 10/18/2011 03:54 PM, Laurentiu Gosu wrote:
              <blockquote cite="mid:4E9E03B7.4080603@easic.ro"
                type="cite">
                OK, I rebooted one of the nodes (both had similar
                issues). But something is still fishy.<br>
                - I mounted the device: mount -t ocfs2 /dev/volgr1/lvol0
                /mnt/tmp/<br>
                - I unmounted it: umount /mnt/tmp/<br>
                - I tried to stop o2cb:&nbsp; /etc/init.d/o2cb stop<br>
                Stopping O2CB cluster CLUSTER: Failed<br>
                Unable to stop cluster as heartbeat region still active<br>
                - ocfs2_hb_ctl -I -u 0C4AB55FE9314FA5A9F81652FDB9B22D<br>
                0C4AB55FE9314FA5A9F81652FDB9B22D: 1 refs<br>
                -&nbsp; ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D<br>
                ocfs2_hb_ctl: File not found by ocfs2_lookup while
                stopping heartbeat<br>
                - ls -Rl /sys/kernel/config/cluster/CLUSTER/heartbeat/<br>
                /sys/kernel/config/cluster/CLUSTER/heartbeat/:<br>
                total 0<br>
                drwxr-xr-x 2 root root&nbsp;&nbsp;&nbsp; 0 Oct 19 01:50
                0C4AB55FE9314FA5A9F81652FDB9B22D<br>
                -rw-r--r-- 1 root root 4096 Oct 19 01:40 dead_threshold<br>
                <br>
/sys/kernel/config/cluster/CLUSTER/heartbeat/0C4AB55FE9314FA5A9F81652FDB9B22D:<br>
                total 0<br>
                -rw-r--r-- 1 root root 4096 Oct 19 01:50 block_bytes<br>
                -rw-r--r-- 1 root root 4096 Oct 19 01:50 blocks<br>
                -rw-r--r-- 1 root root 4096 Oct 19 01:50 dev<br>
                -r--r--r-- 1 root root 4096 Oct 19 01:50 pid<br>
                -rw-r--r-- 1 root root 4096 Oct 19 01:50 start_block<br>
                <br>
                - I cannot manually delete
/sys/kernel/config/cluster/CLUSTER/heartbeat/0C4AB55FE9314FA5A9F81652FDB9B22D/<br>
                <br>
                PS: I'm going to sleep now, I have to be up in a few
                hours. We can continue tomorrow if that's OK with you. <br>
                Thank you for your help.<br>
                <br>
                Laurentiu.<br>
                <br>
                On 10/19/2011 01:33, Sunil Mushran wrote:
                <blockquote cite="mid:4E9DFEB0.9010206@oracle.com"
                  type="cite">
                  One way this can happen is if one starts the hb
                  manually and then force-formats<br>
                  that volume. The format will generate a new
                  uuid. Once that<br>
                  happens, the hb tool cannot map the region to the
                  device and thus fails<br>
                  to stop it. Right now the easiest option on this box
                  is resetting it.<br>
                  <br>
                  On 10/18/2011 03:24 PM, Laurentiu Gosu wrote:
                  <blockquote cite="mid:4E9DFC93.1050109@easic.ro"
                    type="cite">
                    Yes, I did reformat it (more than once, I think,
                    last week). This is a pre-production system and I'm
                    trying various options before moving into real life.<br>
                    <br>
                    <br>
                    On 10/19/2011 01:19, Sunil Mushran wrote:
                    <blockquote cite="mid:4E9DFB83.40603@oracle.com"
                      type="cite">
                      Did you reformat the volume recently? or, when did
                      you format last?<br>
                      <br>
                      On 10/18/2011 03:13 PM, Laurentiu Gosu wrote:
                      <blockquote cite="mid:4E9DFA03.8030405@easic.ro"
                        type="cite">
                        well..this is weird<br>
                        ls /sys/kernel/config/cluster/CLUSTER/heartbeat/<br>
                        <b>918673F06F8F4ED188DDCE14F39945F6</b>&nbsp;
                        dead_threshold<br>
                        <br>
                        Looks like we have different UUIDs. Where is
                        this one coming from?<br>
                        <br>
                        ocfs2_hb_ctl -I -u
                        918673F06F8F4ED188DDCE14F39945F6<br>
                        918673F06F8F4ED188DDCE14F39945F6: 1 refs<br>
                        <br>
                        <br>
                        On 10/19/2011 01:04, Sunil Mushran wrote:
                        <blockquote
                          cite="mid:4E9DF7D0.7090404@oracle.com"
                          type="cite">Let's do it by hand. <br>
                          rm -rf
                          /sys/kernel/config/cluster/.../heartbeat/<b>0C4AB55FE9314FA5A9F81652FDB9B22D</b><br>
                          <br>
                          On 10/18/2011 02:52 PM, Laurentiu Gosu wrote:
                          <br>
                          <blockquote type="cite">&nbsp;ocfs2_hb_ctl -K -u
                            0C4AB55FE9314FA5A9F81652FDB9B22D <br>
                            ocfs2_hb_ctl: File not found by ocfs2_lookup
                            while stopping heartbeat <br>
                            <br>
                            No improvement :( <br>
                            <br>
                            <br>
                            On 10/19/2011 00:50, Sunil Mushran wrote: <br>
                            <blockquote type="cite">See if this cleans
                              it up. <br>
                              ocfs2_hb_ctl -K -u
                              0C4AB55FE9314FA5A9F81652FDB9B22D <br>
                              <br>
                              On 10/18/2011 02:44 PM, Laurentiu Gosu
                              wrote: <br>
                              <blockquote type="cite">ocfs2_hb_ctl -I -u
                                0C4AB55FE9314FA5A9F81652FDB9B22D <br>
                                0C4AB55FE9314FA5A9F81652FDB9B22D: 0 refs
                                <br>
                                <br>
                                <br>
                                On 10/19/2011 00:43, Sunil Mushran
                                wrote: <br>
                                <blockquote type="cite">ocfs2_hb_ctl -I
                                  -u 0C4AB55FE9314FA5A9F81652FDB9B22D <br>
                                  <br>
                                  On 10/18/2011 02:40 PM, Laurentiu Gosu
                                  wrote: <br>
                                  <blockquote type="cite">mounted.ocfs2
                                    -d <br>
                                    Device&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; FS&nbsp;&nbsp;&nbsp;&nbsp; Stack&nbsp;
                                    UUID&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
                                    Label <br>
                                    /dev/mapper/volgr1-lvol0&nbsp; ocfs2&nbsp;
                                    o2cb&nbsp;&nbsp;
                                    0C4AB55FE9314FA5A9F81652FDB9B22D&nbsp;
                                    ocfs2 <br>
                                    <br>
                                    mounted.ocfs2 -f <br>
                                    Device&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; FS&nbsp;&nbsp;&nbsp;&nbsp; Nodes <br>
                                    /dev/mapper/volgr1-lvol0&nbsp; ocfs2&nbsp;
                                    ro02xsrv001 <br>
                                    <br>
                                    ro02xsrv001 = the other node in the
                                    cluster. <br>
                                    <br>
                                    By the way, there is no /dev/dm-2 <br>
                                    &nbsp;ls /dev/dm-* <br>
                                    /dev/dm-0&nbsp; /dev/dm-1 <br>
                                    <br>
                                    <br>
                                    On 10/19/2011 00:37, Sunil Mushran
                                    wrote: <br>
                                    <blockquote type="cite">So it is not
                                      mounted. But we still have a hb
                                      thread because <br>
                                      hb could not be stopped during
                                      umount. The reason for that <br>
                                      could be the same one that causes
                                      ocfs2_hb_ctl to fail. <br>
                                      <br>
                                      Do: <br>
                                      mounted.ocfs2 -d <br>
                                      <br>
                                      On 10/18/2011 02:32 PM, Laurentiu
                                      Gosu wrote: <br>
                                      <blockquote type="cite">ls -lR
                                        /sys/kernel/debug/ocfs2 <br>
                                        /sys/kernel/debug/ocfs2: <br>
                                        total 0 <br>
                                        <br>
                                        ls -lR /sys/kernel/debug/o2dlm <br>
                                        /sys/kernel/debug/o2dlm: <br>
                                        total 0 <br>
                                        <br>
                                        ocfs2_hb_ctl -I -d /dev/dm-2 <br>
                                        ocfs2_hb_ctl: Device name
                                        specified was not found while
                                        reading uuid <br>
                                        <br>
                                        There is no /dev/dm-2 mounted. <br>
                                        <br>
                                        <br>
                                        On 10/19/2011 00:27, Sunil
                                        Mushran wrote: <br>
                                        <blockquote type="cite">mount -t
                                          debugfs debugfs
                                          /sys/kernel/debug <br>
                                          <br>
                                          Then list that dir. <br>
                                          <br>
                                          Also, do: <br>
                                          ocfs2_hb_ctl -I -d /dev/dm-2 <br>
                                          <br>
                                          Be careful before killing. We
                                          want to be sure that dev is
                                          not mounted. <br>
                                          <br>
                                          On 10/18/2011 02:23 PM,
                                          Laurentiu Gosu wrote: <br>
                                          <blockquote type="cite">Again,
                                            the outputs: <br>
                                            &nbsp;cat
/sys/kernel/config/cluster/CLUSTER/heartbeat/918673F06F8F4ED188DDCE14F39945F6/dev<br>
                                            dm-2 <br>
                                            ---&gt; this should be
                                            volgr1-lvol0, I guess? <br>
                                            <br>
                                            ls -lR
                                            /sys/kernel/debug/ocfs2 <br>
                                            ls: /sys/kernel/debug/ocfs2:
                                            No such file or directory <br>
                                            <br>
                                            ls -lR
                                            /sys/kernel/debug/o2dlm <br>
                                            ls: /sys/kernel/debug/o2dlm:
                                            No such file or directory <br>
                                            <br>
                                            I think I have to enable
                                            debug first somehow? <br>
                                            <br>
                                            Laurentiu. <br>
                                            <br>
                                            On 10/19/2011 00:17, Sunil
                                            Mushran wrote: <br>
                                            <blockquote type="cite">What
                                              does this return? <br>
                                              cat
/sys/kernel/config/cluster/CLUSTER/heartbeat/918673F06F8F4ED188DDCE14F39945F6/dev<br>
                                              <br>
                                              Also, do: <br>
                                              ls -lR
                                              /sys/kernel/debug/ocfs2 <br>
                                              ls -lR
                                              /sys/kernel/debug/o2dlm <br>
                                              <br>
                                              On 10/18/2011 02:14 PM,
                                              Laurentiu Gosu wrote: <br>
                                              <blockquote type="cite">Here
                                                is the output: <br>
                                                <br>
                                                ls -lR
                                                /sys/kernel/config/cluster
                                                <br>
                                                /sys/kernel/config/cluster:
                                                <br>
                                                total 0 <br>
                                                drwxr-xr-x 4 root root 0
                                                Oct 19 00:12 CLUSTER <br>
                                                <br>
                                                /sys/kernel/config/cluster/CLUSTER:
                                                <br>
                                                total 0 <br>
                                                -rw-r--r-- 1 root root
                                                4096 Oct 19 00:12
                                                fence_method <br>
                                                drwxr-xr-x 3 root
                                                root&nbsp;&nbsp;&nbsp; 0 Oct 19 00:12
                                                heartbeat <br>
                                                -rw-r--r-- 1 root root
                                                4096 Oct 19 00:12
                                                idle_timeout_ms <br>
                                                -rw-r--r-- 1 root root
                                                4096 Oct 19 00:12
                                                keepalive_delay_ms <br>
                                                drwxr-xr-x 4 root
                                                root&nbsp;&nbsp;&nbsp; 0 Oct 11 20:23
                                                node <br>
                                                -rw-r--r-- 1 root root
                                                4096 Oct 19 00:12
                                                reconnect_delay_ms <br>
                                                <br>
                                                /sys/kernel/config/cluster/CLUSTER/heartbeat:
                                                <br>
                                                total 0 <br>
                                                drwxr-xr-x 2 root
                                                root&nbsp;&nbsp;&nbsp; 0 Oct 19 00:12
                                                918673F06F8F4ED188DDCE14F39945F6
                                                <br>
                                                -rw-r--r-- 1 root root
                                                4096 Oct 19 00:12
                                                dead_threshold <br>
                                                <br>
/sys/kernel/config/cluster/CLUSTER/heartbeat/<b>918673F06F8F4ED188DDCE14F39945F6</b>:
                                                <br>
                                                total 0 <br>
                                                -rw-r--r-- 1 root root
                                                4096 Oct 19 00:12
                                                block_bytes <br>
                                                -rw-r--r-- 1 root root
                                                4096 Oct 19 00:12 blocks
                                                <br>
                                                -rw-r--r-- 1 root root
                                                4096 Oct 19 00:12 dev <br>
                                                -r--r--r-- 1 root root
                                                4096 Oct 19 00:12 pid <br>
                                                -rw-r--r-- 1 root root
                                                4096 Oct 19 00:12
                                                start_block <br>
                                                <br>
                                                /sys/kernel/config/cluster/CLUSTER/node:
                                                <br>
                                                total 0 <br>
                                                drwxr-xr-x 2 root root 0
                                                Oct 19 00:12 ro02xsrv001
                                                <br>
                                                drwxr-xr-x 2 root root 0
                                                Oct 19 00:12 ro02xsrv002
                                                <br>
                                                <br>
                                                /sys/kernel/config/cluster/CLUSTER/node/ro02xsrv001:
                                                <br>
                                                total 0 <br>
                                                -rw-r--r-- 1 root root
                                                4096 Oct 19 00:12
                                                ipv4_address <br>
                                                -rw-r--r-- 1 root root
                                                4096 Oct 19 00:12
                                                ipv4_port <br>
                                                -rw-r--r-- 1 root root
                                                4096 Oct 19 00:12 local
                                                <br>
                                                -rw-r--r-- 1 root root
                                                4096 Oct 19 00:12 num <br>
                                                <br>
                                                /sys/kernel/config/cluster/CLUSTER/node/ro02xsrv002:
                                                <br>
                                                total 0 <br>
                                                -rw-r--r-- 1 root root
                                                4096 Oct 19 00:12
                                                ipv4_address <br>
                                                -rw-r--r-- 1 root root
                                                4096 Oct 19 00:12
                                                ipv4_port <br>
                                                -rw-r--r-- 1 root root
                                                4096 Oct 19 00:12 local
                                                <br>
                                                -rw-r--r-- 1 root root
                                                4096 Oct 19 00:12 num <br>
                                                <br>
                                                <br>
                                                <br>
                                                <br>
                                                On 10/19/2011 00:12,
                                                Sunil Mushran wrote: <br>
                                                <blockquote type="cite">ls
                                                  -lR
                                                  /sys/kernel/config/cluster
                                                  <br>
                                                  <br>
                                                  What does this return?
                                                  <br>
                                                  <br>
                                                  On 10/18/2011 02:05
                                                  PM, Laurentiu Gosu
                                                  wrote: <br>
                                                  <blockquote
                                                    type="cite">Hi, <br>
                                                    I have a 2-node ocfs2 cluster running UEK
                                                    2.6.32-100.0.19.el5, <br>
                                                    ocfs2console-1.6.3-2.el5, ocfs2-tools-1.6.3-2.el5. <br>
                                                    My problem is that every time I try to run
                                                    /etc/init.d/o2cb stop <br>
                                                    it fails with this error: <br>
                                                    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Stopping O2CB cluster CLUSTER: Failed <br>
                                                    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Unable to stop cluster as heartbeat region still active <br>
                                                    There is no active mount point. I tried to manually stop the heartbeat <br>
                                                    with "ocfs2_hb_ctl -K -d /dev/mapper/volgr1-lvol0 ocfs2" (after finding <br>
                                                    the refs number with "ocfs2_hb_ctl -I -d /dev/mapper/volgr1-lvol0"). <br>
                                                    But even when the refs number is zero, the "heartbeat region still <br>
                                                    active" error occurs. <br>
                                                    How can I fix this? <br>
                                                    <br>
                                                    Thank you in
                                                    advance. <br>
                                                    Laurentiu. <br>
                                                    <br>
                                                    <br>
                                                    _______________________________________________
                                                    <br>
                                                    Ocfs2-users mailing
                                                    list <br>
                                                    <a
                                                      moz-do-not-send="true"
class="moz-txt-link-abbreviated"
                                                      href="mailto:Ocfs2-users@oss.oracle.com">Ocfs2-users@oss.oracle.com</a>
                                                    <br>
                                                    <a
                                                      moz-do-not-send="true"
class="moz-txt-link-freetext"
                                                      href="http://oss.oracle.com/mailman/listinfo/ocfs2-users">http://oss.oracle.com/mailman/listinfo/ocfs2-users</a>
                                                    <br>
                                                  </blockquote>
                                                  <br>
                                                </blockquote>
                                                <br>
                                              </blockquote>
                                              <br>
                                            </blockquote>
                                            <br>
                                          </blockquote>
                                          <br>
                                        </blockquote>
                                        <br>
                                      </blockquote>
                                      <br>
                                    </blockquote>
                                    <br>
                                  </blockquote>
                                  <br>
                                </blockquote>
                                <br>
                              </blockquote>
                              <br>
                            </blockquote>
                            <br>
                          </blockquote>
                          <br>
                        </blockquote>
                        <br>
                      </blockquote>
                      <br>
                    </blockquote>
                    <br>
                  </blockquote>
                  <br>
                </blockquote>
                <br>
              </blockquote>
              <br>
            </blockquote>
            <br>
          </blockquote>
          <br>
        </blockquote>
        <br>
      </blockquote>
      <br>
    </blockquote>
    <br>
  </body>
</html>