[Ocfs2-users] Trouble getting node to re-join two node cluster (OCFS2/DRBD Primary/Primary)

Thu Sep 15 13:42:54 PDT 2011

open("/dev/drbd0", O_RDONLY|O_DIRECT) = -1 EMEDIUMTYPE (Wrong medium type)

drbd_open()
...
         if (mdev->state.role != R_PRIMARY) {
                 if (mode & FMODE_WRITE)
                         rv = -EROFS;
                 else if (!allow_oos)
                         rv = -EMEDIUMTYPE;
         }
...

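So the failure appears to be coming from DRBD rather than OCFS2. Going
by the excerpt above, a read-only open returns EMEDIUMTYPE only when
the device is not Primary on that node and the allow_oos module param
is 0. I have no idea what this param does. Also, I am reading current
mainline; 2.6.35 may be different.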
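
To make the branches in that excerpt easier to follow, here is a small
userspace sketch of the same decision. It is only a model written from
the lines quoted above, not the driver code: the enum, the function
name model_drbd_open and the boolean arguments are made up for
illustration, and the real drbd_open() does more than this. EMEDIUMTYPE
is a Linux-specific errno value.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's role constants. */
enum model_role { MODEL_SECONDARY, MODEL_PRIMARY };

/*
 * Userspace model of the check quoted from drbd_open().
 * Returns 0 if the open would be allowed, otherwise a negative
 * errno value, mirroring the "rv" variable in the excerpt.
 */
static int model_drbd_open(enum model_role role, bool want_write,
                           bool allow_oos)
{
    int rv = 0;

    if (role != MODEL_PRIMARY) {
        if (want_write)
            rv = -EROFS;        /* writes need Primary */
        else if (!allow_oos)
            rv = -EMEDIUMTYPE;  /* read-only open refused as well */
    }
    return rv;
}
```

With these assumptions, model_drbd_open(MODEL_SECONDARY, false, false)
is the only combination that yields -EMEDIUMTYPE, which matches the
O_RDONLY open in the strace line at the top.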

On 09/15/2011 01:26 PM, Mike Reid wrote:
> Hello all,
>
> ** I have also posted this in the pacemaker list, but I have a feeling it's
> more OCFS2 specific **
>
> We have a two-node cluster still in development that has been running fine
> for weeks (little to no traffic). I made some updates to our CIB recently,
> and everything seemed just fine.
>
> Yesterday I attempted to untar ~1.5GB to the OCFS2/DRBD volume, and once it
> was complete one of the nodes had become completely disconnected and I
> haven't been able to reconnect since.
>
> DRBD is working fine, everything is UpToDate and I can get both nodes in
> Primary/Primary, but when it comes down to starting OCFS2 and mounting the
> volume, I'm left with:
>
>> resFS:0_start_0 (node=node1, call=21, rc=1, status=complete): unknown error
> I am using "pcmk" as the cluster_stack, and letting Pacemaker control
> everything...
>
> The last time this happened, the only way I was able to resolve it was to
> reformat the device (via mkfs.ocfs2 -F). I don't think I should have to do
> this: the underlying blocks seem fine, and one of the nodes is running just
> fine. The (currently) unmounted node is staying in sync as far as DRBD is
> concerned.
>
> Here's some detail that hopefully will help, please let me know if there's
> anything else I can provide to help know the best way to get this node back
> "online":
>
>
> Ubuntu 10.10 / Kernel 2.6.35
>
> Pacemaker 1.0.9.1
> Corosync 1.2.1
> Cluster Agents 1.0.3 (Heartbeat)
> Cluster Glue 1.0.6
> OpenAIS 1.1.2
>
> DRBD 8.3.10
> OCFS2 1.5.0
>
> cat /sys/fs/ocfs2/cluster_stack = pcmk
>
> node1: mounted.ocfs2 -d
>
> Device                FS     UUID                                  Label
> /dev/sda3             ocfs2  fe4273e1-f866-4541-bbcf-66c5dfd496d6
>
> node2: mounted.ocfs2 -d
>
> Device                FS     UUID                                  Label
> /dev/sda3             ocfs2  d6f7cc6d-21d1-46d3-9792-bc650736a5ef
> /dev/drbd0            ocfs2  d6f7cc6d-21d1-46d3-9792-bc650736a5ef
>
> * NOTES:
> - Both nodes are identical, in fact one node is a direct mirror (hdd clone)
> - I have attached the CIB (crm configure edit contents) and mount trace