[Ocfs2-users] 2 node cluster with shared LUN via FC

Sérgio Surkamp sergio at gruposinternet.com.br
Thu Nov 4 07:03:40 PDT 2010


It seems that o2net (the network stack) is not running; otherwise you
would see its connection messages in dmesg. Something like:

xen02a kernel: o2net: connected to node xen02b (num 1) at 10.0.0.102:7777
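
A minimal sketch of that check, assuming a POSIX shell with grep. The
function name o2net_connections is made up for illustration, and the
pattern is simply the message text from the example line above:

```shell
# Hypothetical helper: read kernel-log text on stdin and print only the
# o2net connection messages. Prints nothing if no such message is present.
o2net_connections() {
    grep 'o2net: connected to node'
}

# Usage on each node (reading the kernel log may require root):
#   dmesg | o2net_connections
```

If this prints nothing on either node after both have mounted the
volume, the nodes most likely never connected to each other. Keep in
mind that old messages can drop out of the kernel ring buffer.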

Check your firewall and network configuration. Also check that the
[o2net] kernel thread is running and that TCP port 7777 is listening on
both nodes. If the thread is not running, check that all the required
kernel modules are loaded:

ocfs2
jbd
ocfs2_dlm
ocfs2_dlmfs
ocfs2_nodemanager
configfs
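
The module and port checks could be scripted roughly as follows. This
is only a sketch, assuming a POSIX shell with awk and grep; the helper
names missing_modules and port_listening are hypothetical, and the port
check assumes the local address is the fourth column, as in the usual
`ss -tln` and `netstat -tln` layouts:

```shell
# Modules named in the list above.
REQUIRED_MODULES="ocfs2 jbd ocfs2_dlm ocfs2_dlmfs ocfs2_nodemanager configfs"

# Hypothetical helper: read `lsmod` output on stdin and print every
# required module whose name is missing from the first column.
missing_modules() {
    # Skip the header line and keep only the module names.
    loaded=$(awk 'NR > 1 { print $1 }')
    for m in $REQUIRED_MODULES; do
        printf '%s\n' "$loaded" | grep -qxF "$m" || printf '%s\n' "$m"
    done
}

# Hypothetical helper: read `ss -tln` or `netstat -tln` output on stdin
# and succeed if a local address (column 4) ends in the given port.
# The port defaults to 7777, the ip_port used in cluster.conf.
port_listening() {
    port=${1:-7777}
    awk -v p=":$port" '$4 ~ (p "$") { found = 1 } END { exit !found }'
}

# Usage on each node:
#   lsmod | missing_modules
#   ss -tln | port_listening && echo "port 7777 is listening"
#   ps -e | grep o2net
```

A module that is compiled into the kernel does not appear in lsmod, so
a name reported as missing is a hint to look closer rather than proof
of a problem.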

Regards,
Sérgio

On Thu, 04 Nov 2010 14:12:11 +0100
Manuel Bogner <manuel.bogner at geizhals.at> wrote:

> Sorry for the repost, but I just saw that I mixed German and English.
> Here is the corrected version:
> 
> 
> 
> Hi,
> 
> I'm trying to create a cluster out of 2 nodes. Both systems share the
> same LUN via FC and see it as /dev/sdd.
> 
> /dev/sdd has one partition
> 
> Disk /dev/sdd: 21.4 GB, 21474836480 bytes
> 64 heads, 32 sectors/track, 20480 cylinders
> Units = cylinders of 2048 * 512 = 1048576 bytes
> Disk identifier: 0xc29cb93d
> 
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdd1               1       20480    20971504   83  Linux
> 
> which is formatted with
> 
>   mkfs.ocfs2 -L ocfs2 /dev/sdd1
> 
> 
> Here is my /etc/ocfs2/cluster.conf
> 
> node:
>     ip_port = 7777
>     ip_address = 10.0.0.168
>     number = 0
>     name = xen02a
>     cluster = ocfs2
> 
> node:
>     ip_port = 7777
>     ip_address = 10.0.0.102
>     number = 1
>     name = xen02b
>     cluster = ocfs2
> 
> cluster:
>     node_count = 2
>     name = ocfs2
> 
> 
> Everything seems to be fine:
> 
> xen02a:~# /etc/init.d/o2cb status
> Driver for "configfs": Loaded
> Filesystem "configfs": Mounted
> Stack glue driver: Loaded
> Stack plugin "o2cb": Loaded
> Driver for "ocfs2_dlmfs": Loaded
> Filesystem "ocfs2_dlmfs": Mounted
> Checking O2CB cluster ocfs2: Online
> Heartbeat dead threshold = 31
>   Network idle timeout: 30000
>   Network keepalive delay: 2000
>   Network reconnect delay: 2000
> Checking O2CB heartbeat: Active
> 
> And mounting the fs on each node works fine:
> 
> /dev/sdd1 on /shared type ocfs2 (rw,_netdev,heartbeat=local)
> 
> Both nodes can ping each other.
> 
> 
> xen02a:~# mounted.ocfs2 -d
> Device                FS     UUID                                  Label
> /dev/sdd1             ocfs2  55a9d0b0-050c-484f-9725-7788a3b9dde0  ocfs2
> 
> xen02b:~# mounted.ocfs2 -d
> Device                FS     UUID                                  Label
> /dev/sdd1             ocfs2  55a9d0b0-050c-484f-9725-7788a3b9dde0  ocfs2
> 
> 
> Now the problem:
> 
> I first mount the device on the first node (xen02a):
> 
>  xen02a:~# mount -L ocfs2 /shared/
> => /dev/sdd1 on /shared type ocfs2 (rw,_netdev,heartbeat=local)
> without any errors.
> 
> dmesg says:
> 
> [   97.244054] ocfs2_dlm: Nodes in domain ("55A9D0B0050C484F97257788A3B9DDE0"): 0
> [   97.245869] kjournald starting.  Commit interval 5 seconds
> [   97.247045] ocfs2: Mounting device (8,49) on (node 0, slot 0) with ordered data mode.
> 
> xen02a:~# mounted.ocfs2 -f
> Device                FS     Nodes
> /dev/sdd1             ocfs2  xen02a
> 
> xen02a:~# echo "slotmap" | debugfs.ocfs2 -n /dev/sdd1
> 	Slot#   Node#
> 	    0       0
> 
> 
> Now I mount the device on the second node:
> 
> xen02b:~# mount -L ocfs2 /shared/
> => /dev/sdd1 on /shared type ocfs2 (rw,_netdev,heartbeat=local)
> 
> [  269.741168] OCFS2 1.5.0
> [  269.765171] ocfs2_dlm: Nodes in domain ("55A9D0B0050C484F97257788A3B9DDE0"): 1
> [  269.779620] kjournald starting.  Commit interval 5 seconds
> [  269.779620] ocfs2: Mounting device (8,49) on (node 1, slot 1) with ordered data mode.
> [  269.779620] (2953,0):ocfs2_replay_journal:1149 Recovering node 0 from slot 0 on device (8,49)
> [  270.950540] kjournald starting.  Commit interval 5 seconds
> 
> xen02b:~# echo "slotmap" | debugfs.ocfs2 -n /dev/sdd1
> 	Slot#   Node#
> 	    1       1
> 
> xen02b:~# mounted.ocfs2 -f
> Device                FS     Nodes
> /dev/sdd1             ocfs2  xen02b
> 
> 
> So the first mount seems to be gone and any changes on the fs on that
> node are not distributed.
> 
> At the moment I have no idea what this could be. I hope someone can
> help me.
> 
> regards,
> Manuel
> 
> _______________________________________________
> Ocfs2-users mailing list
> Ocfs2-users at oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-users


-- 
  .:''''':.
.:'        `     Sérgio Surkamp | Gerente de Rede
::    ........   sergio at gruposinternet.com.br
`:.        .:'
  `:,   ,.:'     *Grupos Internet S.A.*
    `: :'        R. Lauro Linhares, 2123 Torre B - Sala 201
     : :         Trindade - Florianópolis - SC
     :.'
     ::          +55 48 3234-4109
     :
     '           http://www.gruposinternet.com.br


