[Ocfs2-users] 2 node cluster with shared LUN via FC

Sérgio Surkamp sergio at gruposinternet.com.br
Thu Nov 4 11:12:50 PDT 2010


About the quota messages: it's a new feature of OCFS2 version 1.6. ;)

http://oss.oracle.com/projects/ocfs2/news/article_23.html
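
As a side note on the quoted logs: each node lists only itself in its
"Nodes in domain" line, which, as far as I can tell, suggests that the
two nodes had not joined the same DLM domain at mount time. Here is a
minimal Python sketch for checking that from unwrapped dmesg or syslog
lines. The function names (domain_members, sees_peer) are my own and
purely illustrative, and the regular expression assumes the message
format shown in this thread.

```python
import re

# Assumed message format, copied from the logs quoted in this thread:
#   ocfs2_dlm: Nodes in domain ("8CEAFACAAE3B4A9BB6AAC6A7664EE094"): 0
DOMAIN_RE = re.compile(
    r'ocfs2_dlm: Nodes in domain \("([0-9A-F]+)"\):((?: \d+)+)'
)


def domain_members(log_lines):
    """Map each domain UUID to the node numbers in its latest message.

    Expects one complete log entry per line (not wrapped by a mail client).
    """
    members = {}
    for line in log_lines:
        match = DOMAIN_RE.search(line)
        if match:
            # Later messages for the same domain overwrite earlier ones.
            members[match.group(1)] = {int(n) for n in match.group(2).split()}
    return members


def sees_peer(log_lines, expected_nodes):
    """Return True if some domain lists every expected node number."""
    wanted = set(expected_nodes)
    return any(wanted <= nodes for nodes in domain_members(log_lines).values())


if __name__ == "__main__":
    import sys

    # Hypothetical usage: dmesg | python check_domain.py
    ok = sees_peer(sys.stdin, [0, 1])
    print("both nodes in one domain" if ok else "nodes do not see each other")
```

With the lines quoted in this thread, each node's log would yield a
set containing only its own node number, so sees_peer(..., [0, 1])
returns False on both nodes.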

Regards,
Sérgio

On Thu, 04 Nov 2010 17:31:41 +0100
Manuel Bogner <manuel.bogner at geizhals.at> wrote:

> Hi,
> 
> I just upgraded to a bpo kernel 2.6.32-bpo.5-amd64 and now it logs the
> following:
> 
> Nov  4 17:27:37 localhost kernel: [  487.098196] ocfs2_dlm: Nodes in
> domain ("8CEAFACAAE3B4A9BB6AAC6A7664EE094"): 0
> Nov  4 17:27:37 localhost kernel: [  487.105327] ocfs2: Mounting
> device (8,49) on (node 0, slot 0) with ordered data mode.
> Nov  4 17:28:11 localhost kernel: [  521.163897] o2net: accepted
> connection from node xen02b (num 1) at 192.168.100.101:7777
> 
> 
> Nov  4 17:27:59 localhost kernel: [  577.338311] ocfs2_dlm: Nodes in
> domain ("8CEAFACAAE3B4A9BB6AAC6A7664EE094"): 1
> Nov  4 17:27:59 localhost kernel: [  577.351868] ocfs2: Mounting
> device (8,49) on (node 1, slot 1) with ordered data mode.
> Nov  4 17:27:59 localhost kernel: [  577.352241]
> (2287,2):ocfs2_replay_journal:1607 Recovering node 0 from slot 0 on
> device (8,49)
> Nov  4 17:28:00 localhost kernel: [  578.505783]
> (2287,0):ocfs2_begin_quota_recovery:376 Beginning quota recovery in
> slot 0
> Nov  4 17:28:00 localhost kernel: [  578.569121]
> (2241,0):ocfs2_finish_quota_recovery:569 Finishing quota recovery in
> slot 0
> Nov  4 17:28:11 localhost kernel: [  589.359996] o2net:
> connected to node xen02a (num 0) at 192.168.100.100:7777
> 
> Sequence of steps that produced the log above:
> 
> node1: mount
> node2: mount
> 
> The behavior is still the same, but now it also logs something about
> quota.
> 
> (I also changed the network port used for the cluster traffic. The
> nodes are now directly attached to each other.)
> 
> regards,
> Manuel
> 
> 
> On 2010-11-04 15:49, Manuel Bogner wrote:
> > Hi,
> > 
> > This could also be interesting: I ran mount /dev/sdd1 /shared/ on
> > both nodes at the same time, with the following log result:
> > 
> > [  331.158166] OCFS2 1.5.0
> > [  336.155577] ocfs2_dlm: Nodes in domain
> > ("55A9D0B0050C484F97257788A3B9DDE0"): 0
> > [  336.166327] kjournald starting.  Commit interval 5 seconds
> > [  336.166327] ocfs2: Mounting device (8,49) on (node 0, slot 1)
> > with ordered data mode.
> > [  336.166664] (3239,0):ocfs2_replay_journal:1149 Recovering node 1
> > from slot 0 on device (8,49)
> > [  337.350942] kjournald starting.  Commit interval 5 seconds
> > [  351.142229] o2net: accepted connection from node xen02b (num 1)
> > at 10.0.0.102:7777
> > [  495.059065] o2net: no longer connected to node xen02b (num 1) at
> > 10.0.0.102:7777
> > 
> > 
> > [ 4841.036991] ocfs2_dlm: Nodes in domain
> > ("55A9D0B0050C484F97257788A3B9DDE0"): 1
> > [ 4841.039225] kjournald starting.  Commit interval 5 seconds
> > [ 4841.039997] ocfs2: Mounting device (8,49) on (node 1, slot 0)
> > with ordered data mode.
> > [ 4862.033837] o2net: connected to node xen02a (num 0) at
> > 10.0.0.168:7777
> > [ 5005.996422] o2net: no longer connected to node xen02a (num 0) at
> > 10.0.0.168:7777
> > [ 5005.998393] ocfs2: Unmounting device (8,49) on (node 1)
> > 
> > 
> > In the end, xen02a was the only node that had it mounted.
> > 
> > regards,
> > Manuel
> > 
> > 
> > On 2010-11-04 15:14, Manuel Bogner wrote:
> >> Hi Sérgio,
> >>
> >> Thanks for your quick answer.
> >>
> >> Such lines do appear after waiting a little, but the behavior is
> >> still the same.
> >>
> >> [ 2063.720211] o2net: connected to node xen02a (num 0) at
> >> 10.0.0.168:7777
> >>
> >> [ 1979.611076] o2net: accepted connection from node xen02b (num 1)
> >> at 10.0.0.102:7777
> >>
> >>
> >> xen02a:~# lsmod | egrep 'jbd|ocfs2|configfs'
> >> ocfs2                 395816  1
> >> ocfs2_dlmfs            23696  1
> >> ocfs2_stack_o2cb        9088  1
> >> ocfs2_dlm             197824  2 ocfs2_dlmfs,ocfs2_stack_o2cb
> >> ocfs2_nodemanager     208744  8 ocfs2,ocfs2_dlmfs,ocfs2_stack_o2cb,ocfs2_dlm
> >> ocfs2_stackglue        16432  2 ocfs2,ocfs2_stack_o2cb
> >> configfs               29736  2 ocfs2_nodemanager
> >> jbd                    54696  2 ocfs2,ext3
> >>
> >> xen02a:~# netstat -an | grep 7777
> >> tcp        0      0 10.0.0.168:7777         0.0.0.0:*               LISTEN
> >> tcp        0      0 10.0.0.168:7777         10.0.0.102:47547        ESTABLISHED
> >>
> >> xen02b:~# lsmod | egrep 'jbd|ocfs2|configfs'
> >> ocfs2                 395816  1
> >> ocfs2_dlmfs            23696  1
> >> ocfs2_stack_o2cb        9088  1
> >> ocfs2_dlm             197824  2 ocfs2_dlmfs,ocfs2_stack_o2cb
> >> ocfs2_nodemanager     208744  8 ocfs2,ocfs2_dlmfs,ocfs2_stack_o2cb,ocfs2_dlm
> >> ocfs2_stackglue        16432  2 ocfs2,ocfs2_stack_o2cb
> >> configfs               29736  2 ocfs2_nodemanager
> >> jbd                    54696  2 ocfs2,ext3
> >>
> >> xen02b:~# netstat -an | grep 7777
> >> tcp        0      0 10.0.0.102:7777         0.0.0.0:*               LISTEN
> >> tcp        0      0 10.0.0.102:47547        10.0.0.168:7777         ESTABLISHED
> >>
> >> There are no iptables entries on either node, as they are just
> >> test servers.
> >>
> >> xen02a:~# uname -a
> >> Linux xen02a 2.6.26-2-xen-amd64 #1 SMP Thu Sep 16 16:32:15 UTC 2010
> >> x86_64 GNU/Linux
> >>
> >> xen02b:~# uname -a
> >> Linux xen02b 2.6.26-2-xen-amd64 #1 SMP Thu Sep 16 16:32:15 UTC 2010
> >> x86_64 GNU/Linux
> >>
> >> xen02b:~# cat /etc/default/o2cb
> >> #
> >> # This is a configuration file for automatic startup of the O2CB
> >> # driver.  It is generated by running /etc/init.d/o2cb configure.
> >> # On Debian based systems the preferred method is running
> >> # 'dpkg-reconfigure ocfs2-tools'.
> >> #
> >>
> >> # O2CB_ENABLED: 'true' means to load the driver on boot.
> >> O2CB_ENABLED=true
> >>
> >> # O2CB_STACK: The name of the cluster stack backing O2CB.
> >> O2CB_STACK=o2cb
> >>
> >> # O2CB_BOOTCLUSTER: If not empty, the name of a cluster to start.
> >> O2CB_BOOTCLUSTER=ocfs2
> >>
> >> # O2CB_HEARTBEAT_THRESHOLD: Iterations before a node is considered dead.
> >> O2CB_HEARTBEAT_THRESHOLD=31
> >>
> >> # O2CB_IDLE_TIMEOUT_MS: Time in ms before a network connection is considered dead.
> >> O2CB_IDLE_TIMEOUT_MS=30000
> >>
> >> # O2CB_KEEPALIVE_DELAY_MS: Max time in ms before a keepalive packet is sent
> >> O2CB_KEEPALIVE_DELAY_MS=2000
> >>
> >> # O2CB_RECONNECT_DELAY_MS: Min time in ms between connection attempts
> >> O2CB_RECONNECT_DELAY_MS=2000
> >>
> >>
> >> xen02b:~# mount
> >> /dev/sda1 on / type ext3 (rw,errors=remount-ro)
> >> tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
> >> proc on /proc type proc (rw,noexec,nosuid,nodev)
> >> sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
> >> procbususb on /proc/bus/usb type usbfs (rw)
> >> udev on /dev type tmpfs (rw,mode=0755)
> >> tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
> >> devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
> >> configfs on /sys/kernel/config type configfs (rw)
> >> ocfs2_dlmfs on /dlm type ocfs2_dlmfs (rw)
> >> /dev/sdd1 on /shared type ocfs2 (rw,_netdev,heartbeat=local)
> >>
> >>
> >> regards,
> >> Manuel
> >>
> >> On 2010-11-04 15:03, Sérgio Surkamp wrote:
> >>> It seems that o2net (the network stack) is not running; otherwise
> >>> you would see its network messages in dmesg. Something like:
> >>>
> >>> xen02a kernel: o2net: connected to node xen02b (num 1) at
> >>> 10.0.0.102:7777
> >>>
> >>> Check your firewall and network configuration. Also check that
> >>> the [o2net] kernel thread is running and that TCP port 7777 is
> >>> listening on both nodes. If the thread is not running, check
> >>> that all the needed kernel modules are loaded:
> >>>
> >>> ocfs2
> >>> jbd
> >>> ocfs2_dlm
> >>> ocfs2_dlmfs
> >>> ocfs2_nodemanager
> >>> configfs
> >>>
> >>> Regards,
> >>> Sérgio
> >>>
> >>> On Thu, 04 Nov 2010 14:12:11 +0100
> >>> Manuel Bogner <manuel.bogner at geizhals.at> wrote:
> >>>
> >>>> Sorry for the repost, but I just saw that I mixed German and
> >>>> English. Here is the corrected version:
> >>>>
> >>>>
> >>>>
> >>>> Hi,
> >>>>
> >>>> I'm trying to create a cluster out of 2 nodes. Both systems
> >>>> share the same LUN via FC and see it as /dev/sdd.
> >>>>
> >>>> /dev/sdd has one partition
> >>>>
> >>>> Disk /dev/sdd: 21.4 GB, 21474836480 bytes
> >>>> 64 heads, 32 sectors/track, 20480 cylinders
> >>>> Units = cylinders of 2048 * 512 = 1048576 bytes
> >>>> Disk identifier: 0xc29cb93d
> >>>>
> >>>>    Device Boot      Start         End      Blocks   Id  System
> >>>> /dev/sdd1               1       20480    20971504   83  Linux
> >>>>
> >>>> which is formatted with
> >>>>
> >>>>   mkfs.ocfs2 -L ocfs2 /dev/sdd1
> >>>>
> >>>>
> >>>> Here is my /etc/ocfs2/cluster.conf
> >>>>
> >>>> node:
> >>>>     ip_port = 7777
> >>>>     ip_address = 10.0.0.168
> >>>>     number = 0
> >>>>     name = xen02a
> >>>>     cluster = ocfs2
> >>>>
> >>>> node:
> >>>>     ip_port = 7777
> >>>>     ip_address = 10.0.0.102
> >>>>     number = 1
> >>>>     name = xen02b
> >>>>     cluster = ocfs2
> >>>>
> >>>> cluster:
> >>>>     node_count = 2
> >>>>     name = ocfs2
> >>>>
> >>>>
> >>>> Everything seems to be fine:
> >>>>
> >>>> xen02a:~# /etc/init.d/o2cb status
> >>>> Driver for "configfs": Loaded
> >>>> Filesystem "configfs": Mounted
> >>>> Stack glue driver: Loaded
> >>>> Stack plugin "o2cb": Loaded
> >>>> Driver for "ocfs2_dlmfs": Loaded
> >>>> Filesystem "ocfs2_dlmfs": Mounted
> >>>> Checking O2CB cluster ocfs2: Online
> >>>> Heartbeat dead threshold = 31
> >>>>   Network idle timeout: 30000
> >>>>   Network keepalive delay: 2000
> >>>>   Network reconnect delay: 2000
> >>>> Checking O2CB heartbeat: Active
> >>>>
> >>>> And mounting the fs on each node works fine:
> >>>>
> >>>> /dev/sdd1 on /shared type ocfs2 (rw,_netdev,heartbeat=local)
> >>>>
> >>>> Both nodes can ping each other.
> >>>>
> >>>>
> >>>> xen02a:~# mounted.ocfs2 -d
> >>>> Device                FS     UUID                                  Label
> >>>> /dev/sdd1             ocfs2  55a9d0b0-050c-484f-9725-7788a3b9dde0  ocfs2
> >>>>
> >>>> xen02b:~# mounted.ocfs2 -d
> >>>> Device                FS     UUID                                  Label
> >>>> /dev/sdd1             ocfs2  55a9d0b0-050c-484f-9725-7788a3b9dde0  ocfs2
> >>>>
> >>>>
> >>>> Now the problem:
> >>>>
> >>>> I first mount the device on node1:
> >>>>
> >>>>  xen02a:~# mount -L ocfs2 /shared/
> >>>> => /dev/sdd1 on /shared type ocfs2 (rw,_netdev,heartbeat=local)
> >>>> without any errors.
> >>>>
> >>>> dmesg says:
> >>>>
> >>>> [   97.244054] ocfs2_dlm: Nodes in domain
> >>>> ("55A9D0B0050C484F97257788A3B9DDE0"): 0
> >>>> [   97.245869] kjournald starting.  Commit interval 5 seconds
> >>>> [   97.247045] ocfs2: Mounting device (8,49) on (node 0, slot 0)
> >>>> with ordered data mode.
> >>>>
> >>>> xen02a:~# mounted.ocfs2 -f
> >>>> Device                FS     Nodes
> >>>> /dev/sdd1             ocfs2  xen02a
> >>>>
> >>>> xen02a:~# echo "slotmap" | debugfs.ocfs2 -n /dev/sdd1
> >>>> 	Slot#   Node#
> >>>> 	    0       0
> >>>>
> >>>>
> >>>> Now I mount the device on the second node:
> >>>>
> >>>> xen02b:~# mount -L ocfs2 /shared/
> >>>> => /dev/sdd1 on /shared type ocfs2 (rw,_netdev,heartbeat=local)
> >>>>
> >>>> [  269.741168] OCFS2 1.5.0
> >>>> [  269.765171] ocfs2_dlm: Nodes in domain
> >>>> ("55A9D0B0050C484F97257788A3B9DDE0"): 1
> >>>> [  269.779620] kjournald starting.  Commit interval 5 seconds
> >>>> [  269.779620] ocfs2: Mounting device (8,49) on (node 1, slot 1)
> >>>> with ordered data mode.
> >>>> [  269.779620] (2953,0):ocfs2_replay_journal:1149 Recovering
> >>>> node 0 from slot 0 on device (8,49)
> >>>> [  270.950540] kjournald starting.  Commit interval 5 seconds
> >>>>
> >>>> xen02b:~# echo "slotmap" | debugfs.ocfs2 -n /dev/sdd1
> >>>> 	Slot#   Node#
> >>>> 	    1       1
> >>>>
> >>>> xen02b:~# mounted.ocfs2 -f
> >>>> Device                FS     Nodes
> >>>> /dev/sdd1             ocfs2  xen02b
> >>>>
> >>>>
> >>>> So the first mount seems to be gone and any changes on the fs on
> >>>> that node are not distributed.
> >>>>
> >>>> At the moment I have no idea what this could be. I hope someone
> >>>> can help me.
> >>>>
> >>>> regards,
> >>>> Manuel
> >>>>
> >>>> _______________________________________________
> >>>> Ocfs2-users mailing list
> >>>> Ocfs2-users at oss.oracle.com
> >>>> http://oss.oracle.com/mailman/listinfo/ocfs2-users


-- 
  .:''''':.
.:'        `     Sérgio Surkamp | Gerente de Rede
::    ........   sergio at gruposinternet.com.br
`:.        .:'
  `:,   ,.:'     *Grupos Internet S.A.*
    `: :'        R. Lauro Linhares, 2123 Torre B - Sala 201
     : :         Trindade - Florianópolis - SC
     :.'
     ::          +55 48 3234-4109
     :
     '           http://www.gruposinternet.com.br


