[Ocfs2-users] mount ocfs2 after both nodes self fence

Andrew (Anything) anything at starstrike.net
Tue Mar 24 05:46:17 PDT 2009


Hi All.

 

Currently running ocfs2 in a dual node setup over dual primary DRBD with a
gigabit backend for a webserver environment.

Read performance is as expected, but write performance is absolutely terrible
(i.e. 22 file modifications per second).

The gigabit crossover achieves its full capacity easily, but has an avg 77ms
latency.

So I'm looking to change to InfiniBand with some hardware from eBay, and
hopefully that'll solve the slowness. Do you think it will solve my bad
write performance issues?

 

My next problem is that if too many applications are queued to write to the
partition, ocfs2 restarts the system (obviously because it hasn't
communicated with the other node in quite a while; the timeout is currently
configured for 60 seconds).
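
For reference, a minimal sketch of the arithmetic behind that figure, assuming the 60 seconds refers to the o2cb disk heartbeat timeout (it could instead be the network idle timeout, which is a separate setting). The OCFS2 FAQ gives the relation timeout = (O2CB_HEARTBEAT_THRESHOLD - 1) * 2 seconds. The two function names below are hypothetical helpers, not part of ocfs2-tools.

```shell
#!/bin/sh
# Hypothetical helpers (not part of ocfs2-tools).
# Assumption: the 60 seconds above is the o2cb disk heartbeat timeout,
# which relates to O2CB_HEARTBEAT_THRESHOLD as:
#   timeout_seconds = (threshold - 1) * 2

# Convert a desired timeout in seconds to a threshold value.
timeout_to_threshold() {
    echo $(( $1 / 2 + 1 ))
}

# Convert a threshold value back to the timeout in seconds.
threshold_to_timeout() {
    echo $(( ($1 - 1) * 2 ))
}

timeout_to_threshold 60   # prints 31
threshold_to_timeout 31   # prints 60
```

Under that assumption, a 60 second timeout corresponds to O2CB_HEARTBEAT_THRESHOLD=31 in the o2cb configuration file.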

And because I'm only running two nodes, the other one goes and kills itself too.
(I'm in the process of setting up a third node via iSCSI, but haven't got
there yet.)

When the two come back up and DRBD has finished syncing, I go to manually
re-mount on one of the servers.

But when I do, it restarts itself again, and again, and again, etc.

All I see in messages/dmesg is something like this, then the server goes and
resets itself.

 (3756,3):ocfs2_find_slot:502 slot 1 is already allocated to this node!

 (3756,3):ocfs2_check_volume:1753 File system was not unmounted cleanly, recovering volume.

The slotmap has both nodes in it, even though they aren't mounted.

# echo "slotmap" | debugfs.ocfs2 -n /dev/drbd0

                 Slot#   Node#
                    0       1
                    1       0
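
A minimal sketch of how the slotmap output above could be turned into one line per occupied slot, e.g. for a quick check before attempting a mount. The function name parse_slotmap is a hypothetical helper, and the sample input is copied from the output above rather than read from a live device.

```shell
#!/bin/sh
# Hypothetical helper: read slotmap output on stdin and print one
# "slot=<n> node=<n>" line per occupied slot. The header line is
# skipped because its fields are not purely numeric.
parse_slotmap() {
    awk 'NF == 2 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ {
        print "slot=" $1 " node=" $2
    }'
}

# Sample input copied from the output above. On a real system the
# debugfs.ocfs2 command shown above would be piped into parse_slotmap.
printf '%s\n' \
    '                 Slot#   Node#' \
    '                    0       1' \
    '                    1       0' | parse_slotmap
```

With the sample input this lists slot 0 as held by node 1 and slot 1 as held by node 0, matching the stale entries described above.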

Currently I'm fsck'ing the partition, which replayed the journals of both
nodes (contrary to the error message you see above).

Then after a couple of failures (each time resetting one of the servers) I
end up trying to mount with localflocks.

It seems that half the time localflocks works and mounts the partition. I
can then unmount and remount normally, and it's happy sailing.

But the other half the time the system resets itself again.

 

I'm not sure how I'm supposed to remount the partition properly in this
scenario. Can someone help me?

 

Btw:

Linux 2.6.28

drbd 8.2.7

elevator=deadline

I hope I included enough relevant information.

 

 

Andrew.
