[Ocfs2-users] ocfs2 slows down with more servers in a cluster

David Schüler dschueler at resisto-it.de
Tue Jan 20 01:45:50 PST 2009


Hello to everybody on the list,

I have a problem with the number of file operations per second on an ocfs2 volume. I don't mean read or write throughput, just the number of file operations per second the filesystem can handle.

Let's start at the beginning, here's what I have and what I'm doing:
I have a big fibre channel storage device with around 6TB of space, RAID6 on SATA-II drives. I have 8 servers in my cluster, connected to the storage via two fibre switches. Switches and HBAs are QLogic 4Gbit/s. The storage is an Infortrend EONStor. The servers are connected through Gbit Ethernet switches. I don't think the hardware is what causes the problem.

I'm using Ubuntu Server 8.04.1 LTS on all the machines. I create one 6TB partition with parted using a gpt disklabel and format the partition with ocfs2. On all servers the o2cb service is running, configured with the same heartbeat, network and other timeout values, and the same cluster.conf is on every server as well.
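
Roughly, the setup on each node looks like the sketch below; the device name, node name and IP address here are placeholders, not my exact values:

    parted /dev/sdb mklabel gpt
    parted /dev/sdb mkpart primary 0 6000GB
    mkfs.ocfs2 -L daten /dev/sdb1

    # /etc/ocfs2/cluster.conf (one node: stanza per server)
    cluster:
            node_count = 8
            name = ocfs2

    node:
            ip_port = 7777
            ip_address = 192.168.0.1
            number = 0
            name = upload1
            cluster = ocfs2

    /etc/init.d/o2cb configure   # same heartbeat/idle/keepalive timeouts on every node
    mount -t ocfs2 /dev/sdb1 /daten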

I mount the ocfs2 volume on one machine and everything works fine. I did some bonnie++ testing and got a write speed of 70MB/s and a read speed of 140MB/s.
I mount the volume on all servers and everything still works: I can do concurrent reads and writes, no errors, no fencing. Nevertheless, everything feels slow. I did an rsync to bring 1.4TB of data onto the volume. With the volume mounted on one server this takes around a day and a half. With the volume mounted on all the servers I stopped it after two days with no more than 200GB synced. This made me wonder, so I started some more tests.
bonnie++ with the volume mounted on all servers still reported a write speed of 70MB/s and a read speed of 130MB/s, so raw throughput does not seem to be the problem. I then took a closer look at the bonnie++ results. The file operations per second are tested as well, and I saw that with the ocfs2 volume mounted on one server it reaches around 2,500 operations/s, while with the volume mounted on all the servers it drops to 16 operations/s. To do the maths: with a write speed of 70MB/s I should be able to write 70 files of 1MB in one second, but I never get there because the metadata side handles no more than 16 file creates per second. OK, that would still be 16MB/s. But I don't have 1MB files, I have files of around 100kB each, so this slows my high-speed fibre storage down to about 1.6MB/s (or even less).
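
To make the back-of-the-envelope maths explicit (all numbers taken from the tests above):

    70 MB/s  /  1 MB per file  =  70 creates/s that the raw bandwidth would allow
    16 creates/s  *  1 MB      =  16 MB/s effective when metadata is the limit
    16 creates/s  *  100 kB    ~  1.6 MB/s with my real ~100kB files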

I did some more testing and found that the more servers are in the cluster, the slower everything gets. After reading nearly everything about ocfs2 I could find, I did additional tests: I reduced the volume size to 500GB, no longer using a gpt disklabel. I tried different cluster and block sizes. I reduced the number of node slots. I used -T mail and -T datafiles. I tried all these options in nearly every combination; nothing helped. I even switched to Ubuntu Server 8.10 because of the newer ocfs2 kernel module, but nothing changed.
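
The mkfs.ocfs2 variations were along these lines; the exact values differed from run to run, so treat these as examples rather than the precise commands:

    mkfs.ocfs2 -b 4K -C 4K  -N 2 -T mail      -L daten /dev/sdb1
    mkfs.ocfs2 -b 4K -C 32K -N 4              -L daten /dev/sdb1
    mkfs.ocfs2 -b 4K -C 1M  -N 8 -T datafiles -L daten /dev/sdb1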

I think I must be doing something wrong, because I have never read about such a problem before, and if it were caused by a bug I think more people would be reporting it.

Here are my bonnie++ tests:
Volume mounted on one server:
root@upload1:/daten# bonnie++ -d /daten -n 1 -u 0 -g 0
Using uid:0, gid:0.
Writing with putc()...done
Writing intelligently...done
Rewriting...done
Reading with getc()...done
Reading intelligently...done
start 'em...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version 1.03b       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
upload1          8G 40434  76 74376  27 42018  18 41361  76 137827  25 538.8   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                  1 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
upload1,8G,40434,76,74376,27,42018,18,41361,76,137827,25,538.8,1,1,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++

Volume mounted on all 8 servers:
root@upload1:/daten# bonnie++ -d /daten -n 1 -u 0 -g 0
Using uid:0, gid:0.
Writing with putc()...done
Writing intelligently...done
Rewriting...done
Reading with getc()...done
Reading intelligently...done
start 'em...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version 1.03b       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
upload1          8G 39445  74 75634  29 42245  20 42198  76 128275  22 573.1   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                  1    16   1 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
upload1,8G,39445,74,75634,29,42245,20,42198,76,128275,22,573.1,0,1,16,1,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++

It's this Create/sec number that my problem with slow speeds seems to come from.

In some more tests I found that Create/sec goes up to 2,500 with only one server mounting the volume (the -n option has to be raised from 1 to 10 to see it).
I formatted the volume with ext3 and got 70,000 creates/s, so it seems that even with only one server mounting the volume something is not right. With tunefs.ocfs2 I switched the ocfs2 volume to 'local' and got Create/sec up to 3,500, but this still seems very slow to me.
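
Concretely, those two tests were run with something like this (the device name is just an example):

    bonnie++ -d /daten -n 10 -u 0 -g 0   # -n raised from 1 to 10 for the create test
    tunefs.ocfs2 -M local /dev/sdb1      # switch the volume to local (non-clustered) mode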

With this in mind I started testing on one machine with ocfs2 on a local drive, but I can't get more than 2,500 creates/s, no matter which cluster and block size I use. I can't test this volume with more than one server because I can't export the local drive to the other machines, but I'm sure it would get slower with more servers mounting it.

I did one last test yesterday: I installed CentOS 5.2 with the latest ocfs2 module and tools from Oracle for EL5. I still can't get more than 2,500 creates/s with one server mounting the volume. Next I'll do some testing with more than one CentOS server, but perhaps someone has a good idea for me or a hint at what I'm doing wrong. I'm sure ocfs2 can perform much better than it does in my tests.

Oh, I forgot: I even tested on different hardware, a dual Xeon machine with 4GB RAM as well as a Core 2 Duo machine with 2GB RAM; no change. I also used both 32-bit and 64-bit versions of Ubuntu Server; again, no change.

I'm sorry for the long post, but I'm new to the list and I think every little piece of information could be helpful.

Kind regards,
David




