[Ocfs2-users] Slow OCFS2 on very high-end hardware

Marek Królikowski admin at wset.edu.pl
Sun Dec 4 01:56:12 PST 2011


-----Original Message-----
From: Marek Królikowski
Sent: Sunday, December 04, 2011 10:48 AM
To: ocfs2-users at oss.oracle.com
Subject: Re: [Ocfs2-users] Slow OCFS2 on very high-end hardware
On Sat, Dec 03, 2011 at 08:32:30PM +0100, Marek Królikowski wrote:
>> Hello
>> Today I created a cluster with OCFS2.
>> I named the servers MAIL1 and MAIL2.
>> Both connect via HBA cards, with 2 x 4 Gbit/s links each, to EMC storage
>> configured as FC RAID10.
>> Both also connect to the same Cisco switch over 1 Gbit/s links.
>> The hardware is awesome, but OCFS2 works very slowly.
>> I use Gentoo Linux with kernel 3.0.6 and ocfs2-tools-1.6.4. This will be
>> a postfix/imap/pop3 cluster with Maildir support, so there will be many,
>> many directories and small files.
>> I linked /home to my OCFS2 volume and ran a few tests, but it is very
>> slow...
>> When I write any file on server MAIL1 and then try to check a mailbox
>> from MAIL2, it is amazingly slow...
>I've gotta ask, what is "amazingly slow" to you?  A cluster
>filesystem accessing the same files from two places is necessarily
>slower than local access.  But if it is slow enough that you notice it
>by hand, it's probably something in the configuration.
>Did you select the 'mail' filesystem type when creating the
>filesystem?  This probably shouldn't affect your simple test, but it
>will absolutely help as your system grows.
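For reference, the type is selected when the filesystem is created; a minimal sketch, assuming the device and label from the mount lines below. Note that reformatting destroys any existing data on the volume.

```shell
# Sketch only: -T mail tunes the journal and allocation heuristics for
# many small files; -N 2 reserves two node slots for the two servers.
# Device path and label are taken from this thread, not verified here.
mkfs.ocfs2 -T mail -L EMC -N 2 /dev/mapper/EMC
```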
When I copy any file to or from MAIL1 and then enter the home directory
(using mc) on MAIL2, where there are 7000 users, I have to wait 30+
seconds... Normally, when I am not copying or writing on OCFS2, I wait
about 3 seconds.
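A repeatable way to put a number on that wait, as a hedged sketch; the path assumes the /mnt/EMC mount point shown below.

```shell
# Time an unsorted listing of the 7000-entry directory while a copy is
# running on the other node, then again when the cluster is idle, and
# compare the two.
time ls -f /mnt/EMC/home > /dev/null
```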

>> MAIL1 ~ # cat /proc/mounts
>> /dev/mapper/EMC /mnt/EMC ocfs2 
>> rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,usrquota,coherency=full,user_xattr,acl 
>> 0 0
>> MAIL2 ~ # cat /proc/mounts
>> /dev/mapper/EMC /mnt/EMC ocfs2 
>> rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,usrquota,coherency=full,user_xattr,acl 
>> 0 0
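One option in those mount lines is worth a second look, as an observation rather than a confirmed fix: coherency=full makes O_DIRECT writes take cluster-wide locks. If nothing on the volume depends on direct-I/O coherency, coherency=buffered can reduce lock traffic. The option takes effect at mount time, so a clean umount/mount is the safe path:

```shell
# Sketch: mount with buffered coherency; applications using O_DIRECT
# must then coordinate their own writes across the two nodes.
umount /mnt/EMC
mount -o _netdev,coherency=buffered /dev/mapper/EMC /mnt/EMC
```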

>Your hardware looks just fine, though I have to ask why you have
>device-mapper in there.  Is it just for multipath support, or are you
>doing other things?

Yes, that is multipath. As I said, there are 2 HBA cards connected to 2
different FC switches, and every FC switch has 2 dedicated links to the
storage, so I see 4 links. Do you think I could have done something wrong
in the multipath configuration? That is possible, because this is the
first time I have set up multipath on Linux.
Without multipath I see information about 4 disks in dmesg:
sd 5:0:0:0: [sdb] 3377096448 512-byte logical blocks: (1.72 TB/1.57 TiB)
sd 5:0:0:0: [sdb] Write Protect is off
sd 5:0:0:0: [sdb] Mode Sense: 87 00 00 08
sd 5:0:1:0: Attached scsi generic sg3 type 0
sd 5:0:0:0: [sdb] Write cache: disabled, read cache: enabled, doesn't
support DPO or FUA
sd 5:0:1:0: [sdc] 3377096448 512-byte logical blocks: (1.72 TB/1.57 TiB)
sd 5:0:1:0: [sdc] Write Protect is off
sd 5:0:1:0: [sdc] Mode Sense: 87 00 00 08
sdb: unknown partition table
sd 5:0:1:0: [sdc] Write cache: disabled, read cache: enabled, doesn't
support DPO or FUA
sd 5:0:0:0: [sdb] Attached SCSI disk
sdc: unknown partition table
sd 5:0:1:0: [sdc] Attached SCSI disk
sd 6:0:0:0: Attached scsi generic sg4 type 0
sd 6:0:0:0: [sdd] 3377096448 512-byte logical blocks: (1.72 TB/1.57 TiB)
sd 6:0:0:0: [sdd] Write Protect is off
sd 6:0:0:0: [sdd] Mode Sense: 87 00 00 08
sd 6:0:0:0: [sdd] Write cache: disabled, read cache: enabled, doesn't
support DPO or FUA
sd 6:0:1:0: Attached scsi generic sg5 type 0
sd 6:0:1:0: [sde] 3377096448 512-byte logical blocks: (1.72 TB/1.57 TiB)
sdd: unknown partition table
sd 6:0:1:0: [sde] Write Protect is off
sd 6:0:1:0: [sde] Mode Sense: 87 00 00 08
sd 6:0:1:0: [sde] Write cache: disabled, read cache: enabled, doesn't
support DPO or FUA
sd 6:0:0:0: [sdd] Attached SCSI disk
sde: unknown partition table
sd 6:0:1:0: [sde] Attached SCSI disk
sd 5:0:1:0: emc: detected Clariion CX4-240, flags 0
sd 5:0:1:0: emc: ALUA failover mode detected
sd 5:0:1:0: emc: connected to SP B Port 0 (owned, default SP B)
sd 6:0:1:0: emc: detected Clariion CX4-240, flags 0
sd 6:0:1:0: emc: ALUA failover mode detected
sd 6:0:1:0: emc: connected to SP B Port 1 (owned, default SP B)
sd 6:0:0:0: emc: detected Clariion CX4-240, flags 0
sd 6:0:0:0: emc: ALUA failover mode detected
sd 6:0:0:0: emc: connected to SP A Port 1 (bound, default SP B)
sd 5:0:1:0: emc: ALUA failover mode detected
sd 5:0:1:0: emc: at SP B Port 0 (owned, default SP B)
sd 6:0:1:0: emc: ALUA failover mode detected
sd 6:0:1:0: emc: at SP B Port 1 (owned, default SP B)

The multipath configuration looks like this:
MAIL1 ~ # cat /etc/multipath.conf
defaults {
        udev_dir                /dev
        polling_interval        15
        selector                "round-robin 0"
        path_grouping_policy    group_by_prio
        failback                5
        path_checker            tur
        prio_callout            "/sbin/mpath_prio_emc /dev/%n"
        rr_min_io               100
        rr_weight               uniform
        no_path_retry           queue
        user_friendly_names     yes
}

blacklist {
        devnode cciss
        devnode fd
        devnode hd
        devnode md
        devnode sr
        devnode scd
        devnode st
        devnode ram
        devnode raw
        devnode loop
        devnode sda
        devnode sdb
}

multipaths {
        multipath {
                wwid  360060160ac652400a81e5f17a201e111
                alias EMC
        }
}

devices {
        device {
                vendor  "IBM     "
                product "1815      FAStT "
        }
}

Thanks


Edit:
I didn't give you one important piece of information from multipath:
MAIL1 ~ # multipath -l
EMC (360060160ac652400a81e5f17a201e111) dm-0 DGC,RAID 10
size=1.6T features='1 queue_if_no_path' hwhandler='1 emc' wp=rw
|-+- policy='round-robin 0' prio=-1 status=active
| |- 5:0:1:0 sdc 8:32 active undef running
| `- 6:0:1:0 sde 8:64 active undef running
`-+- policy='round-robin 0' prio=-1 status=enabled
  `- 6:0:0:0 sdd 8:48 active undef running
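A side note on that output, flagged as something to verify rather than a diagnosis: prio=-1 on every path suggests the priority callout is not returning values. Newer multipath-tools releases dropped prio_callout in favour of the prio keyword (e.g. prio emc, or prio alua, given that the array reports ALUA mode above). With a working prioritizer, the check below should show real priorities:

```shell
# The owned SP B paths should report a higher priority than the
# non-owned SP A path, instead of prio=-1 across the board.
multipath -ll EMC
```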



