[Ocfs2-users] Slow on open()

Somsak Sriprayoonsakul somsaks at gmail.com
Tue Jan 19 19:53:04 PST 2010


No, it's not cciss. This is our modprobe.conf:

alias eth0 bnx2
alias eth1 bnx2
alias scsi_hostadapter megaraid_sas
alias scsi_hostadapter1 ata_piix
alias scsi_hostadapter2 qla2xxx

Could you suggest how we could isolate this problem?
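
One way to narrow this down, as a minimal sketch rather than a definitive tool, is to time open(), write() and close() separately instead of timing dd as a whole. The Python below reuses the open flags seen in the strace output. The function names, the 1 KB payload, and the 1-second threshold are assumptions chosen for illustration only.

```python
import os
import time


def time_create_write(path, payload=b"\0" * 1024):
    """Time open(), write() and close() separately for one small write.

    Uses the flags strace reported for dd: O_WRONLY|O_CREAT|O_TRUNC, mode 0666.
    Returns a dict mapping each step to its elapsed time in seconds.
    """
    timings = {}

    start = time.monotonic()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o666)
    timings["open"] = time.monotonic() - start

    try:
        start = time.monotonic()
        os.write(fd, payload)
        timings["write"] = time.monotonic() - start
    finally:
        start = time.monotonic()
        os.close(fd)
        timings["close"] = time.monotonic() - start

    return timings


def sample(path, rounds=10, interval=1.0, threshold=1.0):
    """Repeat the timed write; return (round, timings) for every round in
    which any single step took longer than `threshold` seconds."""
    slow = []
    for i in range(rounds):
        timings = time_create_write(path)
        if max(timings.values()) > threshold:
            slow.append((i, timings))
        time.sleep(interval)
    return slow
```

Calling `sample("/san/testfile", rounds=600)` on the OCFS2 mount and again on a local disk would show whether the stalls are confined to the open step and to the SAN-backed file system.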

2010/1/20 Sunil Mushran <sunil.mushran at oracle.com>

> Is that using the cciss driver? I have heard of similar sporadic performance
> issues with the cciss driver. I doubt this is an ocfs2 issue. I would
> recommend you ping some support people who can look at your io setup more
> closely.
>
> Somsak Sriprayoonsakul wrote:
>
>> Hello,
>>
>> We are using OCFS2 version 1.4.3 on CentOS5, x86_64 with 8GB memory. The
>> underlying storage is an HP 2312fc smart array equipped with 12 15K rpm SAS
>> disks, configured as RAID10 using 10 HDDs + 2 spares. The array has about
>> 4GB cache. Communication is 4Gbps FC, through an HP StorageWorks 8/8 Base
>> e-port SAN Switch. Right now only this machine is connected to the SAN
>> through the switch, but we plan to add more machines to utilize this SAN
>> system.
>>
>> Our application is apache version 1.3.41, mostly serving static HTML files
>> plus a few PHP pages. Note that we had to downgrade to 1.3.41 due to our
>> application requirements. Apache is configured with MaxClients set to 500.
>>
>> The OCFS2 volume was formatted with mkfs.ocfs2 without any special options.
>> It runs directly on the multipathed SAN storage, without LVM or software
>> RAID. We mount OCFS2 with noatime, commit=15, and data=writeback (as well as
>> heartbeat=local). Our cluster.conf looks like this:
>>
>> cluster:
>>    node_count = 1
>>    name = mycluster
>>
>> node:
>>    ip_port = 7777
>>    ip_address = 203.123.123.123
>>    number = 1
>>    name = mycluster.mydomain.com
>>    cluster = mycluster
>>
>> (NOTE: Some details, such as the hostname and IP address, have been
>> obscured here.)
>>
>> Periodically, we find that the file system becomes very slow. I think it
>> happens once every few minutes. When the file system is slow, the CPU
>> utilization of the httpd processes goes much higher, to about 50% or above.
>> I tried to debug this slowness with a small script that periodically runs
>>
>> strace -f dd if=/dev/zero of=/san/testfile bs=1k count=1
>>
>> and times dd. Usually dd finishes in under a second, but periodically it
>> is much slower, taking about 30-60 seconds. The strace output shows this:
>>
>>     0.000026 open("/san/testfile", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 1
>>    76.418696 rt_sigaction(SIGUSR1, NULL, {SIG_DFL, [], 0}, 8) = 0
>>
>> So I presume this means the open system call is periodically very slow.
>> I ran about 5-10 tests, which yielded similar strace results (with delays
>> ranging from just 5-7 seconds to 80 seconds).
>>
>> So my question is, what could be the cause of this slowness? How could I
>> debug this further? Which parts of the file system setup should we tune?
>>
>> We are in the process of purchasing and adding more web servers to the
>> system, using a reverse proxy to load balance between two servers. We just
>> want to make sure that this will not make the situation worse.
>>
>