[Ocfs2-users] Slow on open()

Sunil Mushran sunil.mushran at oracle.com
Tue Feb 2 09:59:47 PST 2010


While it is recommended that the kernels be the same across the
cluster, there is no hard requirement that they be. The recommendation
is purely from an ease-of-management point of view.

If both nodes are seeing periodic I/O slowdowns at the same time,
then you should investigate the path from the switch down to the
storage array.
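
Something along these lines can help narrow that down (a rough sketch;
it assumes ssh access between the two nodes and that the shared LUN
shows up as sdg on both, so adjust names as needed). Run it while
reproducing the load and compare await/svctm on both nodes, and check
that all multipath paths are healthy:

  multipath -ll                                   # are all FC paths up?
  iostat -x 1 | grep --line-buffered sdg &
  ssh host2 'iostat -x 1' | grep --line-buffered sdg &
  wait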

Somsak Sriprayoonsakul wrote:
> Another update on the situation.
>
> Yesterday we had a chance to take the system out of service, so I ran
> a small experiment on it with iozone. The results are interesting.
>
> - The "lock-up" didn't just response to HTTP load. After first-mount, 
> if I ran "iozone -s 1M -r 1 -O" on the OCFS2-mounted path, the lock-up 
> will occur immediately. It will freeze for about a minute. After the 
> freeze, iozone will run quite fast after that. There are slight 
> lock-up (1-2 seconds) when I did "for i in `seq 1 50`; do iozone -s 1M 
> -r 1 -O; done.
>
> - During the lock-up, if I run "iostat -x 1 | grep sdg" (the SAN
> disk), it appears that %util is almost 100% and the service time
> increases to two or three times the normal amount. The I/O operations
> per second are only about 150-160 at the time, yet %util reaches
> almost 99%.
>
> Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
> sdg1              0.00     0.00 161.00  0.00  1288.00     0.00     8.00     0.96    5.99   5.97  96.10
>
>   This is very odd, since under normal operation (serving HTTP load)
> the iostat results are much better:
>
> Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
> sdg1              4.00   617.00 427.00 25.00 21496.00  5136.00    58.92     2.21    4.90   1.77  80.10
>
>   But I think this proves one thing: the system is not actually
> locked up, it is just busy doing I/O to the backend storage. Could
> this mean that there is something wrong with the storage (driver +
> SAN configuration)? Note that we are using an HP MSA2312fc.
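>
> (For what it's worth, the two iostat samples above are at least
> self-consistent: %util is roughly (r/s + w/s) x svctm, so
>
>     stall:  161 req/s x 5.97 ms/req ~= 960 ms busy per second (~96%)
>     normal: 452 req/s x 1.77 ms/req ~= 800 ms busy per second (~80%)
>
> i.e. during the stall each request takes roughly three times longer
> to service even though far fewer and smaller requests are being
> issued, which is consistent with the back end being the bottleneck.)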
>
> - This happens on both machines connected to the SAN. It happens
> almost every time if we run iozone on both machines at the same time.
> The lock-up then occurs on each machine in turn.
>
> - Another interesting point is that the length of the lock-up is
> usually exactly the same on both machines: approximately 145 seconds,
> counted by looking at the number of lines in the "iostat -x 1"
> output, starting when %util ramps up and finishing when it cools
> down.
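>
> (A slightly easier way to measure the stall length is to timestamp
> each matching iostat line; a rough sketch, assuming GNU awk for
> strftime:
>
>     iostat -x 1 | awk '/sdg1/ { print strftime("%H:%M:%S"), $0; fflush() }'
>
> The gap between the first and last high-%util timestamps then gives
> the duration directly.)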
>
> - I tried appending "pci=nomsi" to the kernel command line, following
> the bug report at http://bugzilla.kernel.org/show_bug.cgi?id=11646,
> but it didn't help at all.
>
> - I just found out that the FC HBA card we installed in the second
> machine is operating at a different speed than the first: the first
> runs at 4Gb and the second at 2Gb. Anyway, this is just a temporary
> card; the vendor will install a newer card (presumably the same
> model) soon. Note that both machines are the same model, Dell
> PowerEdge 2950, dual quad-core Intel Xeon 2.66GHz.
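>
> The negotiated link speed and port state can be double-checked on
> each node through the FC transport sysfs entries (a minimal check,
> assuming the qla2xxx HBAs register with the fc_host class as they
> normally do):
>
>     grep . /sys/class/fc_host/host*/speed /sys/class/fc_host/host*/port_state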
>
> Our kernel and driver information
>
> uname -a output
> Linux host1.mydomain.com 2.6.18-164.6.1.el5 #1 SMP Tue Nov 3 16:12:36 EST 2009 x86_64 x86_64 x86_64 GNU/Linux
>
> Linux host2.mydomain.com 2.6.18-164.11.1.el5 #1 SMP Wed Jan 20 07:32:21 EST 2010 x86_64 x86_64 x86_64 GNU/Linux
>
> (NOTE the difference in kernel versions; do we need to run exactly
> the same version on both machines?)
>
> OCFS2 versions
> ocfs2-tools-1.4.3-1.el5
> ocfs2console-1.4.3-1.el5
> ocfs2-2.6.18-164.6.1.el5-1.4.4-1.el5 <-- 1st machine
> ocfs2-2.6.18-164.11.1.el5-1.4.4-1.el5 <-- 2nd machine
>
> Both machines use the qla2xxx driver.
> filename:       /lib/modules/2.6.18-164.6.1.el5/kernel/drivers/scsi/qla2xxx/qla2xxx.ko
> version:        8.03.00.1.05.05-k
> license:        GPL
> description:    QLogic Fibre Channel HBA Driver
> author:         QLogic Corporation
> srcversion:     109451851724AE603A18126
>
>
> 2010/1/29 Somsak Sriprayoonsakul <somsaks at gmail.com>
>
>     An update on the situation.
>
>     It seems that the lockup situation since we brought in the new
>     server is not quite the same behavior as before.
>
>     Running "iozone -O" on the new machine, which mount OCFS2 without
>     serving the content yet (the load is very low), could leads to
>     OCFS2 lockup for 60-90 seconds. This happened very frequently
>     (running iozone 3 times and this happened once). On the other
>     hand, I run "iozone -O 50" times on the older server without
>     having this lockup. Sometime it slow (few seconds), but not this slow.
>
>     When the lockup occurs, both machines hang. Even a simple ls
>     just hangs. The funny thing is that sometimes the iozone command
>     is not actually running yet; it has only just been spawned when
>     the lockup occurs, so strace shows nothing suspicious. When it
>     hangs in the middle of iozone, the system calls that take a long
>     time vary among sync, read, write, and lseek.
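>
>     One way to see exactly where the time goes is to run iozone
>     under strace with per-syscall timing, or to attach to it once it
>     is stuck (a rough sketch; -f, -T, -tt and -p are standard strace
>     options):
>
>         strace -f -T -tt -o /tmp/iozone.strace iozone -s 1M -r 1 -O
>         strace -T -tt -p <pid-of-stuck-process>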
>
>     I tried simpler things, like repeatedly running "ls" or just
>     "echo hello world > test", but they did not cause the problem.
>     The command that almost certainly triggers this lockup is just
>     "iozone -O", and it must be run on the second server.
>
>
>     2010/1/28 Somsak Sriprayoonsakul <somsaks at gmail.com>
>
>         In our case the log files shouldn't be the problem, since
>         they are stored locally. But fragmentation might be the
>         cause. We serve a lot of small files (mostly HTML and JPEG)
>         out of OCFS2, and files come and go very frequently: old
>         topics are paged out to an archive directory outside OCFS2,
>         while new topics are created on OCFS2. Thanks for pointing
>         out the bug report.
>
>         Right now we are using 4 slots and planning to serve about 4
>         nodes in the near future. I think we shouldn't decrease the
>         number of slots right now.
>
>         Are there any other statistics that we should be observing?
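>
>         For instance, would it be worth capturing something like the
>         following during a stall (a rough sketch, assuming
>         debugfs.ocfs2 from ocfs2-tools and the debugfs filesystem
>         mounted; the device name is a placeholder)?
>
>             mount -t debugfs debugfs /sys/kernel/debug
>             debugfs.ocfs2 -R "stats" /dev/mapper/<ocfs2-device>
>             debugfs.ocfs2 -R "fs_locks" /dev/mapper/<ocfs2-device>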
>
>         2010/1/28 Brad Plant <bplant at iinet.net.au>
>
>             Hi Somsak,
>
>             I observed high loads and apache slowness when there was
>             insufficient contiguous free space due to fragmentation.
>             I believe it was because apache couldn't write its log
>             files efficiently. We had 2 apache nodes, and I found
>             that stopping apache on the problem node resolved the
>             problem until I deleted lots of unused files.
>
>             My symptoms don't seem to suggest a slow open() syscall
>             like your strace results are showing, but I certainly got
>             the high load and poor apache performance. It might be
>             worth checking out anyway. There is a bug report and we're
>             just waiting for the patch to get reviewed and made
>             publicly available.
>
>             http://oss.oracle.com/bugzilla/show_bug.cgi?id=1189
>
>             Cheers,
>
>             Brad
>
>
>
>             On Tue, 19 Jan 2010 16:12:07 +0700
>             Somsak Sriprayoonsakul <somsaks at gmail.com> wrote:
>
>             > Hello,
>             >
>             > We are using OCFS2 version 1.4.3 on CentOS 5, x86_64,
>             > with 8GB memory. The underlying storage is an HP
>             > 2312fc smart array equipped with 12 SAS 15K rpm disks,
>             > configured as RAID10 using 10 HDDs + 2 spares. The
>             > array has about 4GB of cache. Communication is 4Gbps
>             > FC, through an HP StorageWorks 8/8 Base e-port SAN
>             > Switch. Right now we only have this machine connected
>             > to the SAN through the switch, but we plan to add more
>             > machines to utilize this SAN system.
>             >
>             > Our application is apache version 1.3.41, mostly
>             > serving static HTML files plus a few PHP pages. Note
>             > that we had to downgrade to 1.3.41 due to an
>             > application requirement. Apache is configured with
>             > MaxClients 500.
>             >
>             > The OCFS2 storage is formatted with mkfs.ocfs2 without
>             > any special options. It runs directly on the
>             > multipathed SAN storage, without LVM or software RAID.
>             > We mount OCFS2 with noatime, commit=15, and
>             > data=writeback (as well as heartbeat=local). Our
>             > cluster.conf looks like this:
>             >
>             > cluster:
>             >     node_count = 1
>             >     name = mycluster
>             >
>             > node:
>             >     ip_port = 7777
>             >     ip_address = 203.123.123.123
>             >     number = 1
>             >     name = mycluster.mydomain.com
>             >     cluster = mycluster
>             >
>             > (NOTE: some details are omitted here, such as the
>             > hostname and IP address.)
>             >
>             > Periodically, we find that the file system is very
>             > slow. I think it happens once every few minutes. When
>             > the file system is slow, httpd CPU utilization goes
>             > much higher, to about 50% or above. I tried to debug
>             > this slowness by creating a small script that
>             > periodically runs
>             >
>             > strace -f dd if=/dev/zero of=/san/testfile bs=1k count=1
>             >
>             > and times dd. Usually dd finishes in under a second,
>             > but periodically it is much slower, taking about 30-60
>             > seconds. The strace output shows this:
>             >
>             >      0.000026 open("/san/testfile", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 1
>             >     76.418696 rt_sigaction(SIGUSR1, NULL, {SIG_DFL, [], 0}, 8) = 0
>             >
>             > So I presume this means the open system call is
>             > periodically very slow. I did about 5-10 tests, which
>             > yielded similar strace results (ranging from just 5-7
>             > seconds up to 80 seconds).
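>             >
>             > (Roughly, the periodic test is just a loop along these
>             > lines; a rough sketch, where the sleep interval and
>             > log path are arbitrary:
>             >
>             >     while true; do
>             >         date
>             >         time strace -f dd if=/dev/zero of=/san/testfile \
>             >             bs=1k count=1 2>> /tmp/ocfs2-dd-strace.log
>             >         sleep 60
>             >     done
>             >
>             > The timing output goes to the script's stderr while
>             > the strace output goes to the log.)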
>             >
>             > So my question is: what could be the cause of this
>             > slowness? How can I debug this more deeply? In which
>             > areas should we optimize the file system?
>             >
>             > We are in the process of purchasing and adding more
>             > web servers to the system, using a reverse proxy to
>             > load-balance between the two servers. We just want to
>             > make sure that this will not make the situation worse.
>



