[Ocfs2-users] Some questions about ocfs2

Sunil Mushran Sunil.Mushran at oracle.com
Wed Apr 25 13:56:14 PDT 2007


If your average stat times are > 500 usecs on ocfs2, I am assuming that
is all cold cache. In our tests, the cold cache stat times are
close to that, but they match ext3 once the inodes are cached.
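One rough way to confirm that is to time the same stat pass twice, cold
and then warm. A minimal sketch, assuming the images live under the
placeholder path /mnt/ocfs2/images (dropping caches needs root):

# echo 3 > /proc/sys/vm/drop_caches                                 # flush page/dentry/inode caches
# strace -c -e trace=lstat64 ls -lR /mnt/ocfs2/images > /dev/null   # cold-cache usecs/call
# strace -c -e trace=lstat64 ls -lR /mnt/ocfs2/images > /dev/null   # warm-cache usecs/call

If the warm run's lstat64 usecs/call drops to ext3-like numbers, the
overhead really is cold-cache only.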

You could look into reducing the block size to 1K. (Well, it will
require a mkfs, so test out the times in your case before deciding
whether to take the plunge; a sketch follows below.)
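A minimal mkfs sketch, assuming a scratch device /dev/sdXX, a 16K
cluster size, and two node slots (all placeholders; mkfs destroys any
existing data, so only run this against a test volume):

# mkfs.ocfs2 -b 1K -C 16K -N 2 -L img-test /dev/sdXX
# debugfs.ocfs2 -R "stats -h" /dev/sdXX | grep "Size Bits"     # expect Block Size Bits: 10

The idea is that each inode then occupies a 1K block instead of a 4K
one, so a cold inode read pulls in a quarter of the data.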

Things affecting cold cache times are:
1. ocfs2 1.2 does a double inode read during the first read. This is
addressed in mainline.
2. dlm lock mastery is expensive (pure overhead compared to a local fs);
a quick check follows below.
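Because lock mastery costs message round trips over the interconnect,
the network RTT sets a floor on cold-cache stat times that a local fs
never pays. A quick sanity check, with 192.168.100.2 standing in for
the peer node's private interconnect IP:

# ping -c 100 -q 192.168.100.2     # average round-trip time on the GigE link

On a healthy GigE link the average RTT is typically a few hundred usecs
at most; anything much higher will show up directly in the lstat64
numbers.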

Ulf Zimmermann wrote:
>> -----Original Message-----
>> From: Sunil Mushran [mailto:Sunil.Mushran at oracle.com]
>> Sent: 04/25/2007 12:16
>> To: Ulf Zimmermann
>> Cc: ocfs2-users at oss.oracle.com
>> Subject: Re: [Ocfs2-users] Some questions about ocfs2
>>
>> What's the blocksize?
>
> Block Size Bits: 12   Cluster Size Bits: 14
>
>> Ulf Zimmermann wrote:
>>>> -----Original Message-----
>>>> From: Sunil Mushran [mailto:Sunil.Mushran at oracle.com]
>>>> Sent: 04/25/2007 10:31
>>>> To: Ulf Zimmermann
>>>> Cc: ocfs2-users at oss.oracle.com
>>>> Subject: Re: [Ocfs2-users] Some questions about ocfs2
>>>>
>>>> # debugfs.ocfs2 -R "stats -h" /dev/sdy2 | grep "Cluster Size"
>>>>         Block Size Bits: 12   Cluster Size Bits: 17
>>>> 12 = 4K
>>>> 17 = 128K
>>>>
>>>> Have you tried stracing the process?
>>>> # strace -tt -T  -o /tmp/strace.out  ...
>>>>
>>> Yes, strace shows most time is spent in lstat64 (> 99%); average
>>> execution time on ext3 is < 60 usecs/call, while on the ocfs2
>>> volume it is > 500 usecs/call.
>>>
>>>> Ulf Zimmermann wrote:
>>>>> Is there a way to see how a file system was formatted, i.e. the
>>>>> block size and cluster size? I currently have a 2TB file system,
>>>>> of which about 840GB are in use by around 9 million image files.
>>>>> Average size of these images is 60-100KB. Currently our production
>>>>> servers still have separate file systems on ext3 and we are doing
>>>>> nightly rsync from there to this ocfs2 volume. This currently takes
>>>>> ~6 hours, which seems a tad slow. The system spends most of its
>>>>> time writing files which have changed on the production servers,
>>>>> with high I/O wait.
>>>>>
>>>>> The SAN this ocfs2 volume is on is pretty much idle; I only see up
>>>>> to about 20MB/sec of traffic, and the two nodes which have this
>>>>> volume mounted have a private GigE interconnect set up for
>>>>> cluster.conf.
>>>>>
>>>>> Any tips on how to debug where this slowness comes from? Or even a
>>>>> suggestion to use another cluster file system for a scenario like
>>>>> this.
>
> Regards, Ulf.
>
> ---------------------------------------------------------------------
> ATC-Onlane Inc., T: 650-532-6382, F: 650-532-6441
> 4600 Bohannon Drive, Suite 100, Menlo Park, CA 94025
> ---------------------------------------------------------------------


