[Ocfs2-users] Extremely poor write performance, but read appears to be okay

Daniel McDonald wasade at gmail.com
Mon Dec 13 13:13:35 PST 2010


Sunil,

I'm back in close proximity to my machines. We are still experiencing
extremely odd behavior. I've reduced the cluster to a single node
and recently rebooted that machine as well. I'm currently observing
highly variable write speeds to the volume. This volume is the only
volume on the disk array and this machine is the only machine
accessing the disk array right now. This is also the only machine
powered on that is connected to the fibre SAN. I'm seeing write speeds
from 33kB/s to 50MB/s. When the rate was really low, it drove the
system to become unresponsive. I've previously been able to obtain
consistent write speeds > 150MB/s.
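
For what it's worth, a way to rule out page-cache effects on these write
numbers (the target file name here is just an example) would be something
like the following, using direct I/O or an explicit flush:

$ # direct I/O, bypassing the page cache; /home/ddtest is just an example path
$ dd if=/dev/zero of=/home/ddtest bs=1M count=1000 oflag=direct
$ # buffered, but file data is flushed to disk before dd exits
$ dd if=/dev/zero of=/home/ddtest bs=1M count=1000 conv=fsync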

The initial email said that we were using 1.4.4; however, that was the
tools release. We are using ocfs2 1.4.7. I've also attached the stats
output for the device. I'm not sure how to proceed from here.

$ yum info ocfs2-2.6.18-164.el5.x86_64
Loaded plugins: security
Installed Packages
Name       : ocfs2-2.6.18-164.el5
Arch       : x86_64
Version    : 1.4.7
Release    : 1.el5
Size       : 1.3 M
Repo       : installed
Summary    : The Oracle Cluster Filesystem 2.
URL        : http://oss.oracle.com/projects/ocfs2/
License    : GPL
Description: OCFS2 is the Oracle Cluster Filesystem 2.  This package is compiled for
           : the default kernel.
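
(To double-check that the loaded kernel module matches the installed
package, something along these lines should do it; modinfo reports the
module's version string and rpm lists the installed ocfs2-related packages.)

$ modinfo ocfs2 | grep -i version
$ rpm -qa | grep -i ocfs2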

$ debugfs.ocfs2 -R "stats" /dev/disk/by-id/scsi-3600a0b80005bc2ab000016a04bbb0f88-part1
	Revision: 0.90
	Mount Count: 0   Max Mount Count: 20
	State: 0   Errors: 0
	Check Interval: 0   Last Check: Wed Apr  7 15:42:39 2010
	Creator OS: 0
	Feature Compat: 3 backup-super strict-journal-super
	Feature Incompat: 80 sparse inline-data
	Tunefs Incomplete: 0
	Feature RO compat: 1 unwritten
	Root Blknum: 5   System Dir Blknum: 6
	First Cluster Group Blknum: 3
	Block Size Bits: 12   Cluster Size Bits: 12
	Max Node Slots: 20
	Extended Attributes Inline Size: 0
	Label: home
	UUID: 84B7C6421A6C4280AB87F569035C5368
	Hash: 0 (0x0)
	Cluster stack: classic o2cb
	Inode: 2   Mode: 00   Generation: 1165331413 (0x45758bd5)
	FS Generation: 1165331413 (0x45758bd5)
	CRC32: 00000000   ECC: 0000
	Type: Unknown   Attr: 0x0   Flags: Valid System Superblock
	Dynamic Features: (0x0)
	User: 0 (root)   Group: 0 (root)   Size: 0
	Links: 0   Clusters: 1464357111
	ctime: 0x4bbcfc4f -- Wed Apr  7 15:42:39 2010
	atime: 0x0 -- Wed Dec 31 17:00:00 1969
	mtime: 0x4bbcfc4f -- Wed Apr  7 15:42:39 2010
	dtime: 0x0 -- Wed Dec 31 17:00:00 1969
	ctime_nsec: 0x00000000 -- 0
	atime_nsec: 0x00000000 -- 0
	mtime_nsec: 0x00000000 -- 0
	Last Extblk: 0   Orphan Slot: 0
	Sub Alloc Slot: Global   Sub Alloc Bit: 65535
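
As a sanity check on the numbers above: Cluster Size Bits of 12 means
4096-byte clusters, so 1464357111 clusters * 4 KB per cluster =
5857428444 KB (about 5.5 TiB), which matches the 1K-blocks figure df
reports for /home further down the thread. Quick shell arithmetic, if
anyone wants to verify:

$ echo $((1464357111 * 4)) KB
5857428444 KB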

Thanks,
-Daniel

On Fri, Dec 10, 2010 at 4:56 AM, Daniel McDonald <wasade at gmail.com> wrote:
> Thank you. I'll investigate as soon as I can get decent access.
> Rebooting nodes seemed to have temporarily solved the issue; however,
> things are not completely back to normal yet.
> -Daniel
>
> On Thu, Dec 9, 2010 at 8:49 AM, Sunil Mushran <sunil.mushran at oracle.com> wrote:
>> http://oss.oracle.com/git/?p=ocfs2-1.4.git;a=commitdiff;h=1f667766cb67ed05b4d706aa82e8ad0b12eaae8b
>>
>> That specific error has been addressed in the upcoming 1.4.8.
>>
>> Attach the logs and all other info to a bugzilla.
>>
>> On 12/08/2010 05:07 PM, Daniel McDonald wrote:
>>>
>>> Hello,
>>>
>>> I'm writing from the other side of the world from where my systems
>>> are, so details are coming in slowly. We have a 6TB OCFS2 volume across
>>> 20 or so nodes, all running OEL 5.4 with ocfs2-1.4.4. The system has
>>> worked fairly well for the last 6-8 months. Something has happened over
>>> the last few weeks that has driven write performance nearly to a halt.
>>> I'm not sure how to proceed, and a very poor internet connection is
>>> hindering me further. I've verified that the disk array is in good
>>> health. I'm seeing a few odd kernel log messages; an example follows
>>> below. I have not been able to verify all nodes due to limited
>>> time and slow internet in my present location. Any assistance would be
>>> greatly appreciated. I should be able to provide log files in about 12
>>> hours. At this moment, loadavgs on each node are 0.00 to 0.09.
>>>
>>> Here is a test write and associated iostat -xm 5 output. Previously I
>>> was obtaining > 90MB/s:
>>>
>>> $ dd if=/dev/zero of=/home/testdump count=1000 bs=1024k
>>>
>>> ...and associated iostat output:
>>>
>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>            0.10    0.00    0.43   12.25    0.00   87.22
>>>
>>> Device:         rrqm/s   wrqm/s   r/s   w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>> sda               0.00     1.80  0.00  8.40     0.00     0.04     9.71     0.01    0.64   0.05   0.04
>>> sda1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>> sda2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>> sda3              0.00     1.80  0.00  8.40     0.00     0.04     9.71     0.01    0.64   0.05   0.04
>>> sdc               0.00     0.00 115.80  0.60     0.46     0.00     8.04     0.99    8.48   8.47  98.54
>>> sdc1              0.00     0.00 115.80  0.60     0.46     0.00     8.04     0.99    8.48   8.47  98.54
>>>
>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>            0.07    0.00    0.55   12.25    0.00   87.13
>>>
>>> Device:         rrqm/s   wrqm/s   r/s   w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>> sda               0.00     0.40  0.00  0.80     0.00     0.00    12.00     0.00    2.00   1.25   0.10
>>> sda1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>> sda2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>> sda3              0.00     0.40  0.00  0.80     0.00     0.00    12.00     0.00    2.00   1.25   0.10
>>> sdc               0.00     0.00 112.80  0.40     0.44     0.00     8.03     0.98    8.68   8.69  98.38
>>> sdc1              0.00     0.00 112.80  0.40     0.44     0.00     8.03     0.98    8.68   8.69  98.38
>>>
>>> Here is a test read and associated iostat output. I'm intentionally
>>> reading from a different test file so as to avoid caching effects:
>>>
>>> $ dd if=/home/someothertestdump of=/dev/null bs=1024k
>>>
>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>            0.10    0.00    3.60   10.85    0.00   85.45
>>>
>>> Device:         rrqm/s   wrqm/s   r/s   w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>> sda               0.00     3.79  0.00  1.40     0.00     0.02    29.71     0.00    1.29   0.43   0.06
>>> sda1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>> sda2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>> sda3              0.00     3.79  0.00  1.40     0.00     0.02    29.71     0.00    1.29   0.43   0.06
>>> sdc               7.98     0.20 813.17  1.00   102.50     0.00   257.84     1.92    2.34   1.19  96.71
>>> sdc1              7.98     0.20 813.17  1.00   102.50     0.00   257.84     1.92    2.34   1.19  96.67
>>>
>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>            0.07    0.00    3.67   10.22    0.00   86.03
>>>
>>> Device:         rrqm/s   wrqm/s   r/s   w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>>> sda               0.00     0.20  0.00  0.40     0.00     0.00    12.00     0.00    0.50   0.50   0.02
>>> sda1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>> sda2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>> sda3              0.00     0.20  0.00  0.40     0.00     0.00    12.00     0.00    0.50   0.50   0.02
>>> sdc               6.60     0.20 829.00  1.00   104.28     0.00   257.32     1.90    2.31   1.17  97.28
>>> sdc1              6.60     0.20 829.00  1.00   104.28     0.00   257.32     1.90    2.31   1.17  97.28
>>>
>>> I'm seeing a few weird kernel messages, such as:
>>>
>>> Dec  7 14:07:50 growler kernel: (dlm_wq,4793,4):dlm_deref_lockres_worker:2344 ERROR: 84B7C6421A6C4280AB87F569035C5368:O0000000000000016296ce900000000: node 14 trying to drop ref but it is already dropped!
>>> Dec  7 14:07:50 growler kernel: lockres: O0000000000000016296ce900000000, owner=0, state=0
>>> Dec  7 14:07:50 growler kernel:   last used: 0, refcnt: 6, on purge list: no
>>> Dec  7 14:07:50 growler kernel:   on dirty list: no, on reco list: no, migrating pending: no
>>> Dec  7 14:07:50 growler kernel:   inflight locks: 0, asts reserved: 0
>>> Dec  7 14:07:50 growler kernel:   refmap nodes: [ 21 ], inflight=0
>>> Dec  7 14:07:50 growler kernel:   granted queue:
>>> Dec  7 14:07:50 growler kernel:     type=3, conv=-1, node=21, cookie=21:213370, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
>>> Dec  7 14:07:50 growler kernel:   converting queue:
>>> Dec  7 14:07:50 growler kernel:   blocked queue:
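>>>
>>> If it helps, I can check how widespread this is once I have better
>>> access; a simple grep across the nodes' syslogs for that dlm error
>>> should be enough, roughly like this (hostnames are placeholders):
>>>
>>> $ # node01/node02 stand in for the real node hostnames
>>> $ for h in node01 node02; do ssh $h 'grep -c dlm_deref_lockres_worker /var/log/messages'; done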
>>>
>>>
>>> Here is df output:
>>>
>>> root at growler:~$ df
>>> Filesystem           1K-blocks      Used Available Use% Mounted on
>>> /dev/sda3            245695888  29469416 203544360  13% /
>>> /dev/sda1               101086     15133     80734  16% /boot
>>> tmpfs                 33005580         0  33005580   0% /dev/shm
>>> /dev/sdc1            5857428444 5234400436 623028008  90% /home
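>>>
>>> /home is sitting at 90% full; I don't know if that matters here, but a
>>> nearly full volume gives the allocator fewer contiguous free clusters
>>> to work with, so it seemed worth noting. Checking space and inode use
>>> on just that mount:
>>>
>>> $ df -h /home
>>> $ df -i /home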
>>>
>>>  Thanks
>>> -Daniel
>>>
>>> _______________________________________________
>>> Ocfs2-users mailing list
>>> Ocfs2-users at oss.oracle.com
>>> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>>
>>
>


