[Ocfs2-users] Extremely poor write performance, but read appears to be okay
Daniel McDonald
wasade at gmail.com
Wed Dec 8 17:07:19 PST 2010
Hello,
I'm writing from the other side of the world from where my systems are,
so details are coming in slowly. We have a 6TB OCFS2 volume shared across
about 20 nodes, all running OEL5.4 with ocfs2-1.4.4. The system has worked
fairly well for the last 6-8 months. Something has happened over the
last few weeks which has driven write performance nearly to a halt.
I'm not sure how to proceed, and very poor internet is hindering my
abilities further. I've verified that the disk array is in good
health. I'm seeing a few odd kernel log messages; an example is
included below. I have not been able to check all nodes due to limited
time and slow internet in my present location. Any assistance would be
greatly appreciated. I should be able to provide log files in about 12
hours. At this moment, loadavgs on each node are 0.00 to 0.09.
Here is a test write and associated iostat -xm 5 output. Previously I
was obtaining > 90MB/s:
$ dd if=/dev/zero of=/home/testdump count=1000 bs=1024k
...and associated iostat output:
avg-cpu:    %user   %nice %system %iowait  %steal   %idle
             0.10    0.00    0.43   12.25    0.00   87.22

Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s avgrq-sz avgqu-sz   await   svctm   %util
sda         0.00    1.80    0.00    8.40    0.00    0.04     9.71     0.01    0.64    0.05    0.04
sda1        0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00    0.00    0.00    0.00
sda2        0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00    0.00    0.00    0.00
sda3        0.00    1.80    0.00    8.40    0.00    0.04     9.71     0.01    0.64    0.05    0.04
sdc         0.00    0.00  115.80    0.60    0.46    0.00     8.04     0.99    8.48    8.47   98.54
sdc1        0.00    0.00  115.80    0.60    0.46    0.00     8.04     0.99    8.48    8.47   98.54

avg-cpu:    %user   %nice %system %iowait  %steal   %idle
             0.07    0.00    0.55   12.25    0.00   87.13

Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s avgrq-sz avgqu-sz   await   svctm   %util
sda         0.00    0.40    0.00    0.80    0.00    0.00    12.00     0.00    2.00    1.25    0.10
sda1        0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00    0.00    0.00    0.00
sda2        0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00    0.00    0.00    0.00
sda3        0.00    0.40    0.00    0.80    0.00    0.00    12.00     0.00    2.00    1.25    0.10
sdc         0.00    0.00  112.80    0.40    0.44    0.00     8.03     0.98    8.68    8.69   98.38
sdc1        0.00    0.00  112.80    0.40    0.44    0.00     8.03     0.98    8.68    8.69   98.38
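
One caveat on the write test: without a sync option, dd may return
before the data has reached the disk, so its reported rate can be
misleading. Below is a minimal sketch of a variant that flushes before
exiting. It assumes the dd on these nodes accepts conv=fsync (GNU
coreutils does; I have not confirmed the version installed here), and
write_test is just a hypothetical helper name.

```shell
# Hypothetical helper: write COUNT MiB of zeros to TARGET and force the
# data to disk before dd exits, so the rate dd reports includes the flush.
# Assumes a dd that supports conv=fsync (GNU coreutils does).
write_test() {
    target=$1      # file to create on the volume under test
    count_mib=$2   # number of 1 MiB blocks to write
    dd if=/dev/zero of="$target" bs=1024k count="$count_mib" conv=fsync 2>&1
}

# Example (same size and path as the test above):
#   write_test /home/testdump 1000
```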
Here is a test read and associated iostat output. I'm intentionally
reading from a different test file so as to avoid caching effects:
$ dd if=/home/someothertestdump of=/dev/null bs=1024k
avg-cpu:    %user   %nice %system %iowait  %steal   %idle
             0.10    0.00    3.60   10.85    0.00   85.45

Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s avgrq-sz avgqu-sz   await   svctm   %util
sda         0.00    3.79    0.00    1.40    0.00    0.02    29.71     0.00    1.29    0.43    0.06
sda1        0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00    0.00    0.00    0.00
sda2        0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00    0.00    0.00    0.00
sda3        0.00    3.79    0.00    1.40    0.00    0.02    29.71     0.00    1.29    0.43    0.06
sdc         7.98    0.20  813.17    1.00  102.50    0.00   257.84     1.92    2.34    1.19   96.71
sdc1        7.98    0.20  813.17    1.00  102.50    0.00   257.84     1.92    2.34    1.19   96.67

avg-cpu:    %user   %nice %system %iowait  %steal   %idle
             0.07    0.00    3.67   10.22    0.00   86.03

Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s avgrq-sz avgqu-sz   await   svctm   %util
sda         0.00    0.20    0.00    0.40    0.00    0.00    12.00     0.00    0.50    0.50    0.02
sda1        0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00    0.00    0.00    0.00
sda2        0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00    0.00    0.00    0.00
sda3        0.00    0.20    0.00    0.40    0.00    0.00    12.00     0.00    0.50    0.50    0.02
sdc         6.60    0.20  829.00    1.00  104.28    0.00   257.32     1.90    2.31    1.17   97.28
sdc1        6.60    0.20  829.00    1.00  104.28    0.00   257.32     1.90    2.31    1.17   97.28
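
As a rough cross-check of the sdc numbers in the two tests, the sketch
below converts avgrq-sz into a per-request size and recomputes the
throughput. It assumes avgrq-sz is reported in 512-byte sectors and
that the MB/s columns count 1024*1024 bytes; req_kib and mb_per_s are
hypothetical helper names. By this arithmetic, sdc was serving roughly
4 KiB requests during the write test and roughly 129 KiB requests
during the read test, at about 97-98 %util in both cases.

```shell
# Average request size in KiB, given avgrq-sz in 512-byte sectors.
req_kib() {
    awk -v s="$1" 'BEGIN { printf "%.1f\n", s * 512 / 1024 }'
}

# Throughput in MB/s (1024*1024 bytes), given total requests per second
# (r/s + w/s) and avgrq-sz in 512-byte sectors.
mb_per_s() {
    awk -v r="$1" -v s="$2" 'BEGIN { printf "%.2f\n", r * s * 512 / 1048576 }'
}

req_kib 8.04             # write test, sdc: prints 4.0
req_kib 257.84           # read test, sdc:  prints 128.9
mb_per_s 116.40 8.04     # write test, 115.80 + 0.60 req/s: prints 0.46
mb_per_s 814.17 257.84   # read test, 813.17 + 1.00 req/s:  prints 102.50
```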
I'm seeing a few weird kernel messages, such as:
Dec 7 14:07:50 growler kernel: (dlm_wq,4793,4):dlm_deref_lockres_worker:2344 ERROR: 84B7C6421A6C4280AB87F569035C5368:O0000000000000016296ce900000000: node 14 trying to drop ref but it is already dropped!
Dec 7 14:07:50 growler kernel: lockres: O0000000000000016296ce900000000, owner=0, state=0
Dec 7 14:07:50 growler kernel: last used: 0, refcnt: 6, on purge list: no
Dec 7 14:07:50 growler kernel: on dirty list: no, on reco list: no, migrating pending: no
Dec 7 14:07:50 growler kernel: inflight locks: 0, asts reserved: 0
Dec 7 14:07:50 growler kernel: refmap nodes: [ 21 ], inflight=0
Dec 7 14:07:50 growler kernel: granted queue:
Dec 7 14:07:50 growler kernel: type=3, conv=-1, node=21, cookie=21:213370, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
Dec 7 14:07:50 growler kernel: converting queue:
Dec 7 14:07:50 growler kernel: blocked queue:
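
Since I have not yet checked every node, one way to survey them would
be to count these errors per log file. This is a minimal sketch: the
helper name count_dlm_errors is hypothetical, and the log path in the
usage comment is an assumption about where syslog writes kernel
messages on these nodes.

```shell
# Hypothetical helper: count log lines that report the
# dlm_deref_lockres_worker error shown above. Prints the count;
# grep -c prints 0 (and returns non-zero) when nothing matches.
count_dlm_errors() {
    grep -c 'dlm_deref_lockres_worker.*ERROR' "$1"
}

# Example (log path is an assumption):
#   count_dlm_errors /var/log/messages
```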
Here is df output:
root@growler:~$ df
Filesystem      1K-blocks       Used  Available Use% Mounted on
/dev/sda3       245695888   29469416  203544360  13% /
/dev/sda1          101086      15133      80734  16% /boot
tmpfs            33005580          0   33005580   0% /dev/shm
/dev/sdc1      5857428444 5234400436  623028008  90% /home
Thanks
-Daniel