[Ocfs2-users] 10 Node OCFS2 Cluster - Performance
Sunil Mushran
sunil.mushran at oracle.com
Tue Sep 15 09:53:37 PDT 2009
All clusters are running release tests. So not at the moment.
But you can see if your hardware is limiting you.
# time dd if=/dev/sdX1 of=/dev/null bs=1M count=1000 skip=2000
Run this on one node, then two nodes concurrently, 5 nodes, 10 nodes.
The idea is to see whether read performance drops off when multiple
nodes are hitting the iscsi io stack.

Do remember to clear the caches between runs:
# echo 3 > /proc/sys/vm/drop_caches
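A minimal sketch of that read test as a reusable helper; the device path and the per-node skip offsets are placeholders for your own setup:

```shell
#!/bin/sh
# Sketch of the suggested read-scaling test. /dev/sdX1 is a placeholder
# for the shared iscsi device; give each node a different skip offset so
# they read disjoint regions of the disk.
read_test() {
    dev=$1      # shared block device
    skip_mb=$2  # per-node offset, in MB
    # Drop the page/dentry/inode caches so we measure the device, not RAM
    # (needs root; harmless to skip when pointing at a plain file).
    echo 3 > /proc/sys/vm/drop_caches 2>/dev/null || true
    dd if="$dev" of=/dev/null bs=1M count=1000 skip="$skip_mb" 2>&1
}
```

For example, `time read_test /dev/sdX1 2000` on node 1, `time read_test /dev/sdX1 4000` on node 2, and so on. If the aggregate read rate collapses as nodes are added, the iscsi path rather than ocfs2 is the limit.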
Sunil
Laurence Mayer wrote:
> Hi Sunil
> I am running iostat on only one of the nodes, so the results you see
> are from a single node only.
> However, I am running this concurrently on the 10 nodes, for a total
> of 2Gig written, so yes, on this node it took 8 secs to write 205Megs.
>
> My latest results (using sync after the dd) show that when running on
> all 10 nodes concurrently it takes 37 secs to write the 10 x 205Meg
> files (2Gig).
> Here are the results from ALL the nodes:
> run.sh.e7212.1:204800000 bytes (205 MB) copied, 17.9657 s, 11.4 MB/s
> run.sh.e7212.10:204800000 bytes (205 MB) copied, 30.1489 s, 6.8 MB/s
> run.sh.e7212.2:204800000 bytes (205 MB) copied, 16.4605 s, 12.4 MB/s
> run.sh.e7212.3:204800000 bytes (205 MB) copied, 18.1461 s, 11.3 MB/s
> run.sh.e7212.4:204800000 bytes (205 MB) copied, 20.9716 s, 9.8 MB/s
> run.sh.e7212.5:204800000 bytes (205 MB) copied, 22.6265 s, 9.1 MB/s
> run.sh.e7212.6:204800000 bytes (205 MB) copied, 12.9318 s, 15.8 MB/s
> run.sh.e7212.7:204800000 bytes (205 MB) copied, 15.1739 s, 13.5 MB/s
> run.sh.e7212.8:204800000 bytes (205 MB) copied, 13.8953 s, 14.7 MB/s
> run.sh.e7212.9:204800000 bytes (205 MB) copied, 29.5445 s, 6.9 MB/s
>
> real 0m37.920s
> user 0m0.000s
> sys 0m0.030s
>
> (This averages 11.17MB/sec per node, which seems very low.)
>
> compared to 23.5secs when writing 2Gig from a single node.
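As a sanity check on these numbers, the comparable figure is aggregate throughput (total bytes over wall-clock time), not the mean of the per-node dd rates; a quick awk computation over the two runs quoted here:

```shell
# Aggregate throughput of the two runs above: ten 205MB files finishing
# in 37.92s wall clock, versus one node writing the same 2Gig in 23.495s.
awk 'BEGIN {
    total_mb = 10 * 204.8
    printf "10 nodes: %.1f MB/s aggregate\n", total_mb / 37.92
    printf "1 node:   %.1f MB/s\n",           total_mb / 23.495
}'
```

So the cluster as a whole moves roughly 54 MB/s against roughly 87 MB/s from one node, a real drop but far from a 10x collapse.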
>
> root at n2:# time (dd if=/dev/zero of=txt bs=2048000 count=1000; sync)
> 1000+0 records in
> 1000+0 records out
> 2048000000 bytes (2.0 GB) copied, 16.1369 s, 127 MB/s
>
> real 0m23.495s
> user 0m0.000s
> sys 0m15.180s
>
>
> Sunil, do you have any way to run the same test (10 x 200Megs)
> concurrently on 10 or more nodes to compare results?
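One way such a test is commonly scripted, assuming passwordless ssh between the nodes; the node names n1..n10 and the output directory are placeholders:

```shell
#!/bin/sh
# Hypothetical driver for the concurrent write test: start the same dd on
# every node in parallel and wait for the slowest one to finish. Node
# names n1..n10 are placeholders; COUNT is the MB written per node.
COUNT=${COUNT:-200}

run_all() {
    outdir=$1
    for i in $(seq 1 10); do
        # each node writes its own file, so there is no write contention
        # on a single inode
        ssh "n$i" "dd if=/dev/zero of=$outdir/txt.n$i bs=1M count=$COUNT && sync" &
    done
    wait    # returns only after every background dd+sync has exited
}
```

Running `time run_all /cfs1/bench` from one box then gives the cumulative wall-clock figure for the whole cluster.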
>
> Thanks again
>
> Laurence
>
>
> Sunil Mushran wrote:
>> Always cc ocfs2-users.
>>
>> Strange. The ocfs2 numbers look incomplete. It shows only 200MB written.
>> You said it was taking 16 secs. Yet the iostat numbers are for 8 secs
>> only.
>>
>> The xfs numbers look complete. Shows 90+ MB/s.
>>
>> On my iscsi setup (netapp backend, gige, single-cpu node with
>> 512M RAM), I get 85MB/s.
>>
>> # time (dd if=/dev/zero of=/mnt/boq7 count=2000 bs=1M ; sync ;)
>> sync
>> 2000+0 records in
>> 2000+0 records out
>> 2097152000 bytes (2.1 GB) copied, 24.4168 seconds, 85.9 MB/s
>>
>> real 0m24.515s
>> user 0m0.035s
>> sys 0m14.967s
>>
>> This is with data=writeback.
>>
>> The 2.2 secs is probably because of delayed allocation. Since your
>> box has
>> enough memory, xfs can cache all the writes and return to the user. Its
>> writeback then flushes the data in the background. The iostat/vmstat
>> numbers should show similar writeback numbers.
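That is why the suggested timing wraps both commands: dd alone measures how fast the page cache absorbs the data, while adding the sync pulls the background writeback into the measured interval. A small helper along those lines, with the target path as a placeholder:

```shell
#!/bin/sh
# timed_write: a buffered write followed by sync, so timing the whole
# call captures the flush to disk, not just the page cache filling up.
# With delayed allocation (xfs), dd alone can return in ~2s while the
# real io happens afterwards in writeback.
timed_write() {
    target=$1   # placeholder path, e.g. /cfs1/txt or an xfs mount point
    mb=$2       # megabytes to write
    dd if=/dev/zero of="$target" bs=1M count="$mb" 2>/dev/null && sync
}

# Example: time timed_write /cfs1/txt 2000
```

Timed this way on both the ocfs2 and the xfs mounts, the two filesystems should produce comparable, device-limited numbers.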
>>
>> Sunil
>>
>> Laurence Mayer wrote:
>>>
>>> iostat from cfs volume
>>>
>>> avg-cpu: %user %nice %system %iowait %steal %idle
>>> 0.00 0.00 1.77 2.28 0.00 95.95
>>>
>>> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s
>>> avgrq-sz avgqu-sz await svctm %util
>>> sdc 0.00 4.00 2.00 4.00 16.00 64.00
>>> 13.33 0.12 15.00 15.00 9.00
>>>
>>> avg-cpu: %user %nice %system %iowait %steal %idle
>>> 0.00 0.00 6.90 7.14 0.00 85.96
>>>
>>> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s
>>> avgrq-sz avgqu-sz await svctm %util
>>> sdc 0.00 16.00 9.00 40.00 75.00 441.00
>>> 10.53 0.43 9.39 6.73 33.00
>>>
>>> avg-cpu: %user %nice %system %iowait %steal %idle
>>> 0.00 0.00 7.67 7.18 0.00 85.15
>>>
>>> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s
>>> avgrq-sz avgqu-sz await svctm %util
>>> sdc 0.00 20.00 11.00 47.00 88.00 536.00
>>> 10.76 0.36 6.21 4.48 26.00
>>>
>>> avg-cpu: %user %nice %system %iowait %steal %idle
>>> 0.00 0.00 5.65 10.07 0.00 84.28
>>>
>>> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s
>>> avgrq-sz avgqu-sz await svctm %util
>>> sdc 0.00 16.00 9.00 37.00 75.00 417.00
>>> 10.70 0.55 11.96 8.48 39.00
>>>
>>> avg-cpu: %user %nice %system %iowait %steal %idle
>>> 0.25 0.00 12.69 31.22 0.00 55.84
>>>
>>> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s
>>> avgrq-sz avgqu-sz await svctm %util
>>> sdc 0.00 40324.00 2.00 181.00 16.00 174648.00
>>> 954.45 94.58 364.86 4.81 88.00
>>>
>>> avg-cpu: %user %nice %system %iowait %steal %idle
>>> 0.00 0.00 13.35 14.14 0.00 72.51
>>>
>>> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s
>>> avgrq-sz avgqu-sz await svctm %util
>>> sdc 0.00 9281.00 1.00 228.00 11.00 224441.00
>>> 980.14 100.93 559.17 4.37 100.00
>>>
>>> avg-cpu: %user %nice %system %iowait %steal %idle
>>> 0.00 0.00 0.25 0.50 0.00 99.25
>>>
>>> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s
>>> avgrq-sz avgqu-sz await svctm %util
>>> sdc 0.00 0.00 0.00 3.00 0.00 1040.00
>>> 346.67 0.03 240.00 6.67 2.00
>>>
>>> avg-cpu: %user %nice %system %iowait %steal %idle
>>> 0.00 0.00 0.00 0.00 0.00 100.00
>>>
>>> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s
>>> avgrq-sz avgqu-sz await svctm %util
>>> sdc 0.00 0.00 1.00 1.00 11.00 1.00
>>> 6.00 0.04 20.00 20.00 4.00
>>>
>>> vmstat from cfs volume:
>>> procs -----------memory---------- ---swap-- -----io---- -system--
>>> ----cpu----
>>> r b swpd free buff cache si so bi bo in cs us
>>> sy id wa
>>> 0 0 0 447656 279416 15254408 0 0 0 0 39 350
>>> 0 0 100 0
>>> 0 0 0 447656 279416 15254408 0 0 5 21 61 358
>>> 0 0 100 0
>>> 0 0 0 447656 279416 15254408 0 0 0 0 49 369
>>> 0 0 100 0
>>> 0 0 0 447656 279416 15254408 0 0 6 0 28 318
>>> 0 0 100 0
>>> 0 0 0 447656 279416 15254408 0 0 0 0 26 321
>>> 0 0 100 0
>>> 0 0 0 447656 279416 15254408 0 0 5 1 45 339
>>> 0 0 100 0
>>> 0 0 0 447656 279416 15254412 0 0 0 0 8 283
>>> 0 0 100 0
>>> 0 1 0 439472 279424 15262604 0 0 14 80 93 379
>>> 0 1 90 9
>>> 0 0 0 439472 279424 15262604 0 0 0 4 43 338
>>> 0 0 97 2
>>> 0 0 0 382312 279456 15319964 0 0 37 209 208 562
>>> 0 7 85 8
>>> 0 0 0 324524 279500 15377292 0 0 44 264 250 647
>>> 0 7 86 7
>>> 0 0 0 266864 279532 15434636 0 0 38 208 213 548
>>> 0 7 83 10
>>> 0 3 0 250072 279544 15450584 0 0 44 124832 13558
>>> 2038 0 11 62 27
>>> 0 1 0 250948 279564 15450584 0 0 5 75341 19596
>>> 2735 0 13 71 16
>>> 0 0 0 252808 279564 15450548 0 0 0 52 2777 849
>>> 0 2 95 3
>>> 0 0 0 252808 279564 15450548 0 0 6 0 21 310
>>> 0 0 100 0
>>> 0 0 0 252808 279564 15450548 0 0 0 0 15 298
>>> 0 0 100 0
>>> 0 0 0 253012 279564 15450548 0 0 5 1 29 310
>>> 0 0 100 0
>>> 0 0 0 253048 279564 15450552 0 0 0 0 19 290
>>> 0 0 100 0
>>> 0 0 0 253048 279564 15450552 0 0 6 0 26 305
>>> 0 0 100 0
>>> 1 0 0 253172 279564 15450552 0 0 0 60 28 326
>>> 0 0 100 0
>>>
>>>
>>> xfs volume:
>>> iostat
>>> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s
>>> avgrq-sz avgqu-sz await svctm %util
>>> sdd 0.00 0.00 4.00 0.00 40.00 0.00
>>> 10.00 0.05 12.00 12.00 4.80
>>> avg-cpu: %user %nice %system %iowait %steal %idle
>>> 0.00 0.00 14.98 0.25 0.00 84.77
>>> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s
>>> avgrq-sz avgqu-sz await svctm %util
>>> sdd 0.00 0.00 3.00 5.00 24.00 3088.00
>>> 389.00 6.54 44.00 17.00 13.60
>>> avg-cpu: %user %nice %system %iowait %steal %idle
>>> 0.00 0.00 10.67 21.86 0.00 67.47
>>> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s
>>> avgrq-sz avgqu-sz await svctm %util
>>> sdd 0.00 1.00 0.00 221.00 0.00 202936.00
>>> 918.26 110.51 398.39 4.52 100.00
>>> avg-cpu: %user %nice %system %iowait %steal %idle
>>> 0.00 0.00 4.92 21.84 0.00 73.23
>>> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s
>>> avgrq-sz avgqu-sz await svctm %util
>>> sdd 0.00 2.00 0.00 232.00 0.00 209152.00
>>> 901.52 110.67 493.50 4.31 100.00
>>> avg-cpu: %user %nice %system %iowait %steal %idle
>>> 0.00 0.00 3.67 22.78 0.00 73.54
>>> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s
>>> avgrq-sz avgqu-sz await svctm %util
>>> sdd 0.00 1.00 0.00 215.00 0.00 185717.00
>>> 863.80 111.37 501.67 4.65 100.00
>>> avg-cpu: %user %nice %system %iowait %steal %idle
>>> 0.12 0.00 6.24 12.61 0.00 81.02
>>> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s
>>> avgrq-sz avgqu-sz await svctm %util
>>> sdd 0.00 1.00 0.00 200.00 0.00 178456.00
>>> 892.28 80.01 541.82 4.88 97.60
>>> avg-cpu: %user %nice %system %iowait %steal %idle
>>> 0.12 0.00 4.61 8.34 0.00 86.92
>>> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s
>>> avgrq-sz avgqu-sz await svctm %util
>>> sdd 0.00 0.00 0.00 179.00 0.00 183296.00
>>> 1024.00 134.56 470.61 5.21 93.20
>>> avg-cpu: %user %nice %system %iowait %steal %idle
>>> 0.00 0.00 4.25 9.96 0.00 85.79
>>> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s
>>> avgrq-sz avgqu-sz await svctm %util
>>> sdd 0.00 0.00 0.00 201.00 0.00 205824.00
>>> 1024.00 142.86 703.92 4.98 100.00
>>>
>>>
>>>
>>> vmstat
>>> procs -----------memory---------- ---swap-- -----io---- -system--
>>> ----cpu----
>>> r b swpd free buff cache si so bi bo in cs us
>>> sy id wa
>>> 1 0 45396 214592 6332 31771312 0 0 668 908 3 6
>>> 3 2 92 3
>>> 0 0 45396 214460 6332 31771336 0 0 0 0 14 4874
>>> 0 0 100 0
>>> 2 0 45396 161032 6324 31822524 0 0 20 0 42 6074
>>> 0 13 87 0
>>> 5 1 45396 166380 6324 31820072 0 0 12 77948 8166 6416
>>> 0 16 77 7
>>> 1 2 45396 163176 6324 31824580 0 0 28 102920 24190
>>> 6660 0 6 73 21
>>> 0 2 45396 163096 6332 31824580 0 0 0 102743 22576
>>> 6700 0 5 72 23
>>> 0 2 45396 163076 6332 31824580 0 0 0 90400 21831
>>> 6500 0 4 76 21
>>> 0 1 45396 163012 6332 31824580 0 0 0 114732 19686
>>> 5894 0 7 83 10
>>> 0 1 45396 162972 6332 31824580 0 0 0 98304 24882
>>> 6314 0 4 87 8
>>> 0 1 45396 163064 6332 31824580 0 0 0 98304 24118
>>> 6285 0 4 84 12
>>> 0 1 45396 163096 6340 31824576 0 0 0 114720 24800
>>> 6166 0 4 87 9
>>> 0 1 45396 162964 6340 31824584 0 0 0 98304 24829
>>> 6105 0 3 85 12
>>> 0 1 45396 162856 6340 31824584 0 0 0 98304 23506
>>> 6402 0 5 83 12
>>> 0 1 45396 162888 6340 31824584 0 0 0 114688 24685
>>> 7057 0 4 87 9
>>> 0 1 45396 162600 6340 31824584 0 0 0 98304 24902
>>> 7107 0 4 86 10
>>> 0 1 45396 162740 6340 31824584 0 0 0 98304 24906
>>> 7019 0 4 91 6
>>> 0 1 45396 162616 6348 31824584 0 0 0 114728 24997
>>> 7169 0 4 86 9
>>> 0 1 45396 162896 6348 31824584 0 0 0 98304 23700
>>> 6857 0 4 85 11
>>> 0 1 45396 162732 6348 31824584 0 0 0 94512 24468
>>> 6995 0 3 89 8
>>> 0 1 45396 162836 6348 31824584 0 0 0 81920 19764
>>> 6604 0 7 81 11
>>> 0 3 45396 162996 6348 31824584 0 0 0 114691 24303
>>> 7270 0 4 81 14
>>> procs -----------memory---------- ---swap-- -----io---- -system--
>>> ----cpu----
>>> r b swpd free buff cache si so bi bo in cs us
>>> sy id wa
>>> 0 1 45396 163160 6356 31824584 0 0 0 98332 22695
>>> 7174 0 4 78 18
>>> 0 1 45396 162848 6356 31824584 0 0 0 90549 24836
>>> 7347 0 4 82 15
>>> 1 0 45396 163092 6364 31824580 0 0 0 37 13990
>>> 6216 0 6 83 11
>>> 0 0 45396 163272 6364 31824588 0 0 0 320 65 3817
>>> 0 0 100 0
>>> 0 0 45396 163272 6364 31824588 0 0 0 0 8 3694
>>> 0 0 100 0
>>> 0 0 45396 163272 6364 31824588 0 0 0 0 25 3833
>>> 0 0 100 0
>>> 0 0 45396 163272 6364 31824588 0 0 0 1 13 3690
>>> 0 0 100 0
>>>
>>>
>>>
>>>
>>>
>>> On Mon, Sep 14, 2009 at 10:15 PM, Sunil Mushran
>>> <sunil.mushran at oracle.com> wrote:
>>>
>>> Add a sync. Both utils are showing very little io. And do the same
>>> for runs on both ocfs2 and xfs.
>>>
>>> # dd if... ; sync;
>>>
>>> Laurence Mayer wrote:
>>>
>>> Here is the output of iostat while running the test on all the
>>> OCFS volume.
>>> avg-cpu: %user %nice %system %iowait %steal %idle
>>> 0.23 0.00 15.80 0.45 0.00 83.52
>>>
>>> Device: rrqm/s wrqm/s r/s w/s rsec/s
>>> wsec/s avgrq-sz avgqu-sz await svctm %util
>>> sdc 0.00 4.00 5.00 4.00 43.00
>>> 57.00 11.11 0.08 8.89 8.89 8.00
>>>
>>> avg-cpu: %user %nice %system %iowait %steal %idle
>>> 0.28 0.00 4.46 0.00 0.00 95.26
>>>
>>> Device: rrqm/s wrqm/s r/s w/s rsec/s
>>> wsec/s avgrq-sz avgqu-sz await svctm %util
>>> sdc 0.00 0.00 0.00 0.00 0.00
>>> 0.00 0.00 0.00 0.00 0.00 0.00
>>>
>>> avg-cpu: %user %nice %system %iowait %steal %idle
>>> 0.25 0.00 0.25 3.23 0.00 96.28
>>>
>>> Device: rrqm/s wrqm/s r/s w/s rsec/s
>>> wsec/s avgrq-sz avgqu-sz await svctm %util
>>> sdc 0.00 7.00 1.00 13.00 11.00
>>> 153.00 11.71 0.24 17.14 11.43 16.00
>>>
>>> avg-cpu: %user %nice %system %iowait %steal %idle
>>> 0.00 0.00 0.00 0.00 0.00 100.00
>>>
>>> Device: rrqm/s wrqm/s r/s w/s rsec/s
>>> wsec/s avgrq-sz avgqu-sz await svctm %util
>>> sdc 0.00 0.00 0.00 0.00 0.00
>>> 0.00 0.00 0.00 0.00 0.00 0.00
>>>
>>> avg-cpu: %user %nice %system %iowait %steal %idle
>>> 0.00 0.00 0.00 0.00 0.00 100.00
>>>
>>> Device: rrqm/s wrqm/s r/s w/s rsec/s
>>> wsec/s avgrq-sz avgqu-sz await svctm %util
>>> sdc 0.00 0.00 1.00 1.00 11.00
>>> 1.00 6.00 0.03 15.00 15.00 3.00
>>>
>>> vmstat:
>>> procs -----------memory---------- ---swap-- -----io----
>>> -system-- ----cpu----
>>> r b swpd free buff cache si so bi bo
>>> in cs us sy id wa
>>> 0 0 0 54400 279320 15651312 0 0 9 8 2
>>> 4 30 1 69 0
>>> 0 0 0 54384 279320 15651316 0 0 6 0 24
>>> 299 0 0 100 0
>>> 0 0 0 54384 279320 15651316 0 0 0 0 92
>>> 409 0 0 100 0
>>> 2 0 0 54384 279320 15651316 0 0 5 1 81
>>> 386 0 0 100 0
>>> 0 0 0 53756 279320 15651352 0 0 8 0 730
>>> 1664 0 1 99 0
>>> 0 0 0 53232 279320 15651352 0 0 6 88 586
>>> 1480 0 0 99 0
>>> 0 0 0 242848 279320 15458608 0 0 8 0 348
>>> 1149 0 3 97 0
>>> 0 0 0 242868 279320 15458608 0 0 5 1 220
>>> 721 0 0 100 0
>>> 0 0 0 242868 279320 15458608 0 0 0 0 201
>>> 709 0 0 100 0
>>> 0 0 0 243116 279320 15458608 0 0 6 0 239
>>> 775 0 0 100 0
>>> 0 0 0 243116 279320 15458608 0 0 0 0 184
>>> 676 0 0 100 0
>>> 0 0 0 243116 279336 15458608 0 0 5 65 236
>>> 756 0 0 99 0
>>> 0 0 0 243488 279336 15458608 0 0 0 0 231
>>> 791 0 0 100 0
>>> 1 0 0 243488 279336 15458608 0 0 6 0 193
>>> 697 0 1 100 0
>>> 0 0 0 243488 279336 15458608 0 0 0 0 221
>>> 762 0 0 100 0
>>> 0 0 0 243860 279336 15458608 0 0 9 1 240
>>> 793 0 0 100 0
>>> 0 0 0 243860 279336 15458608 0 0 0 0 197
>>> 708 0 0 100 0
>>> 1 0 0 117384 279348 15585384 0 0 26 16 124
>>> 524 0 15 84 1
>>> 0 0 0 53204 279356 15651364 0 0 0 112 141
>>> 432 0 8 91 1
>>> 0 0 0 53212 279356 15651320 0 0 5 1 79
>>> 388 0 0 100 0
>>> 0 0 0 53212 279356 15651320 0 0 0 20 30
>>> 301 0 0 100 0
>>> Does this give you any clue to the bottleneck?
>>> On Mon, Sep 14, 2009 at 9:42 PM, Sunil Mushran
>>> <sunil.mushran at oracle.com> wrote:
>>>
>>> Get some iostat/vmstat numbers.
>>> # iostat -x /dev/sdX 1
>>> # vmstat 1
>>>
>>> How much memory do the nodes have? If more than 2G, XFS
>>> is probably leveraging its delayed allocation feature to
>>> heavily
>>> cache the writes. iostat/vmstat should show that.
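A sketch of capturing those stats for exactly the duration of a run; the sampler command (e.g. `iostat -x /dev/sdc 1` or `vmstat 1`) is passed as one string and deliberately left unquoted so the shell word-splits it into a command:

```shell
#!/bin/sh
# sample_during: run a one-line-per-second sampler (iostat/vmstat style)
# in the background while the workload executes, then stop it, leaving
# the samples in a log for later inspection.
sample_during() {
    log=$1; sampler=$2; shift 2
    $sampler > "$log" 2>&1 &      # unquoted on purpose: word-split command
    spid=$!
    "$@"                          # the workload under observation
    kill "$spid" 2>/dev/null || true
    wait "$spid" 2>/dev/null || true
}

# Example:
#   sample_during io.log "iostat -x /dev/sdc 1" \
#       sh -c "dd if=/dev/zero of=/cfs1/txt bs=2048000 count=1000; sync"
```

This keeps the sample window aligned with the test, so the writeback burst is easy to spot in the log.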
>>>
>>> Is the timing for the 10 node test cumulative?
>>>
>>> Laurence Mayer wrote:
>>>
>>> Hi,
>>>
>>> I am currently running a 10 Node OCFS2 Cluster (version
>>> 1.3.9-0ubuntu1) on Ubuntu Server 8.04 x86_64.
>>> Linux n1 2.6.24-24-server #1 SMP Tue Jul 7 19:39:36 UTC
>>> 2009
>>> x86_64 GNU/Linux
>>>
>>> The Cluster is connected to a 1Tera iSCSI Device
>>> presented by
>>> an IBM 3300 Storage System, running over a 1Gig Network.
>>> Mounted on all nodes: /dev/sdc1 on /cfs1 type ocfs2
>>> (rw,_netdev,data=writeback,heartbeat=local)
>>> Maximum Nodes: 32
>>> Block Size=4k
>>> Cluster Size=4k
>>>
>>> My testing shows that writing 10 x 200Meg files
>>> simultaneously from the 10 nodes (1 file per node,
>>> 2Gig total) takes ~23.54secs.
>>> Reading the files back can take just as long.
>>>
>>> Do these numbers sound correct?
>>>
>>> Doing dd if=/dev/zero of=/cfs1/xxxxx/txt count=1000
>>> bs=2048000
>>> (2Gig) from a single node takes 16secs.
>>>
>>> (running the same dd command on an XFS filesystem
>>> connected to
>>> the same iSCSI Storage takes 2.2secs)
>>>
>>> Are there any tips & tricks to improve performance on
>>> OCFS2?
>>>
>>> Thanks in advance
>>> Laurence
>>>
>>> _______________________________________________
>>> Ocfs2-users mailing list
>>> Ocfs2-users at oss.oracle.com
>>>
>>> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>>>
>>>
>>>
>>