[Ocfs2-users] 10 Node OCFS2 Cluster - Performance

Tue Sep 15 00:24:03 PDT 2009

Hi Sunil
I am running iostat on only one of the nodes, so the results you see are
from a single node only.
However, I am running the test concurrently on all 10 nodes, for a
total of 2Gig being written, so yes, on this node
it took 8 secs to write 205Megs.
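
As a rough cross-check, the wsec/s column of the OCFS2 iostat samples quoted further down can be converted to bytes. This is only a sketch: it assumes iostat is counting 512-byte sectors and that each sample covers one second (as with `iostat -x /dev/sdX 1`); the variable names are just for illustration.

```python
# wsec/s values for sdc, copied from the eight OCFS2 iostat samples quoted below.
# Assumptions: 512-byte sectors, one-second sampling interval.
SECTOR_BYTES = 512
wsec_per_s = [64.00, 441.00, 536.00, 417.00, 174648.00, 224441.00, 1040.00, 1.00]

# Bytes written over all eight samples.
total_bytes = sum(wsec_per_s) * SECTOR_BYTES

# Bytes written in the two busy samples only.
burst_bytes = (174648.00 + 224441.00) * SECTOR_BYTES

print(f"all samples   : {total_bytes / 1e6:.1f} MB")
print(f"two busy ones : {burst_bytes / 1e6:.1f} MB")
```

Under those assumptions, almost all of the 205 MB file reaches the disk within two of the eight samples, and the other samples show only small writes.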

My latest results (using sync after the dd) show that when running on
the 10 nodes concurrently it takes 37 secs
to write the 10 x 205Meg files (2Gig).
Here are the results from ALL the nodes:
run.sh.e7212.1:204800000 bytes (205 MB) copied, 17.9657 s, 11.4 MB/s
run.sh.e7212.10:204800000 bytes (205 MB) copied, 30.1489 s, 6.8 MB/s
run.sh.e7212.2:204800000 bytes (205 MB) copied, 16.4605 s, 12.4 MB/s
run.sh.e7212.3:204800000 bytes (205 MB) copied, 18.1461 s, 11.3 MB/s
run.sh.e7212.4:204800000 bytes (205 MB) copied, 20.9716 s, 9.8 MB/s
run.sh.e7212.5:204800000 bytes (205 MB) copied, 22.6265 s, 9.1 MB/s
run.sh.e7212.6:204800000 bytes (205 MB) copied, 12.9318 s, 15.8 MB/s
run.sh.e7212.7:204800000 bytes (205 MB) copied, 15.1739 s, 13.5 MB/s
run.sh.e7212.8:204800000 bytes (205 MB) copied, 13.8953 s, 14.7 MB/s
run.sh.e7212.9:204800000 bytes (205 MB) copied, 29.5445 s, 6.9 MB/s

real    0m37.920s
user    0m0.000s
sys     0m0.030s

(This averages 11.17MB/sec per node, which seems very low.)
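
For reference, the per-node average can be recomputed from the dd lines above. A small sketch (the regex and variable names are only for illustration); it uses the exact byte counts and elapsed times rather than dd's rounded MB/s figures, so the last digit may differ slightly.

```python
import re

# dd summary lines from the ten nodes, copied from the results above.
dd_output = """\
run.sh.e7212.1:204800000 bytes (205 MB) copied, 17.9657 s, 11.4 MB/s
run.sh.e7212.10:204800000 bytes (205 MB) copied, 30.1489 s, 6.8 MB/s
run.sh.e7212.2:204800000 bytes (205 MB) copied, 16.4605 s, 12.4 MB/s
run.sh.e7212.3:204800000 bytes (205 MB) copied, 18.1461 s, 11.3 MB/s
run.sh.e7212.4:204800000 bytes (205 MB) copied, 20.9716 s, 9.8 MB/s
run.sh.e7212.5:204800000 bytes (205 MB) copied, 22.6265 s, 9.1 MB/s
run.sh.e7212.6:204800000 bytes (205 MB) copied, 12.9318 s, 15.8 MB/s
run.sh.e7212.7:204800000 bytes (205 MB) copied, 15.1739 s, 13.5 MB/s
run.sh.e7212.8:204800000 bytes (205 MB) copied, 13.8953 s, 14.7 MB/s
run.sh.e7212.9:204800000 bytes (205 MB) copied, 29.5445 s, 6.9 MB/s
"""

# Capture the node number, byte count and elapsed seconds from each line.
pattern = re.compile(r"\.(\d+):(\d+) bytes .* copied, ([\d.]+) s,")

rates = {}  # node number -> MB/s (decimal megabytes, as dd reports)
for line in dd_output.splitlines():
    m = pattern.search(line)
    if m:
        node, nbytes, secs = int(m.group(1)), int(m.group(2)), float(m.group(3))
        rates[node] = nbytes / secs / 1e6

mean_rate = sum(rates.values()) / len(rates)
slowest_node = min(rates, key=rates.get)

print(f"mean per-node rate: {mean_rate:.2f} MB/s")
print(f"slowest node: {slowest_node} ({rates[slowest_node]:.1f} MB/s)")
```

The spread between nodes is large: the slowest node takes more than twice as long as the fastest one.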

By comparison, writing 2Gig from a single node takes 23.5 secs:

root@n2:# time (dd if=/dev/zero of=txt bs=2048000 count=1000; sync)
1000+0 records in
1000+0 records out
2048000000 bytes (2.0 GB) copied, 16.1369 s, 127 MB/s

real    0m23.495s
user    0m0.000s
sys     0m15.180s
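
To put the two runs side by side, here is a rough aggregate-throughput calculation. It assumes that the `real` time of each run covers the full write plus the sync; the variable names are only for illustration.

```python
# Figures taken from the two runs above (wall-clock "real" time, dd plus sync).
bytes_per_node = 204_800_000
nodes = 10
concurrent_real_s = 37.920      # ten nodes writing at the same time
single_bytes = 2_048_000_000
single_real_s = 23.495          # one node writing the full 2 GB

# Aggregate throughput in MB/s (decimal megabytes, as dd reports).
concurrent_mb_s = bytes_per_node * nodes / concurrent_real_s / 1e6
single_mb_s = single_bytes / single_real_s / 1e6

print(f"10 nodes concurrently: {concurrent_mb_s:.1f} MB/s aggregate")
print(f"single node          : {single_mb_s:.1f} MB/s")
print(f"slowdown factor      : {concurrent_real_s / single_real_s:.2f}x")
```

On these figures the ten nodes together reach roughly 54 MB/s, against roughly 87 MB/s for the single node writing the same amount of data.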

Sunil, do you have any way to run the same test (10 x 200Megs) 
concurrently on 10 or more nodes to compare results?

Thanks again

Laurence

Sunil Mushran wrote:
> Always cc ocfs2-users.
>
> Strange. The ocfs2 numbers look incomplete. It shows only 200MB written.
> You said it was taking 16 secs. Yet the iostat numbers are for
> 8 secs only.
>
> The xfs numbers look complete. Shows 90+ MB/s.
>
> On my iscsi setup (netapp backend, gige, node with single cpu box and
> 512M RAM), I get 85MB/s.
>
> # time (dd if=/dev/zero of=/mnt/boq7 count=2000 bs=1M ; sync ;)
> sync
> 2000+0 records in
> 2000+0 records out
> 2097152000 bytes (2.1 GB) copied, 24.4168 seconds, 85.9 MB/s
>
> real   0m24.515s
> user   0m0.035s
> sys    0m14.967s
>
> This is with data=writeback.
>
> The 2.2 secs is probably because of delayed allocation. Since your box
> has enough memory, xfs can cache all the writes and return to the user.
> Its writeback then flushes the data in the background. The iostat/vmstat
> numbers should show similar writeback numbers.
>
> Sunil
>
> Laurence Mayer wrote:
>>  
>> iostat from cfs volume
>>  
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>            0.00    0.00    1.77    2.28    0.00   95.95
>>
>> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>> sdc               0.00     4.00    2.00    4.00    16.00    64.00    13.33     0.12   15.00  15.00   9.00
>>
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>            0.00    0.00    6.90    7.14    0.00   85.96
>>
>> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>> sdc               0.00    16.00    9.00   40.00    75.00   441.00    10.53     0.43    9.39   6.73  33.00
>>
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>            0.00    0.00    7.67    7.18    0.00   85.15
>>
>> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>> sdc               0.00    20.00   11.00   47.00    88.00   536.00    10.76     0.36    6.21   4.48  26.00
>>
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>            0.00    0.00    5.65   10.07    0.00   84.28
>>
>> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>> sdc               0.00    16.00    9.00   37.00    75.00   417.00    10.70     0.55   11.96   8.48  39.00
>>
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>            0.25    0.00   12.69   31.22    0.00   55.84
>>
>> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>> sdc               0.00 40324.00    2.00  181.00    16.00 174648.00   954.45    94.58  364.86   4.81  88.00
>>
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>            0.00    0.00   13.35   14.14    0.00   72.51
>>
>> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>> sdc               0.00  9281.00    1.00  228.00    11.00 224441.00   980.14   100.93  559.17   4.37 100.00
>>
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>            0.00    0.00    0.25    0.50    0.00   99.25
>>
>> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>> sdc               0.00     0.00    0.00    3.00     0.00  1040.00   346.67     0.03  240.00   6.67   2.00
>>
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>            0.00    0.00    0.00    0.00    0.00  100.00
>>
>> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>> sdc               0.00     0.00    1.00    1.00    11.00     1.00     6.00     0.04   20.00  20.00   4.00
>>
>> vmstat from cfs volume:
>> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
>>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
>>  0  0      0 447656 279416 15254408    0    0     0     0   39  350  0  0 100  0
>>  0  0      0 447656 279416 15254408    0    0     5    21   61  358  0  0 100  0
>>  0  0      0 447656 279416 15254408    0    0     0     0   49  369  0  0 100  0
>>  0  0      0 447656 279416 15254408    0    0     6     0   28  318  0  0 100  0
>>  0  0      0 447656 279416 15254408    0    0     0     0   26  321  0  0 100  0
>>  0  0      0 447656 279416 15254408    0    0     5     1   45  339  0  0 100  0
>>  0  0      0 447656 279416 15254412    0    0     0     0    8  283  0  0 100  0
>>  0  1      0 439472 279424 15262604    0    0    14    80   93  379  0  1 90  9
>>  0  0      0 439472 279424 15262604    0    0     0     4   43  338  0  0 97  2
>>  0  0      0 382312 279456 15319964    0    0    37   209  208  562  0  7 85  8
>>  0  0      0 324524 279500 15377292    0    0    44   264  250  647  0  7 86  7
>>  0  0      0 266864 279532 15434636    0    0    38   208  213  548  0  7 83 10
>>  0  3      0 250072 279544 15450584    0    0    44 124832 13558 2038  0 11 62 27
>>  0  1      0 250948 279564 15450584    0    0     5 75341 19596 2735  0 13 71 16
>>  0  0      0 252808 279564 15450548    0    0     0    52 2777  849  0  2 95  3
>>  0  0      0 252808 279564 15450548    0    0     6     0   21  310  0  0 100  0
>>  0  0      0 252808 279564 15450548    0    0     0     0   15  298  0  0 100  0
>>  0  0      0 253012 279564 15450548    0    0     5     1   29  310  0  0 100  0
>>  0  0      0 253048 279564 15450552    0    0     0     0   19  290  0  0 100  0
>>  0  0      0 253048 279564 15450552    0    0     6     0   26  305  0  0 100  0
>>  1  0      0 253172 279564 15450552    0    0     0    60   28  326  0  0 100  0
>>  
>>  
>> xfs volume:
>> iostat
>> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>> sdd               0.00     0.00    4.00    0.00    40.00     0.00    10.00     0.05   12.00  12.00   4.80
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>            0.00    0.00   14.98    0.25    0.00   84.77
>> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>> sdd               0.00     0.00    3.00    5.00    24.00  3088.00   389.00     6.54   44.00  17.00  13.60
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>            0.00    0.00   10.67   21.86    0.00   67.47
>> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>> sdd               0.00     1.00    0.00  221.00     0.00 202936.00   918.26   110.51  398.39   4.52 100.00
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>            0.00    0.00    4.92   21.84    0.00   73.23
>> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>> sdd               0.00     2.00    0.00  232.00     0.00 209152.00   901.52   110.67  493.50   4.31 100.00
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>            0.00    0.00    3.67   22.78    0.00   73.54
>> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>> sdd               0.00     1.00    0.00  215.00     0.00 185717.00   863.80   111.37  501.67   4.65 100.00
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>            0.12    0.00    6.24   12.61    0.00   81.02
>> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>> sdd               0.00     1.00    0.00  200.00     0.00 178456.00   892.28    80.01  541.82   4.88  97.60
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>            0.12    0.00    4.61    8.34    0.00   86.92
>> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>> sdd               0.00     0.00    0.00  179.00     0.00 183296.00  1024.00   134.56  470.61   5.21  93.20
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>            0.00    0.00    4.25    9.96    0.00   85.79
>> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>> sdd               0.00     0.00    0.00  201.00     0.00 205824.00  1024.00   142.86  703.92   4.98 100.00
>>  
>>  
>>  
>> vmstat
>> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
>>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
>>  1  0  45396 214592   6332 31771312    0    0   668   908    3    6  3  2 92  3
>>  0  0  45396 214460   6332 31771336    0    0     0     0   14 4874  0  0 100  0
>>  2  0  45396 161032   6324 31822524    0    0    20     0   42 6074  0 13 87  0
>>  5  1  45396 166380   6324 31820072    0    0    12 77948 8166 6416  0 16 77  7
>>  1  2  45396 163176   6324 31824580    0    0    28 102920 24190 6660  0  6 73 21
>>  0  2  45396 163096   6332 31824580    0    0     0 102743 22576 6700  0  5 72 23
>>  0  2  45396 163076   6332 31824580    0    0     0 90400 21831 6500  0  4 76 21
>>  0  1  45396 163012   6332 31824580    0    0     0 114732 19686 5894  0  7 83 10
>>  0  1  45396 162972   6332 31824580    0    0     0 98304 24882 6314  0  4 87  8
>>  0  1  45396 163064   6332 31824580    0    0     0 98304 24118 6285  0  4 84 12
>>  0  1  45396 163096   6340 31824576    0    0     0 114720 24800 6166  0  4 87  9
>>  0  1  45396 162964   6340 31824584    0    0     0 98304 24829 6105  0  3 85 12
>>  0  1  45396 162856   6340 31824584    0    0     0 98304 23506 6402  0  5 83 12
>>  0  1  45396 162888   6340 31824584    0    0     0 114688 24685 7057  0  4 87  9
>>  0  1  45396 162600   6340 31824584    0    0     0 98304 24902 7107  0  4 86 10
>>  0  1  45396 162740   6340 31824584    0    0     0 98304 24906 7019  0  4 91  6
>>  0  1  45396 162616   6348 31824584    0    0     0 114728 24997 7169  0  4 86  9
>>  0  1  45396 162896   6348 31824584    0    0     0 98304 23700 6857  0  4 85 11
>>  0  1  45396 162732   6348 31824584    0    0     0 94512 24468 6995  0  3 89  8
>>  0  1  45396 162836   6348 31824584    0    0     0 81920 19764 6604  0  7 81 11
>>  0  3  45396 162996   6348 31824584    0    0     0 114691 24303 7270  0  4 81 14
>> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
>>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
>>  0  1  45396 163160   6356 31824584    0    0     0 98332 22695 7174  0  4 78 18
>>  0  1  45396 162848   6356 31824584    0    0     0 90549 24836 7347  0  4 82 15
>>  1  0  45396 163092   6364 31824580    0    0     0    37 13990 6216  0  6 83 11
>>  0  0  45396 163272   6364 31824588    0    0     0   320   65 3817  0  0 100  0
>>  0  0  45396 163272   6364 31824588    0    0     0     0    8 3694  0  0 100  0
>>  0  0  45396 163272   6364 31824588    0    0     0     0   25 3833  0  0 100  0
>>  0  0  45396 163272   6364 31824588    0    0     0     1   13 3690  0  0 100  0
>>  
>>  
>>  
>>
>>  
>> On Mon, Sep 14, 2009 at 10:15 PM, Sunil Mushran
>> <sunil.mushran at oracle.com> wrote:
>>
>>     Add a sync. Both utils are showing very little io. And do the same
>>     for runs on both ocfs2 and xfs.
>>
>>     # dd if...  ; sync;
>>
>>     Laurence Mayer wrote:
>>
>>         Here is the output of iostat while running the test on the
>>         OCFS2 volume.
>>         avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>                   0.23    0.00   15.80    0.45    0.00   83.52
>>
>>         Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>>         sdc               0.00     4.00    5.00    4.00    43.00    57.00    11.11     0.08    8.89   8.89   8.00
>>
>>         avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>                   0.28    0.00    4.46    0.00    0.00   95.26
>>
>>         Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>>         sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>
>>         avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>                   0.25    0.00    0.25    3.23    0.00   96.28
>>
>>         Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>>         sdc               0.00     7.00    1.00   13.00    11.00   153.00    11.71     0.24   17.14  11.43  16.00
>>
>>         avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>                   0.00    0.00    0.00    0.00    0.00  100.00
>>
>>         Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>>         sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>
>>         avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>                   0.00    0.00    0.00    0.00    0.00  100.00
>>
>>         Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>>         sdc               0.00     0.00    1.00    1.00    11.00     1.00     6.00     0.03   15.00  15.00   3.00
>>
>>         vmstat:
>>         procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
>>          r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
>>          0  0      0  54400 279320 15651312    0    0     9     8    2    4 30  1 69  0
>>          0  0      0  54384 279320 15651316    0    0     6     0   24  299  0  0 100  0
>>          0  0      0  54384 279320 15651316    0    0     0     0   92  409  0  0 100  0
>>          2  0      0  54384 279320 15651316    0    0     5     1   81  386  0  0 100  0
>>          0  0      0  53756 279320 15651352    0    0     8     0  730 1664  0  1 99  0
>>          0  0      0  53232 279320 15651352    0    0     6    88  586 1480  0  0 99  0
>>          0  0      0 242848 279320 15458608    0    0     8     0  348 1149  0  3 97  0
>>          0  0      0 242868 279320 15458608    0    0     5     1  220  721  0  0 100  0
>>          0  0      0 242868 279320 15458608    0    0     0     0  201  709  0  0 100  0
>>          0  0      0 243116 279320 15458608    0    0     6     0  239  775  0  0 100  0
>>          0  0      0 243116 279320 15458608    0    0     0     0  184  676  0  0 100  0
>>          0  0      0 243116 279336 15458608    0    0     5    65  236  756  0  0 99  0
>>          0  0      0 243488 279336 15458608    0    0     0     0  231  791  0  0 100  0
>>          1  0      0 243488 279336 15458608    0    0     6     0  193  697  0  1 100  0
>>          0  0      0 243488 279336 15458608    0    0     0     0  221  762  0  0 100  0
>>          0  0      0 243860 279336 15458608    0    0     9     1  240  793  0  0 100  0
>>          0  0      0 243860 279336 15458608    0    0     0     0  197  708  0  0 100  0
>>          1  0      0 117384 279348 15585384    0    0    26    16  124  524  0 15 84  1
>>          0  0      0  53204 279356 15651364    0    0     0   112  141  432  0  8 91  1
>>          0  0      0  53212 279356 15651320    0    0     5     1   79  388  0  0 100  0
>>          0  0      0  53212 279356 15651320    0    0     0    20   30  301  0  0 100  0
>>         Does this give you any clue to the bottleneck?
>>
>>         On Mon, Sep 14, 2009 at 9:42 PM, Sunil Mushran
>>         <sunil.mushran at oracle.com> wrote:
>>
>>            Get some iostat/vmstat numbers.
>>            # iostat -x /dev/sdX 1
>>            # vmstat 1
>>
>>            How much memory do the nodes have? If more than 2G, XFS
>>            is probably leveraging its delayed allocation feature to
>>            heavily cache the writes. iostat/vmstat should show that.
>>
>>            Is the timing for the 10 node test cumulative?
>>
>>            Laurence Mayer wrote:
>>
>>                Hi,
>>
>>                I am currently running a 10 Node OCFS2  Cluster (version
>>                1.3.9-0ubuntu1) on Ubuntu Server 8.04 x86_64.
>>                Linux n1 2.6.24-24-server #1 SMP Tue Jul 7 19:39:36 UTC 2009 x86_64 GNU/Linux
>>
>>                The Cluster is connected to a 1Tera iSCSI Device presented by
>>                an IBM 3300 Storage System, running over a 1Gig Network.
>>                Mounted on all nodes:  /dev/sdc1 on /cfs1 type ocfs2
>>                (rw,_netdev,data=writeback,heartbeat=local)
>>                Maximum Nodes: 32
>>                Block Size=4k
>>                Cluster Size=4k
>>
>>                My testing shows that to write simultaneously from the 10
>>                nodes, 10 x 200Meg files (1 file per node, total of 2Gig)
>>                takes ~23.54secs.
>>                Reading the files back can take just as long.
>>
>>                Do these numbers sound correct?
>>
>>                Doing dd if=/dev/zero of=/cfs1/xxxxx/txt count=1000 bs=2048000
>>                (2Gig) from a single node takes 16secs.
>>
>>                (running the same dd command on an XFS filesystem connected to
>>                the same iSCSI Storage takes 2.2secs)
>>
>>                Are there any tips & tricks to improve performance on OCFS2?
>>
>>                Thanks in advance
>>                Laurence
>>
>>                _______________________________________________
>>                Ocfs2-users mailing list
>>                Ocfs2-users at oss.oracle.com
>>
>>                http://oss.oracle.com/mailman/listinfo/ocfs2-users
>>               
>>
>>
>>
>