[Ocfs2-users] OCFS2 benchmark slow concurrent write

Alexei_Roudnev Alexei_Roudnev at exigengroup.com
Wed Jun 27 11:41:36 PDT 2007


Mark.

Did you try using the interconnect as an additional 'ocfs driver <->
disks' communication channel? It could help in some cases, such as
'one node experiences IO errors while the others do not', and could help detect
the case 'overall IO problem, so there is no sense in fencing until at least
some nodes restore their disk connection'.

Note that this is all low-level IO. Of course, there can be a problem
because of the system caching mechanism, but if such a bypass can be
implemented, it could increase ocfs2 stability dramatically (and
eliminate most of the well-known undesirable 'self fencing' cases).

In addition, I wonder whether drbd can be used for anything other than
testing - it makes a simultaneous _interconnect glitch and disk
desynchronization_ quite possible, opening the door to many failure scenarios.



----- Original Message ----- 
From: "Mark Fasheh" <mark.fasheh at oracle.com>
To: "Philipp Wehrheim" <wehrheim at glue.ch>
Cc: <ocfs2-users at oss.oracle.com>
Sent: Wednesday, June 27, 2007 11:12 AM
Subject: Re: [Ocfs2-users] OCFS2 benchmark slow concurrent write


> On Wed, Jun 27, 2007 at 02:11:00PM +0200, Philipp Wehrheim wrote:
> > I did some more benchmarks writing 99 chars (one line) into a file,
> > but the time this takes varies between 40 us and 1 s, so it is way too
> > variable for our applications.
> > The average for writing one line is 175 ms, which is OK.
> >
> > The benchmark was done with different kernel:
> >
> > - with and without preemption
> > - with 100 and 1000Hz clock frequency
> > - etc
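A minimal sketch of such a single-line latency probe (the path and iteration count below are illustrative, not taken from the original benchmark; on OCFS2 the spread between min and max is what matters):

```python
# Measure per-write latency for appending one 99-char line to a file.
# The path and iteration count are illustrative placeholders.
import os
import time

def probe_write_latency(path, iterations=1000):
    line = b"x" * 98 + b"\n"          # 99 chars including the newline
    samples = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        for _ in range(iterations):
            t0 = time.perf_counter()
            os.write(fd, line)
            samples.append(time.perf_counter() - t0)
    finally:
        os.close(fd)
    return min(samples), sum(samples) / len(samples), max(samples)

lo, avg, hi = probe_write_latency("/tmp/ocfs2_latency_probe", 100)
print("min %.1f us  avg %.1f us  max %.1f us" % (lo * 1e6, avg * 1e6, hi * 1e6))
```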
>
> As Sunil noted, I'd suggest you try a gigabit connection.
>
> But the purpose of my e-mail is really to answer your second question
> below :)
>
>
> > I also mailed the drbd list and the response was that it is
> > probably a dlm issue / dlm = bottleneck.
> > Furthermore, the recommendation was that I should neither write
> > concurrently into one file nor into two files in one directory.
> >
> > Is this right?
>
> If you're looking to gain maximum performance, yes. Either that, or you do
> writes with O_DIRECT, in which case the file system will avoid some lock
> pinging. Metadata and buffered data operations always have to be cache
> coherent, though.
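A sketch of the kind of O_DIRECT write Mark mentions (the path is illustrative). O_DIRECT requires the buffer, length, and offset to be aligned, typically to the page size or the device sector size, and not every filesystem accepts the flag, hence the buffered fallback:

```python
# Write one page-aligned block with O_DIRECT, falling back to buffered IO
# if the platform or filesystem rejects the flag. Path is a placeholder.
import mmap
import os

def write_direct(path, payload):
    buf = mmap.mmap(-1, mmap.PAGESIZE)     # anonymous map => page-aligned
    buf[:len(payload)] = payload           # remainder stays zero-padded
    flags = os.O_WRONLY | os.O_CREAT
    direct = getattr(os, "O_DIRECT", 0)    # 0 on platforms without it
    try:
        fd = os.open(path, flags | direct, 0o644)
    except OSError:                        # open itself rejected O_DIRECT
        fd = os.open(path, flags, 0o644)
    try:
        os.write(fd, buf)                  # length is one full page: aligned
    except OSError:                        # filesystem rejected direct write
        os.close(fd)
        fd = os.open(path, flags, 0o644)
        os.write(fd, buf)
    finally:
        os.close(fd)
        buf.close()

write_direct("/tmp/odirect_demo", b"one 99-char line would go here\n")
```

The aligned-buffer requirement is why applications usually batch records into page-sized blocks before issuing direct writes.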
>
> Basically, transferring control of a shared resource between nodes is
> expensive - it involves some combination of journal flushes, data writeout,
> and cache invalidation, depending on what level of control is being given up.
> For what it's worth, this is a bottleneck in most symmetric shared disk file
> systems (at least the ones I've looked at in detail). Ocfs2 tries very hard
> to keep operations node local - we have a purely node-local allocation cache
> as well as a deallocation cache and many per-node system files. Ultimately,
> though, individual user files and directories have to be seen by all nodes.
>
> So if you're at the stage where you can design the layout of your
> application, I'd recommend that the performance-sensitive components avoid
> concurrent buffered writes to shared files or rapid creation of files in
> shared directories. The case where there is one writer and many readers
> performs much better, but will still be slower than each node chugging away
> in its own area.
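The node-local layout Mark recommends can be sketched like this: each node writes under its own subdirectory of the shared mount, so no two nodes contend for the same file or parent directory (the mount point and filename below are illustrative):

```python
# Give each cluster node its own subdirectory on the shared filesystem,
# keyed by hostname, so writes never contend on a shared file or directory.
# The mount point and filename are illustrative placeholders.
import os
import socket

def node_local_path(shared_mount, filename):
    node_dir = os.path.join(shared_mount, socket.gethostname())
    os.makedirs(node_dir, exist_ok=True)   # each node creates its dir once
    return os.path.join(node_dir, filename)

path = node_local_path("/tmp/shared_mount", "app.log")
with open(path, "a") as f:
    f.write("this node's writes stay in its own directory\n")
```

Readers on other nodes can still see every node's files; only the hot write path stays node-local.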
> --Mark
>
> --
> Mark Fasheh
> Senior Software Developer, Oracle
> mark.fasheh at oracle.com
>
> _______________________________________________
> Ocfs2-users mailing list
> Ocfs2-users at oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>



