[Ocfs-users] Lock contention issue with ocfs

Jeremy Schneider jer1887 at asugroup.com
Thu Mar 11 09:50:58 CST 2004


Ahh... you were exactly right about raw devices, Sunil.  debugocfs was
reading the DirNode from the buffered device.  I mapped a raw device and
tried again with much clearer results.  The reason running 'ls' made the
lock show up is that when I did an 'ls', ocfs forced the buffered device
to refresh its buffer, and that's when the EXCLUSIVE_LOCK obtained by
the opposite node showed up in debugocfs.
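
The stale read described above can be illustrated with a toy Python
model.  This is only a sketch: the names Disk, BufferedDevice, RawDevice
and refresh are made up for illustration and are not ocfs or debugocfs
internals, and the real buffer cache is far more involved.

```python
# Toy model only: all names here are hypothetical, not ocfs internals.

class Disk:
    """Shared storage that both nodes can see."""
    def __init__(self):
        self.blocks = {"DirNode": "NO_LOCK"}


class BufferedDevice:
    """Reads go through a per-node cache, so they can return stale data."""
    def __init__(self, disk):
        self.disk = disk
        self.cache = {}

    def read(self, key):
        # Only touch the disk on a cache miss.
        if key not in self.cache:
            self.cache[key] = self.disk.blocks[key]
        return self.cache[key]

    def refresh(self):
        # Drop cached blocks so the next read goes back to disk.
        self.cache.clear()


class RawDevice:
    """Reads always go to disk, as with a raw (direct I/O) device."""
    def __init__(self, disk):
        self.disk = disk

    def read(self, key):
        return self.disk.blocks[key]


def demo():
    disk = Disk()
    buffered = BufferedDevice(disk)
    raw = RawDevice(disk)
    results = []

    results.append(("initial", buffered.read("DirNode"), raw.read("DirNode")))

    # The opposite node takes the lock by writing to the shared disk.
    disk.blocks["DirNode"] = "EXCLUSIVE_LOCK"
    results.append(("after remote lock",
                    buffered.read("DirNode"), raw.read("DirNode")))

    # Stands in for the buffer refresh that running 'ls' triggered.
    buffered.refresh()
    results.append(("after refresh",
                    buffered.read("DirNode"), raw.read("DirNode")))
    return results


if __name__ == "__main__":
    for step, via_buffered, via_raw in demo():
        print(f"{step}: buffered={via_buffered} raw={via_raw}")
```

In this model the buffered read keeps returning the old lock state until
the cache is refreshed, while the raw read sees the new state right away.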

It seems that somehow the EXCLUSIVE_LOCK is not being released after
ocfs creates a file.  It doesn't matter whether or not I access the file
in o_direct mode.  If I use dd with the o_direct flag to create the file
the EXCLUSIVE_LOCK on the DirNode is still not released until I remove
the file.  I bet you could reproduce this on a lab machine (it seems
like the kind of problem that should be reproducible) - if anyone's
interested, email me and I can give you specific steps to reproduce.

I have not tried with the 1.0.10 tools yet; I'll do that.  (It doesn't
sound quite right, but I'm not exactly an ocfs guru <g> and it's the
best suggestion I've received so far...)  If anyone on the list has any
other suggestions, please let me know.

Also, if anyone over at Oracle would like to do me a favor and somehow
straighten out the mess with the support case I have open (let them know
it's not a hardware issue and get it out of BDE) I'd appreciate that
too...  just email me for details...

Thanks again,
Jeremy


>>> Sunil Mushran <Sunil.Mushran at oracle.com> 03/10/2004 5:49:58 PM >>>
> I hope, that when you were reading the dirnode, etc. using debugocfs,
> you were accessing the volume via the raw device. If you weren't, do so.
> This is important because that's the only way to ensure directio. Else,
> you will be reading potentially stale data from the buffer cache.
>
> Coming to the issue at hand. ls does not take an EXCLUSIVE_LOCK.
> And all EXCLUSIVE_LOCKS are released when the operation is over.
> So am not sure what is happening. Using debugocfs correctly should help
> us understand the problem.
>
> Also, whenever you do your file operations (cat etc.) ensure those ops are
> o_direct. Now I am not sure why this would cause a problem, but do not
> do buffered operations. ocfs does not support shared mmap.
>
> If you download the 1.0.10 tools, you will not need to manually map the
> raw device. The tools do that automatically.
>
> So, upgrade to 1.0.10 module and tools. See if you can reproduce the
> problem.

 
<<<<...>>>>