[Ocfs-users] Lock contention issue with ocfs

Jeremy Schneider jer1887 at asugroup.com
Thu Mar 11 20:27:21 CST 2004


Hey list...

Sorry for all the emails lately about this bug I've been working on.  I
finally got tired of waiting for support to track it down, so today I
found it myself and fixed it.  Of course I'm going through normal
support channels to get an official [supported] fix, but if anyone's
interested, here's a patch that'll do it.  And you never know...
posting this explanation might speed the process up a bit.  I tested
the patch (it only changes a few lines) and it fixed the problem
without any side effects.  I'll try to explain it as clearly as
possible, but programmers aren't always the most brilliant
communicators.  ;)

VERSIONS AFFECTED:
  all that I'm aware of (including 1.0.10-1)

BUG:
  the EXCLUSIVE_LOCK on the DirNode is not released after a file is
created if the directory spans multiple DirNodes and the last DirNode
(not the one used for locking) has ENABLE_CACHE_LOCK set.  The last
DirNode can end up with ENABLE_CACHE_LOCK because, when the 255th file
is added and a new DirNode has to be created, ocfs copies the entire
structure byte-for-byte (locks included) into the new DirNode and never
clears the lock fields.
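
To make the copy problem concrete, here's a toy userspace sketch -- the
structure, field names, and values are made up for illustration, not
the real ocfs definitions:

#include <stdio.h>
#include <string.h>

#define NO_LOCK           0
#define ENABLE_CACHE_LOCK 2

struct toy_dirnode {
        unsigned long node_disk_off;
        int lock;               /* stands in for DISK_LOCK_FILE_LOCK() */
        /* ... directory entries would follow ... */
};

int main(void)
{
        struct toy_dirnode last = { 1024, ENABLE_CACHE_LOCK };
        struct toy_dirnode fresh;

        /* What ocfs effectively does when the 255th file forces a new
         * DirNode: copy the whole structure, lock fields included. */
        memcpy(&fresh, &last, sizeof(fresh));
        fresh.node_disk_off = 2048;

        /* The new DirNode now claims a lock it never actually took. */
        printf("fresh.lock = %d (expected %d)\n", fresh.lock, NO_LOCK);
        return 0;
}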

CAUSE:
  ocfs_insert_file() in Common/ocfsgendirnode.c (line 1594 in the code
currently available through subversion at oss.oracle.com) checks the
wrong variable for ENABLE_CACHE_LOCK: it looks at DirNode (the last
DirNode) instead of LockNode (the one the lock was actually taken on).
When the last DirNode carries an inherited ENABLE_CACHE_LOCK, the code
thinks the node is holding an ENABLE_CACHE_LOCK rather than an
EXCLUSIVE_LOCK, so it never releases the EXCLUSIVE_LOCK.  I've attached
a patch that fixes it if anyone cares to have a look...
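
In toy form (again with invented names, not the real ocfs code path),
the shape of the bug is just a test against the wrong structure:

#include <stdio.h>

#define NO_LOCK           0
#define EXCLUSIVE_LOCK    1
#define ENABLE_CACHE_LOCK 2

struct node { int lock; };

/* Simplified: 'last' is the last DirNode, 'locked' is the DirNode the
 * EXCLUSIVE_LOCK was actually taken on.  The release is skipped
 * whenever 'last' carries an inherited ENABLE_CACHE_LOCK. */
static void insert_file_buggy(struct node *locked, struct node *last)
{
        if (last->lock != ENABLE_CACHE_LOCK)    /* wrong struct tested */
                locked->lock = NO_LOCK;         /* the release */
}

int main(void)
{
        struct node locked = { EXCLUSIVE_LOCK };
        struct node last = { ENABLE_CACHE_LOCK };   /* inherited, see BUG */

        insert_file_buggy(&locked, &last);
        printf("lock after insert: %d (should be %d)\n",
               locked.lock, NO_LOCK);   /* prints 1 -- still held */
        return 0;
}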

STEPS TO REPRODUCE BUG:
  1. Create a new ocfs directory.  When first created it will have
     ENABLE_CACHE_LOCK set.
  2. *BEFORE* accessing the directory from *ANY* other node (doing so
     would release the ENABLE_CACHE_LOCK), create at least 255 files in
     it -- a small C helper for this step is sketched below.
     Optionally, check with debugocfs -D that the last (non-locking)
     DirNode now has ENABLE_CACHE_LOCK set.
  3. *NOW* access the directory from the other node (this changes the
     locking DirNode from ENABLE_CACHE_LOCK to NO_LOCK), then create a
     file from either node -- the EXCLUSIVE_LOCK will not be released.
  4. Attempting any operation from the opposite node that requires an
     EXCLUSIVE_LOCK will hang that node until some operation
     successfully releases the EXCLUSIVE_LOCK (e.g. deleting a file).
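
If it helps, here's a quick throwaway helper for step 2; the
/ocfs/testdir path is just a placeholder for wherever your test
directory lives:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
        char path[64];
        int i, fd;

        /* Create 255 empty files, enough to force a second DirNode. */
        for (i = 0; i < 255; i++) {
                snprintf(path, sizeof(path), "/ocfs/testdir/f%03d", i);
                fd = open(path, O_CREAT | O_WRONLY, 0644);
                if (fd < 0) {
                        perror(path);
                        return 1;
                }
                close(fd);
        }
        return 0;
}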

FIX:
  Of course the best solution is to fix ocfs_insert_file() to look at
the correct DirNode for the lock.  Another option would be to have the
code that creates a new DirNode clear the locks after copying the old
one (sketched below), although that extra step seems redundant to me.
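
For the second option, the change would amount to one extra line after
the copy -- sketched here against the toy structure from the BUG
section above, not the real ocfs code:

        memcpy(&fresh, &last, sizeof(fresh));
        fresh.node_disk_off = new_offset;
        fresh.lock = NO_LOCK;   /* don't inherit the old DirNode's lock state */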

PATCH:
  I have tested this patch (it is a minimal change) and it successfully
fixes the bug.  It still releases the lock through the correct pointer
(either "DirNode" or "LockNode", depending on whether they refer to the
same on-disk DirNode), but it now also checks the correct variable for
ENABLE_CACHE_LOCK.

REQUESTED COMPENSATION:
  hehe...  email me for the address where you can send pizza.  I wonder
if any oracle developers really do read this list...



Index: ocfsgendirnode.c
===================================================================
--- ocfsgendirnode.c    (revision 5)
+++ ocfsgendirnode.c    (working copy)
@@ -1591,17 +1591,26 @@
                indexOffset = -1;
        }

-       if (DISK_LOCK_FILE_LOCK (DirNode) != OCFS_DLM_ENABLE_CACHE_LOCK) {
-               /* This is an optimization... */
-               ocfs_acquire_lockres (LockResource);
-               LockResource->lock_type = OCFS_DLM_NO_LOCK;
-               ocfs_release_lockres (LockResource);
+       if (LockNode->node_disk_off == DirNode->node_disk_off) {
+               if (DISK_LOCK_FILE_LOCK (DirNode) != OCFS_DLM_ENABLE_CACHE_LOCK) {
+                       /* This is an optimization... */
+                       ocfs_acquire_lockres (LockResource);
+                       LockResource->lock_type = OCFS_DLM_NO_LOCK;
+                       ocfs_release_lockres (LockResource);

-               if (LockNode->node_disk_off == DirNode->node_disk_off)
+                       /* Reset the lock on the disk */
+                       DISK_LOCK_FILE_LOCK (DirNode) = OCFS_DLM_NO_LOCK;
+               }
+       } else {
+               if (DISK_LOCK_FILE_LOCK (LockNode) != OCFS_DLM_ENABLE_CACHE_LOCK) {
+                       /* This is an optimization... */
+                       ocfs_acquire_lockres (LockResource);
+                       LockResource->lock_type = OCFS_DLM_NO_LOCK;
+                       ocfs_release_lockres (LockResource);
+
                        /* Reset the lock on the disk */
-                       DISK_LOCK_FILE_LOCK (DirNode) = OCFS_DLM_NO_LOCK;
-               else
                        DISK_LOCK_FILE_LOCK (LockNode) = OCFS_DLM_NO_LOCK;
+               }
        }

        status = ocfs_write_dir_node (osb, DirNode, indexOffset);



>>> Sunil Mushran <Sunil.Mushran at oracle.com> 03/10/2004 5:49:58 PM >>>
I hope that when you were reading the dirnode, etc. with debugocfs,
you were accessing the volume via the raw device.  If you weren't, do
so.  This is important because that's the only way to ensure directio;
otherwise you will be reading potentially stale data from the buffer
cache.

Coming to the issue at hand: ls does not take an EXCLUSIVE_LOCK, and
all EXCLUSIVE_LOCKs are released when the operation is over.  So I am
not sure what is happening.  Using debugocfs correctly should help us
understand the problem.

Also, whenever you do your file operations (cat etc.), ensure those
ops are o_direct.  Now I am not sure why this would cause a problem,
but do not do buffered operations; ocfs does not support shared mmap.

If you download the 1.0.10 tools, you will not need to manually map
the raw device; the tools do that automatically.

So, upgrade to the 1.0.10 module and tools and see if you can
reproduce the problem.

 
<<<<...>>>>
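
(For reference, an O_DIRECT read from C on Linux looks roughly like the
following; the 512-byte alignment below is an assumption -- check your
device's actual requirements:)

#define _GNU_SOURCE             /* for O_DIRECT */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
        void *buf;
        ssize_t n;
        int fd;

        if (argc < 2) {
                fprintf(stderr, "usage: %s <device-or-file>\n", argv[0]);
                return 1;
        }
        /* O_DIRECT requires aligned buffers (and aligned I/O sizes). */
        if (posix_memalign(&buf, 512, 4096))
                return 1;

        fd = open(argv[1], O_RDONLY | O_DIRECT);
        if (fd < 0) {
                perror("open");
                return 1;
        }
        n = read(fd, buf, 4096);        /* bypasses the buffer cache */
        printf("read %ld bytes\n", (long) n);
        close(fd);
        free(buf);
        return 0;
}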

