[Ocfs-users] Lock contention issue with ocfs

Sunil Mushran Sunil.Mushran at oracle.com
Thu Mar 11 18:57:03 CST 2004


Wow... I am impressed.

I still need to test it... but it looks good otherwise.

BTW, it's mainly the Oracle developers who are
responding on this list. :-)

Jeremy Schneider wrote:

>Hey list...
>
>Sorry for all the emails lately about this bug I'm working on.  I
>finally got tired of waiting for support to track it down, so today I
>found it and fixed it myself.  Of course I'm going through normal
>support channels to get an official [supported] fix, but if anyone's
>interested, here's a patch that'll do it.  And you never know...
>posting this explanation might speed up the process a bit.  I tested
>the patch (it only changes a few lines) and it successfully fixed the
>problem without any side effects.  I'll try to explain it as clearly as
>possible, but programmers aren't always the most brilliant communicators.
> ;)
>
>VERSIONS AFFECTED:
>  All that I'm aware of (including 1.0.10-1)
>
>BUG:
>  The EXCLUSIVE_LOCK on the DirNode is not released after a file is
>created if the directory has multiple DirNodes and the last DirNode
>(not the one used for locking) has ENABLE_CACHE_LOCK set.
>ENABLE_CACHE_LOCK can end up set on the last DirNode because, if the
>existing DirNode still holds that lock when the 255th file is added,
>ocfs copies the entire structure byte-for-byte (including locks) to the
>new DirNode it is creating and does not clear the locks out.
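>
>  (A rough sketch of that copy, just to illustrate; this is not the
>actual ocfs source and the structure/function names below are made up:)
>
>#include <string.h>
>
>/* illustrative stand-in for the on-disk DirNode structure */
>struct dirnode_sketch {
>        int disk_lock;          /* stand-in for DISK_LOCK_FILE_LOCK */
>        char rest[500];         /* the rest of the structure */
>};
>
>static void add_new_dirnode (struct dirnode_sketch *last,
>                             struct dirnode_sketch *fresh)
>{
>        /* the whole DirNode is copied byte-for-byte... */
>        memcpy (fresh, last, sizeof (*fresh));
>        /* ...so if 'last' still carried ENABLE_CACHE_LOCK, 'fresh' inherits
>         * it; nothing resets it back to NO_LOCK, and that is what later
>         * confuses ocfs_insert_file(). */
>}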
>
>CAUSE:
>  ocfs_insert_file() in Common/ocfsgendirnode.c (line 1594 in the code
>currently available through subversion at oss.oracle.com) checks the
>wrong variable for ENABLE_CACHE_LOCK: it looks at DirNode (the last
>DirNode) instead of LockNode (the locking one).  It therefore does not
>release the EXCLUSIVE_LOCK because it thinks the node is holding an
>ENABLE_CACHE_LOCK instead of an EXCLUSIVE_LOCK.  I've attached a patch
>that fixes it if anyone cares to have a look...
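>
>  For reference, here is the block as it stands today (reconstructed
>from the patch below); the outer check reads DirNode's lock even when
>LockNode is a different DirNode, so when the last DirNode still carries
>ENABLE_CACHE_LOCK the whole unlock path is skipped:
>
>       if (DISK_LOCK_FILE_LOCK (DirNode) != OCFS_DLM_ENABLE_CACHE_LOCK) {
>               /* This is an optimization... */
>               ocfs_acquire_lockres (LockResource);
>               LockResource->lock_type = OCFS_DLM_NO_LOCK;
>               ocfs_release_lockres (LockResource);
>
>               if (LockNode->node_disk_off == DirNode->node_disk_off)
>                       /* Reset the lock on the disk */
>                       DISK_LOCK_FILE_LOCK (DirNode) = OCFS_DLM_NO_LOCK;
>               else
>                       DISK_LOCK_FILE_LOCK (LockNode) = OCFS_DLM_NO_LOCK;
>       }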
>
>STEPS TO REPRODUCE BUG:
>  Create a new ocfs directory.  When you initially create it, it will
>have ENABLE_CACHE_LOCK set.  *BEFORE* accessing the directory from
>*ANY* other node (doing so would release the ENABLE_CACHE_LOCK), create
>at least 255 files in the directory.  Optionally, you can check with
>debugocfs -D to see that the last (non-locking) DirNode has
>ENABLE_CACHE_LOCK set.  *NOW* access the directory from the other node
>(to change the locking DirNode from ENABLE_CACHE_LOCK to NO_LOCK) and
>then create a file from either node -- the EXCLUSIVE_LOCK will not be
>released.  Attempting any operation that requires an EXCLUSIVE_LOCK
>from the opposite node will cause that node to hang until you do an
>operation that successfully releases the EXCLUSIVE_LOCK (e.g. deleting
>a file).
>
>FIX:
>  Of course the best solution is to fix ocfs_insert_file() to look at
>the correct DirNode for the lock.  Another option would be to update
>the code that creates a new DirNode so it clears out the locks after
>copying, although that seems unnecessary to me.
>
>PATCH:
>  I have tested this patch (it is a minimal change) and it successfully
>fixes the bug.  It still releases the lock through the correct pointer
>(either "DirNode" or "LockNode", depending on whether they refer to the
>same on-disk DirNode), but it now also checks the correct variable for
>ENABLE_CACHE_LOCK.
>
>REQUESTED COMPENSATION:
>  hehe... email me for the address where you can send pizza.  I wonder
>if any Oracle developers really do read this list...
>
>
>
>Index: ocfsgendirnode.c
>===================================================================
>--- ocfsgendirnode.c    (revision 5)
>+++ ocfsgendirnode.c    (working copy)
>@@ -1591,17 +1591,26 @@
>                indexOffset = -1;
>        }
>
>-       if (DISK_LOCK_FILE_LOCK (DirNode) != OCFS_DLM_ENABLE_CACHE_LOCK) {
>-               /* This is an optimization... */
>-               ocfs_acquire_lockres (LockResource);
>-               LockResource->lock_type = OCFS_DLM_NO_LOCK;
>-               ocfs_release_lockres (LockResource);
>+       if (LockNode->node_disk_off == DirNode->node_disk_off) {
>+               if (DISK_LOCK_FILE_LOCK (DirNode) != OCFS_DLM_ENABLE_CACHE_LOCK) {
>+                       /* This is an optimization... */
>+                       ocfs_acquire_lockres (LockResource);
>+                       LockResource->lock_type = OCFS_DLM_NO_LOCK;
>+                       ocfs_release_lockres (LockResource);
>
>-               if (LockNode->node_disk_off == DirNode->node_disk_off)
>+                       /* Reset the lock on the disk */
>+                       DISK_LOCK_FILE_LOCK (DirNode) = OCFS_DLM_NO_LOCK;
>+               }
>+       } else {
>+               if (DISK_LOCK_FILE_LOCK (LockNode) != OCFS_DLM_ENABLE_CACHE_LOCK) {
>+                       /* This is an optimization... */
>+                       ocfs_acquire_lockres (LockResource);
>+                       LockResource->lock_type = OCFS_DLM_NO_LOCK;
>+                       ocfs_release_lockres (LockResource);
>+
>                        /* Reset the lock on the disk */
>-                       DISK_LOCK_FILE_LOCK (DirNode) = OCFS_DLM_NO_LOCK;
>-               else
>                        DISK_LOCK_FILE_LOCK (LockNode) = OCFS_DLM_NO_LOCK;
>+               }
>        }
>
>        status = ocfs_write_dir_node (osb, DirNode, indexOffset);
>
>
>
>  
>
>>>>Sunil Mushran <Sunil.Mushran at oracle.com> 03/10/2004 5:49:58 PM >>>
>I hope that when you were reading the dirnode, etc., using debugocfs,
>you were accessing the volume via the raw device.  If you weren't, do
>so.  This is important because that's the only way to ensure directio;
>otherwise you will be reading potentially stale data from the buffer
>cache.
>Coming to the issue at hand: ls does not take an EXCLUSIVE_LOCK,
>and all EXCLUSIVE_LOCKS are released when the operation is over,
>so I am not sure what is happening.  Using debugocfs correctly should
>help us understand the problem.
>
>Also, whenever you do your file operations (cat, etc.), ensure those
>ops are o_direct.  Now, I am not sure why this would cause a problem,
>but do not do buffered operations.  ocfs does not support shared mmap.
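>
>(A minimal sketch of an o_direct read on Linux, in case it helps; the
>path and sizes here are just placeholders.  O_DIRECT bypasses the
>buffer cache and needs sector-aligned buffers, which is the point
>above.)
>
>#define _GNU_SOURCE             /* for O_DIRECT */
>#include <fcntl.h>
>#include <stdlib.h>
>#include <unistd.h>
>
>int main(void)
>{
>        void *buf;
>        ssize_t n;
>
>        /* open with O_DIRECT so reads bypass the buffer cache */
>        int fd = open("/ocfs/somedir/somefile", O_RDONLY | O_DIRECT);
>        if (fd < 0)
>                return 1;
>
>        /* O_DIRECT requires sector-aligned buffers, offsets and lengths */
>        if (posix_memalign(&buf, 512, 4096))
>                return 1;
>
>        n = read(fd, buf, 4096);
>        free(buf);
>        close(fd);
>        return n < 0;
>}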
>
>If you download the 1.0.10 tools, you will not need to manually map the
>raw device.  The tools do that automatically.
>
>So, upgrade to 1.0.10 module and tools. See if you can reproduce the
>problem.
>
> 
><<<<...>>>>
>  
>




