[Ocfs2-users] fcntl exclusive lock implementation in ocfs2

Sunil Mushran Sunil.Mushran at oracle.com
Thu Apr 5 17:20:00 PDT 2007


ocfs2 currently lets the VFS handle fcntl locking, so these locks are
local to each node rather than cluster-aware.
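
For reference, here is a minimal sketch of the kind of blocking exclusive
lock request under discussion, timed so that a slow acquisition shows up
without strace. It is written in Python rather than the C that Cyrus uses,
and the helper names (timed_exclusive_lock, release) and the temporary file
are hypothetical, chosen only for illustration. It assumes a POSIX system
where fcntl.lockf() with LOCK_EX and no LOCK_NB issues a blocking
F_SETLKW/F_WRLCK request.

```python
import fcntl
import os
import tempfile
import time


def timed_exclusive_lock(path):
    """Open path, block until an exclusive fcntl lock on the whole file is
    granted, and return (fd, seconds spent waiting for the lock)."""
    # An exclusive (write) lock requires a descriptor opened for writing.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    start = time.monotonic()
    # LOCK_EX without LOCK_NB is a blocking request: the call sleeps until
    # no other process holds a conflicting lock on the file.
    fcntl.lockf(fd, fcntl.LOCK_EX)
    waited = time.monotonic() - start
    return fd, waited


def release(fd):
    """Drop the lock and close the descriptor."""
    fcntl.lockf(fd, fcntl.LOCK_UN)
    os.close(fd)


if __name__ == "__main__":
    # Hypothetical stand-in for one of the Cyrus database files.
    with tempfile.NamedTemporaryFile() as tmp:
        fd, waited = timed_exclusive_lock(tmp.name)
        print("lock acquired after %.3f s" % waited)
        release(fd)
```

Running this against a file that another process on the same node already
holds locked should show the wait time grow, which is the symptom strace
reports in the message below.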

Jeff Fookson wrote:
> I am currently testing ocfs2 for use in a two-node cluster that will
> run the Cyrus imapd and am having issues that seem to be related to
> occasionally long times being needed while the software blocks waiting
> to get a write lock via the 'fcntl' system call. I am aware that the
> current ocfs2 supports neither a writable mmap nor a cluster-aware
> flock, so my tests are done doing all writing to only one node of the
> cluster, and the Cyrus configuration is such that none of the requisite
> databases require a writable 'mmap' (i.e. all databases are skiplist,
> not Berkeley DB). I am using drbd to make the disks on the two nodes
> behave as a shared resource; as permitted by drbd version 8, the disks
> on both nodes are drbd primaries and mounted on their respective
> machines. I am testing by having modest-size mail messages delivered to
> just one of the machines at the rate of 1/sec. The system will run fine
> in this mode, sometimes for days, but then will get hopelessly wedged
> with many 'lmtpd' processes waiting to get exclusive locks on the
> various Cyrus databases. As the system approaches this deadlock
> condition, 'strace' shows times of many seconds being spent in 'fcntl'
> waiting for the lock, and the load average skyrockets because of all
> the 'lmtpd' processes. Since mail is being delivered at essentially a
> constant rate and there is no other activity on the systems, I'm
> confused as to how the machines will often run for extended times
> before suddenly getting into this pathological state.
>
> I realize that because my setup is using several complex layers
> (actually the full storage design has
>
> md->drbd->lvm->ocfs2->Cyrus imapd), I will also consult the drbd and
> Cyrus mailing lists, but I'm hoping that someone on this list might
> have some insight into how fcntl-based locking is implemented under
> ocfs2 that may help point the way to what is causing the deadlock after
> many days of running well.
>
> The machines are both running CentOS 4.4 with a 2.6.19 kernel; the
> ocfs2 code is that included with the kernel sources; drbd is version
> 8.0 and the Cyrus version is 2.3.8.
>
> Thank you for any thoughts on this matter.
>
> Jeff Fookson
>


