[Ocfs2-users] ocfs2 on mailserver lock up

Sunil Mushran Sunil.Mushran at oracle.com
Tue Sep 26 13:13:00 PDT 2006


I will need at least the "strace -tt -T" output and the contents of
/proc/meminfo and /proc/slabinfo to proceed.
write() could be slow for a lot of reasons.
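
For reference, a saved "strace -tt -T" log can be filtered for slow calls
with something like the sketch below. It is only an illustration, not part
of strace or OCFS2: the function names and the 0.1 second threshold are
assumptions, as is the log layout (one completed call per line, optionally
preceded by a bare PID column as written by "strace -f -o file"). It relies
on -tt printing a wall-clock time with microseconds at the start of each
line and -T appending the time spent in the call as <seconds> at the end.

```python
import re
import sys
from collections import defaultdict

# One completed call in `strace -tt -T` output looks like:
#   13:13:00.000100 write(7, "data"..., 4096) = 4096 <0.512340>
LINE_RE = re.compile(
    r"^(?:\d+\s+)?"                          # optional bare PID column (assumed)
    r"(?P<clock>\d{2}:\d{2}:\d{2}\.\d+)\s+"  # wall-clock time from -tt
    r"(?P<syscall>\w+)\(.*"                  # syscall name and its arguments
    r"<(?P<elapsed>\d+\.\d+)>\s*$"           # time spent in the call from -T
)


def parse(lines):
    """Yield (clock, syscall, seconds) for every line that matches."""
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            # Signals, exit messages and unfinished/resumed calls are skipped.
            continue
        yield m.group("clock"), m.group("syscall"), float(m.group("elapsed"))


def slow_calls(lines, threshold=0.1):
    """Return the calls that took at least `threshold` seconds."""
    return [call for call in parse(lines) if call[2] >= threshold]


def total_time_by_syscall(lines):
    """Return a dict mapping syscall name to total seconds spent in it."""
    totals = defaultdict(float)
    for _clock, syscall, seconds in parse(lines):
        totals[syscall] += seconds
    return dict(totals)


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as trace:
        log = trace.readlines()
    for clock, syscall, seconds in slow_calls(log):
        print(f"{clock} {syscall} took {seconds:.6f}s")
    for syscall, seconds in sorted(total_time_by_syscall(log).items(),
                                   key=lambda item: item[1], reverse=True):
        print(f"{syscall:<16} {seconds:.6f}s total")
```

Given the path of a saved trace as its only argument, this prints each slow
call followed by the per-syscall totals, which should show whether write()
alone accounts for the delay or whether other calls are slow as well.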

augustasg at gmail.com wrote:
>
>
> On 9/26/06, Sunil Mushran <Sunil.Mushran at oracle.com> wrote:
>
>     You may want to ping Novell to get the 1.2.3 drop of OCFS2. That's
>     because it is the latest and greatest.
>
>     Having said that we'll need more information.
>
>     For instance, which syscall did strace show as taking the time?
>
>
>  write() is extremely slow.
>
>     What is the memory usage like? cat /proc/meminfo, cat /proc/slabinfo
>     That is, under production load.
>
>
> Unfortunately I have no information about the exact contents of the
> files you requested, but memory and CPU usage were normal when the
> lockups happened.
>
>     augustasg at gmail.com wrote:
>     > Hello,
>     >
>     > We are running SLES9 SP3 with OCFS2 for a mail system. There are
>     > three nodes in the cluster, with shared storage and an OCFS2
>     > filesystem on it. The filesystem is used for mailbox storage and is
>     > accessed by smtpd, pop3 and imap processes. The system works fine
>     > for a few hours but then locks up, so that the mail system accesses
>     > the OCFS2 filesystem extremely slowly. It is still possible to list
>     > the contents of the filesystem or to change directories, but the
>     > mail system is slow. Tracing the smtp processes with strace showed
>     > that mail is delivered to the mailboxes, but at a speed of a few
>     > kilobytes per second.
>     > The lockup happens under heavier load, when the cluster is in
>     > production use, and when it happens only a restart fixes the
>     > problem. Unmounting the OCFS2 filesystem is not possible even when
>     > all processes accessing it have been killed. Currently the mail
>     > system cluster is not used in production, so we tried to run some
>     > tests, but we were not able to replicate the problem while sending
>     > several thousand messages to a single account.
>     > The only thing that comes to mind is that the filesystem locks up
>     > when a file (message) is accessed by an smtp process and a pop3 or
>     > imap process simultaneously while the processes are running on
>     > distinct cluster nodes. This might happen because of a lack of
>     > global knowledge of locally locked files among the OCFS2 cluster
>     > nodes.
>     > Could this be true?
>     > What other reasons might there be for the problem?
>     > Any suggestions on solving the problem?
>     >
>     > The system information follows:
>     > OS: Suse Linux Enterprise Server 9 Service Pack 3
>     > Mailbox format: Maildir
>     > SMTP daemon: Postfix (postfix-2.2.6-0.1)
>     > OCFS2 version: 1.2.1
>     > Kernel Version: 2.6.5-7.276-smp
>     > Pop3 and imap daemon: Courier IMAP (courier-imap-4.1.0-1.suse910)
>     >
>     > --
>     > Augustas Gutautas
>     >
>     > _______________________________________________
>     > Ocfs2-users mailing list
>     > Ocfs2-users at oss.oracle.com
>     > http://oss.oracle.com/mailman/listinfo/ocfs2-users
>     >
>
> -- 
> Augustas Gutautas
