[Ocfs2-users] Tracking down hangs

Andrew Robert Nicols andrew.nicols at luns.net.uk
Fri Jun 4 07:17:00 PDT 2010


Hi Sunil,

Thanks for the reply.

On Thu, Jun 03, 2010 at 02:18:53PM -0700, Sunil Mushran wrote:
> If scanlocks is clean, means it is not a dlm issue.

If the hang is only short, could it be that we're just missing the relevant
busy locks by running scanlocks too late?

> Have you tried mounting with data=writeback? With drbd,
> a 1G write becomes a 2G write. With ordered mode, a journal
> checkpoint, which is done when relinquishing a write lock, will
> wait on the data flush. That could be the cause for the slowdown.

I've remounted with data=writeback on the NFS server, and under normal load
we're still seeing hangs fairly frequently. I'm having real difficulty
tracking down the cause.

I've moved away from catting the same file on each server to reading a
different file on each server. This has reduced the frequency of the issue
slightly, but has not eliminated it.
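
For anyone wanting to reproduce this kind of read test, here is a minimal
sketch of a probe that times repeated reads of one file and records the slow
ones. It is only an illustration: the function names, the threshold and the
on_slow hook are hypothetical and not part of the setup described here. The
hook could be used to launch a diagnostic command while a hang is still in
progress, rather than after it has cleared.

```python
import time


def timed_read(path, chunk_size=1 << 20):
    """Read the whole file at `path`; return (bytes_read, seconds_taken)."""
    start = time.monotonic()
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.monotonic() - start


def slow_samples(samples, threshold):
    """Return the (timestamp, seconds) samples slower than `threshold`."""
    return [(ts, secs) for ts, secs in samples if secs > threshold]


def probe(path, interval=1.0, threshold=2.0, rounds=60, on_slow=None):
    """Time `rounds` reads of `path`, calling on_slow(ts, secs) for slow ones.

    `threshold` (seconds) is an arbitrary illustrative default; pick a value
    that separates normal reads from hangs on the system under test.
    """
    samples = []
    for _ in range(rounds):
        ts = time.time()
        _, secs = timed_read(path)
        samples.append((ts, secs))
        if secs > threshold and on_slow is not None:
            on_slow(ts, secs)
        time.sleep(interval)
    return samples
```

Note that repeated reads of the same file may be served from the local page
cache, so the timings say more about stalls than about raw throughput.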

> Does drbd have any way to see how active it is at that time? If
> so, monitor that.

We've checked out the drbd link and it appears untaxed when we see these
glitches.
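
One way to watch DRBD activity over time, sketched here under the assumption
of a DRBD 8.x node that exposes /proc/drbd with counters such as ns, nr, dw
and dr, is to snapshot those counters at intervals and compare them. The
helper names below are hypothetical, and this is not necessarily how the link
was checked in this case.

```python
import re


def read_proc_drbd(path="/proc/drbd"):
    """Return the raw text of the DRBD status file (assumed DRBD 8.x)."""
    with open(path) as f:
        return f.read()


def parse_counters(text):
    """Collect numeric key:value tokens (e.g. ns:123) per device minor."""
    result = {}
    current = None
    for line in text.splitlines():
        # A device line starts with the minor number, e.g. " 0: cs:Connected".
        m = re.match(r"\s*(\d+):\s", line)
        if m:
            current = int(m.group(1))
            result[current] = {}
            continue
        if current is not None:
            # Non-numeric tokens such as wo:b are deliberately skipped.
            for key, val in re.findall(r"\b([a-z]+):(\d+)\b", line):
                result[current][key] = int(val)
    return result


def delta(before, after):
    """Per-device difference of the counters present in both snapshots."""
    return {
        minor: {k: after[minor][k] - v
                for k, v in counters.items() if k in after[minor]}
        for minor, counters in before.items() if minor in after
    }
```

Taking two snapshots a second apart with parse_counters(read_proc_drbd()) and
passing them to delta() gives the change in each counter over that interval,
which can be logged alongside the times at which the glitches occur.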

Thanks in advance,

Andrew Nicols

-- 
Systems Developer

e: andrew.nicols at luns.net.uk
im: a.nicols at jabber.lancs.ac.uk
t: +44 (0)1524 5 10147

Lancaster University Network Services is a limited company registered in
England and Wales. Registered number: 04311892. Registered office:
University House, Lancaster University, Lancaster, LA1 4YW