[Ocfs2-users] OCFS2 hanging on writes

srinivas eeda srinivas.eeda at oracle.com
Thu Oct 25 20:34:52 PDT 2012


I believe the problem could be due to fragmentation.
1) Can you run the following script and email me the output?
https://oss.oracle.com/~seeda/misc/stat_sysdir.sh
Run it as: stat_sysdir.sh -d <dev>

2) Can you also run the following and send me the fs state?
mount -t debugfs debugfs /sys/kernel/debug
cat /sys/kernel/debug/ocfs2/*/fs_state
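
The two commands in step 2 can be wrapped so that the state of every mounted volume ends up in one file that is easy to attach to a reply. This is only a minimal sketch, assuming debugfs is already mounted as shown above; the function name collect_fs_state and its two arguments are hypothetical and not part of ocfs2-tools.

```shell
#!/bin/sh
# Hypothetical helper (not part of ocfs2-tools): gather the fs_state file of
# every volume directory under a debugfs root into one report file.
#   $1 = directory holding one subdirectory per volume
#        (on a live system this would be /sys/kernel/debug/ocfs2)
#   $2 = output file
collect_fs_state() {
    root=$1
    out=$2
    : > "$out" || return 1          # create or truncate the report
    found=0
    for f in "$root"/*/fs_state; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$f" ] || continue
        found=1
        printf '=== %s ===\n' "$f" >> "$out"
        cat "$f" >> "$out"
    done
    if [ "$found" -eq 0 ]; then
        echo "no fs_state files under $root" >&2
        return 1
    fi
}
```

As root, after the mount command above, it could be called as: collect_fs_state /sys/kernel/debug/ocfs2 /tmp/fs_state.txt (the output path is only an example).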

On 10/25/2012 6:32 PM, Jeff Paterson wrote:
> Hello,
>
> I need help with our OCFS2 (1.8.0) filesystem.  We have been having 
> problems with it for a couple of days.  When we write to it, it hangs.
>
> The "hanging pattern" is easily reproducible.  If I write a 1 GB file 
> to the filesystem, it does the following:
>         - write ~200 MB of data on the disk in 1 second
>         - freeze for about 10 seconds
>         - write ~200 MB of data on the disk in 1 second
>         - freeze for about 10 seconds
>         - write ~200 MB of data on the disk in 1 second
>         - freeze for about 10 seconds
>         (and so on)
>
> When the freezes occur:
>         - other write operations (from other processes) on the same 
> node also freeze
>         - write operations on other nodes are not affected by a 
> freeze on this node
> Read operations (on any cluster node, even the one with frozen writes) 
> don't seem to be affected by the freezes.  One thing is certain: read 
> operations alone don't cause the filesystem to freeze.
>
> For info, before the problem began to appear we could sustain 640 MB/s 
> writes without any freeze.
>
> I tried to mount the filesystem on a single node to avoid issues that 
> could happen with inter-node communications and the problem was still 
> there.
>
>
> Filesystem details
>
>   * The filesystem is 18 TB and is currently 72% full.
>   * Mount options are the following:
>     rw,nodev,_netdev,noatime,errors=panic,data=writeback,noacl,nouser_xattr,commit=60,heartbeat=local
>   * All Features: backup-super strict-journal-super sparse
>     extended-slotmap inline-data metaecc indexed-dirs refcount
>     discontig-bg unwritten
>
>
>
> There is nothing special in the system logs besides application errors 
> caused by the freezes.
>
>
> Would running fsck.ocfs2 help?  How long would it take for 18 TB?
>
> Is there a flag I can enable in debugfs.ocfs2 to get a better idea of 
> what is happening and why it is freezing like that?
>
>
> Any help would be greatly appreciated.
>
> Thanks in advance,
>
> Jeff
>
>
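
The burst-and-freeze pattern described in the quoted report can be turned into numbers by writing a test file in fixed-size chunks and timing each chunk. The following is a rough sketch under stated assumptions, not a tool used in this thread: the function name timed_write and its arguments are hypothetical, the writes are buffered (no sync per chunk), and timing uses whole seconds, which is coarse but enough to expose stalls of around 10 seconds.

```shell
#!/bin/sh
# Hypothetical helper: append CHUNKS chunks of CHUNK_MB MiB of zeros to FILE
# and print "<chunk number> <elapsed whole seconds>" for each chunk.
#   $1 = FILE, $2 = CHUNK_MB, $3 = CHUNKS
timed_write() {
    file=$1
    chunk_mb=$2
    chunks=$3
    : > "$file" || return 1         # create or truncate the test file
    i=1
    while [ "$i" -le "$chunks" ]; do
        start=$(date +%s)
        # bs is given in bytes (1 MiB) to avoid dd size-suffix differences.
        dd if=/dev/zero bs=1048576 count="$chunk_mb" 2>/dev/null >> "$file" || return 1
        end=$(date +%s)
        echo "$i $((end - start))"
        i=$((i + 1))
    done
}
```

For example, timed_write /mnt/ocfs2/testfile 200 5 (the mount point is an assumption) writes about 1 GB in 200 MiB chunks, roughly matching the burst size in the report; a chunk that takes around 10 seconds instead of about 1 would correspond to one of the freezes.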
