[Ocfs2-users] fsck.ocfs2 loops + hangs but does not check
Joseph Qi
joseph.qi at huawei.com
Wed Mar 23 17:30:42 PDT 2016
Hi Michael,
Could you please run debugfs.ocfs2 against the global bitmap and post the output?
# debugfs.ocfs2 -R 'stat //global_bitmap' <device>
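
To make the output easy to attach to a reply, the command above could be wrapped roughly as follows. This is only a sketch: the function name dump_global_bitmap and the log file name are made up for illustration, and /dev/drbd1 is assumed from the report quoted below.

```shell
# Hypothetical helper: save the 'stat //global_bitmap' output to a file.
# Run as root on a node where the filesystem is not mounted.
dump_global_bitmap() {
    device="$1"   # block device holding the ocfs2 volume
    out="$2"      # file that receives the debugfs.ocfs2 output

    # Bail out early if ocfs2-tools is not installed on this node.
    if ! command -v debugfs.ocfs2 >/dev/null 2>&1; then
        echo "debugfs.ocfs2 not found in PATH" >&2
        return 127
    fi

    # Same request as above; stdout and stderr both go to the log.
    debugfs.ocfs2 -R 'stat //global_bitmap' "$device" > "$out" 2>&1
}

# Example call (device name assumed from the quoted report):
#   dump_global_bitmap /dev/drbd1 global_bitmap.log
```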
Thanks,
Joseph
On 2016/3/24 6:38, Michael Ulbrich wrote:
> Hi ocfs2-users,
>
> my first post to this list from yesterday probably didn't get through.
>
> Anyway, I've made some progress in the meantime and may now ask more
> specific questions ...
>
> I'm having issues with an 11 TB ocfs2 shared filesystem on Debian Wheezy:
>
> Linux s1a 3.2.0-4-amd64 #1 SMP Debian 3.2.54-2 x86_64 GNU/Linux
>
> the kernel modules are:
>
> modinfo ocfs2 -> version: 1.5.0
>
> using stock ocfs2-tools 1.6.4-1+deb7u1 from the distribution.
>
> As an alternative, I cloned and built the latest ocfs2-tools from
> markfasheh's ocfs2-tools repository on GitHub, which should be version 1.8.4.
>
> The filesystem runs on top of drbd, is roughly 40% full, and has suffered
> from read-only remounts and hanging clients since the last reboot. These
> may be DLM problems, but I suspect they stem from corrupt on-disk
> structures. Before that, everything ran stably for months.
>
> This situation made me want to run fsck.ocfs2 and now I wonder how to do
> that. The filesystem is not mounted.
>
> With the stock ocfs2-tools 1.6.4:
>
> root@s1a:~# fsck.ocfs2 -v -f /dev/drbd1 > fsck_drbd1.log 2>&1
> fsck.ocfs2 1.6.4
> Checking OCFS2 filesystem in /dev/drbd1:
> Label: ocfs2_ASSET
> UUID: 6A1A0189A3F94E32B6B9A526DF9060F3
> Number of blocks: 5557283182
> Block size: 2048
> Number of clusters: 2778641591
> Cluster size: 4096
> Number of slots: 16
>
> I'm checking fsck_drbd1.log and find that it is making progress in
>
> Pass 0a: Checking cluster allocation chains
>
> until it reaches "chain 73" and goes into an infinite loop filling the
> logfile with breathtaking speed.
>
> With the newly built ocfs2-tools 1.8.4 I get:
>
> root@s1a:~# fsck.ocfs2 -v -f /dev/drbd1 > fsck_drbd1.log 2>&1
> fsck.ocfs2 1.8.4
> Checking OCFS2 filesystem in /dev/drbd1:
> Label: ocfs2_ASSET
> UUID: 6A1A0189A3F94E32B6B9A526DF9060F3
> Number of blocks: 5557283182
> Block size: 2048
> Number of clusters: 2778641591
> Cluster size: 4096
> Number of slots: 16
>
> Again watching the verbose output in fsck_drbd1.log I find that this
> time it proceeds up to
>
> Pass 0a: Checking cluster allocation chains
> o2fsck_pass0:1360 | found inode alloc 13 at block 13
>
> and stays there without any further progress. I've terminated this
> process after waiting for more than an hour.
>
> Now I'm somewhat lost ... and would very much appreciate it if anybody on
> this list would share their knowledge and give me a hint about what to do next.
>
> What could be done to get this file system checked and repaired? Am I
> missing something important or do I just have to wait a little bit
> longer? Is there a version of ocfs2-tools / fsck.ocfs2 which will
> perform as expected?
>
> I'm prepared to upgrade the kernel to 3.16.0-0.bpo.4-amd64 but shy away
> from taking that risk without any clue of whether that might solve my
> problem ...
>
> Thanks in advance ... Michael Ulbrich
>
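
As a quick sanity check on the superblock figures that fsck.ocfs2 printed in the quoted report, the block and cluster counts can be multiplied out with shell arithmetic. This is only a sketch of the arithmetic; it confirms that both counts describe the same volume size (about 11 TB, matching the report) and says nothing about where the corruption lies. The variable names are illustrative.

```shell
# Values copied from the fsck.ocfs2 header in the quoted report.
blocks=5557283182
block_size=2048
clusters=2778641591
cluster_size=4096

# Both products should give the same volume size in bytes.
bytes_from_blocks=$((blocks * block_size))
bytes_from_clusters=$((clusters * cluster_size))

echo "bytes via blocks:   $bytes_from_blocks"
echo "bytes via clusters: $bytes_from_clusters"
echo "blocks per cluster: $((cluster_size / block_size))"
echo "size in whole TB:   $((bytes_from_clusters / 1000000000000))"
```

Both products come to 11381315956736 bytes, so the block and cluster counts in the header agree with each other.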