[Ocfs2-users] fsck.ocfs2 loops + hangs but does not check

Joseph Qi joseph.qi at huawei.com
Wed Mar 23 17:30:42 PDT 2016


Hi Michael,
Could you please run debugfs.ocfs2 as shown below and post the output?
# debugfs.ocfs2 -R 'stat //global_bitmap' <device>

Thanks,
Joseph

On 2016/3/24 6:38, Michael Ulbrich wrote:
> Hi ocfs2-users,
> 
> my first post to this list from yesterday probably didn't get through.
> 
> Anyway, I've made some progress in the meantime and may now ask more
> specific questions ...
> 
> I'm having issues with an 11 TB ocfs2 shared filesystem on Debian Wheezy:
> 
> Linux s1a 3.2.0-4-amd64 #1 SMP Debian 3.2.54-2 x86_64 GNU/Linux
> 
> the kernel modules are:
> 
> modinfo ocfs2 -> version: 1.5.0
> 
> using stock ocfs2-tools 1.6.4-1+deb7u1 from the distribution.
> 
> As an alternative I cloned and built the latest ocfs2-tools from
> markfasheh's ocfs2-tools on github which should be version 1.8.4.
> 
> The filesystem runs on top of drbd, is roughly 40 % full, and has suffered
> from read-only remounts and hanging clients since the last reboot. These
> may be DLM problems, but I suspect they stem from some corrupt on-disk
> structures. Before that it all ran stably for months.
> 
> This situation made me want to run fsck.ocfs2 and now I wonder how to do
> that. The filesystem is not mounted.
> 
> With the stock ocfs2-tools 1.6.4:
> 
> root@s1a:~# fsck.ocfs2 -v -f /dev/drbd1 > fsck_drbd1.log 2>&1
> fsck.ocfs2 1.6.4
> Checking OCFS2 filesystem in /dev/drbd1:
>   Label:              ocfs2_ASSET
>   UUID:               6A1A0189A3F94E32B6B9A526DF9060F3
>   Number of blocks:   5557283182
>   Block size:         2048
>   Number of clusters: 2778641591
>   Cluster size:       4096
>   Number of slots:    16
> 
> I'm checking fsck_drbd1.log and find that it is making progress in
> 
> Pass 0a: Checking cluster allocation chains
> 
> until it reaches "chain 73" and goes into an infinite loop filling the
> logfile with breathtaking speed.
> 
> With the newly built ocfs2-tools 1.8.4 I get:
> 
> root@s1a:~# fsck.ocfs2 -v -f /dev/drbd1 > fsck_drbd1.log 2>&1
> fsck.ocfs2 1.8.4
> Checking OCFS2 filesystem in /dev/drbd1:
>   Label:              ocfs2_ASSET
>   UUID:               6A1A0189A3F94E32B6B9A526DF9060F3
>   Number of blocks:   5557283182
>   Block size:         2048
>   Number of clusters: 2778641591
>   Cluster size:       4096
>   Number of slots:    16
> 
> Again watching the verbose output in fsck_drbd1.log I find that this
> time it proceeds up to
> 
> Pass 0a: Checking cluster allocation chains
> o2fsck_pass0:1360 | found inode alloc 13 at block 13
> 
> and stays there without any further progress. I've terminated this
> process after waiting for more than an hour.
> 
> Now - I'm lost somehow ... and would very much appreciate if anybody on
> this list would share his knowledge and give me a hint what to do next.
> 
> What could be done to get this file system checked and repaired? Am I
> missing something important or do I just have to wait a little bit
> longer? Is there a version of ocfs2-tools / fsck.ocfs2 which will
> perform as expected?
> 
> I'm prepared to upgrade the kernel to 3.16.0-0.bpo.4-amd64 but shy away
> from taking that risk without any clue of whether that might solve my
> problem ...
> 
> Thanks in advance ... Michael Ulbrich
> 
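As a side note on the quoted fsck.ocfs2 header: the block and cluster figures can be cross-checked with a little shell arithmetic. The following is only a sketch. The numbers are copied from the header above, the variable names are made up for illustration, and it assumes a shell with 64-bit integer arithmetic (as on the amd64 system in the report).

```shell
#!/bin/sh
# Values copied from the fsck.ocfs2 header in the quoted report.
blocks=5557283182
block_size=2048
clusters=2778641591
cluster_size=4096

# Compute the total size two ways; the results should agree if the
# superblock geometry is self-consistent.
bytes_from_blocks=$((blocks * block_size))
bytes_from_clusters=$((clusters * cluster_size))

echo "bytes (from blocks):   $bytes_from_blocks"
echo "bytes (from clusters): $bytes_from_clusters"

if [ "$bytes_from_blocks" -eq "$bytes_from_clusters" ]; then
    echo "geometry is consistent"
else
    echo "geometry mismatch"
fi
```

Both products come out to 11381315956736 bytes, which matches the "11 TB" mentioned in the report. This says nothing about the allocation chains themselves; it only shows that the header values agree with each other.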




