[Ocfs2-users] Huge Problem ocfs2
Marian Serban
marian at easic.ro
Fri Nov 9 18:25:28 PST 2012
debugfs: ls /
ls: Bad magic number in inode while checking directory at block 129
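
For readers following along: "Bad magic number in inode" suggests that the block read for the inode does not begin with the expected OCFS2 inode signature. Below is a minimal Python sketch of that check, not the actual debugfs.ocfs2 logic. It assumes the signature is the ASCII string "INODE01" at the start of the block (an assumption about the on-disk format, not something stated in this thread) and takes the 4096-byte block size from "Block Size Bits: 12" in the stats output quoted further down. The helper names are made up for illustration, and the device is only ever opened read-only.

```python
INODE_SIGNATURE = b"INODE01"   # assumption: inode signature at byte 0 of the block
BLOCK_SIZE_BITS = 12           # from "Block Size Bits: 12" in the stats output


def read_block(path, blkno, block_size_bits=BLOCK_SIZE_BITS):
    """Read one filesystem block from a device or image file, read-only."""
    block_size = 1 << block_size_bits
    with open(path, "rb") as dev:
        dev.seek(blkno * block_size)
        return dev.read(block_size)


def has_inode_signature(block):
    """Return True if the block starts with the assumed inode signature."""
    return block.startswith(INODE_SIGNATURE)


def describe_block(path, blkno):
    """Summarise what sits at the start of a block (hypothetical helper)."""
    block = read_block(path, blkno)
    if has_inode_signature(block):
        status = "inode signature present"
    else:
        status = "no inode signature"
    return "block %d: %s, first 8 bytes %r" % (blkno, status, block[:8])
```

Calling describe_block("/dev/mapper/volgr1-lvol0", 129) and then the same for block 130 would show what is actually stored where the root and system directory inodes should be, which can help distinguish zeroed blocks from overwritten ones.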
On 10.11.2012 04:24, Sunil Mushran wrote:
> Yes that should be enough for that. But that won't help if the real
> problem is device related.
>
> What does debugfs.ocfs2 -R "ls -l /" return? If that errors, it means
> the root dir is gone. It may be best to look into your backups.
>
>
> On Fri, Nov 9, 2012 at 6:01 PM, Marian Serban <marian at easic.ro> wrote:
>
> Nope, rdump doesn't work either.
>
> debugfs: rdump -v / /tmp
> Copying to /tmp/
> rdump: Bad magic number in inode while reading inode 129
> rdump: Bad magic number in inode while recursively dumping inode 129
>
>
> Could you please confirm that it's enough to just force the return
> value of 0 at "ocfs2_validate_meta_ecc" in order to bypass the ECC
> checks?
>
>
>
>
> On 10.11.2012 03:55, Sunil Mushran wrote:
>> If the global bitmap is gone, then the fs is unusable. But you can
>> extract data using the rdump command in debugfs.ocfs2. Success
>> depends on how much of the device is still usable.
>>
>>
>> On Fri, Nov 9, 2012 at 5:50 PM, Marian Serban <marian at easic.ro> wrote:
>>
>> I tried hacking the fsck.ocfs2 source code to ignore the metaecc
>> flag. Then I ran into
>>
>> journal recovery: Bad magic number in inode while looking up
>> the journal inode for slot 0
>>
>> fsck encountered unrecoverable errors while replaying the
>> journals and will not continue
>>
>> After bypassing the journal replay function, I got
>>
>> Pass 0a: Checking cluster allocation chains
>> pass0: Bad magic number in inode while looking up the global
>> bitmap inode
>> fsck.ocfs2: Bad magic number in inode while performing pass 0
>>
>>
>> Does it mean the filesystem is destroyed completely?
>>
>>
>>
>>
>> On 10.11.2012 02:54, Marian Serban wrote:
>>> That's the kernel:
>>>
>>> Linux ro02xsrv003.bv.easic.ro 2.6.39.4 #6 SMP Mon Dec 12
>>> 12:09:49 EET 2011 x86_64 x86_64 x86_64 GNU/Linux
>>>
>>> Anyway, I tried disabling the metaecc feature, no luck.
>>>
>>> [root at ro02xsrv003 ~]# tunefs.ocfs2 --fs-features=nometaecc
>>> /dev/mapper/volgr1-lvol0
>>> tunefs.ocfs2: I/O error on channel while opening device
>>> "/dev/mapper/volgr1-lvol0"
>>>
>>> These are the last lines of strace corresponding to the
>>> tunefs.ocfs2 command:
>>>
>>>
>>>
>>> open("/sys/fs/ocfs2/cluster_stack", O_RDONLY) = 4
>>> fstat(4, {st_mode=S_IFREG|0644, st_size=4096, ...}) = 0
>>> mmap(NULL, 4096, PROT_READ|PROT_WRITE,
>>> MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f54aad05000
>>> read(4, "o2cb\n", 4096) = 5
>>> close(4) = 0
>>> munmap(0x7f54aad05000, 4096) = 0
>>> open("/sys/fs/o2cb/interface_revision", O_RDONLY) = 4
>>> read(4, "5\n", 15) = 2
>>> read(4, "", 13) = 0
>>> close(4) = 0
>>> stat("/sys/kernel/config", {st_mode=S_IFDIR|0755, st_size=0,
>>> ...}) = 0
>>> statfs("/sys/kernel/config", {f_type=0x62656570,
>>> f_bsize=4096, f_blocks=0, f_bfree=0, f_bavail=0, f_files=0,
>>> f_ffree=0, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
>>> open("/dev/mapper/volgr1-lvol0", O_RDONLY) = 4
>>> ioctl(4, BLKSSZGET, 0x7fffce711454) = 0
>>> close(4) = 0
>>> pread(3,
>>> "\0\0\v\25\37\1\200\200\202@\21\2\30\26\0\0\0,\17\272\241\4\340\210\311\377\17\300\327\332\373\17"...,
>>> 4096, 532480) = 4096
>>> close(3) = 0
>>> write(2, "tunefs.ocfs2", 12tunefs.ocfs2) = 12
>>> write(2, ": ", 2: ) = 2
>>> write(2, "I/O error on channel", 20I/O error on channel) = 20
>>> write(2, " ", 1 ) = 1
>>> write(2, "while opening device \"/dev/mappe"..., 47while
>>> opening device "/dev/mapper/volgr1-lvol0") = 47
>>> write(2, "\r\n", 2
>>>
>>>
>>>
>>>
>>>
>>> On 10.11.2012 02:06, Sunil Mushran wrote:
>>>> It's either that or a checksum problem. Disable metaecc.
>>>> Not sure which kernel you are running.
>>>> We fixed a few problems around this a few years ago. If
>>>> your kernel is older, then it could be
>>>> a known issue.
>>>>
>>>>
>>>> On Fri, Nov 9, 2012 at 12:50 PM, Marian Serban
>>>> <marian at easic.ro> wrote:
>>>>
>>>> Hi Sunil,
>>>>
>>>> Thank you for answering. Unfortunately, it doesn't seem
>>>> like it's a hardware problem. There's no way a cable
>>>> can be loose because it's an iSCSI over 1G Ethernet
>>>> (copper wires) environment. Also, I ran "dd
>>>> if=/dev/.... of=/dev/null" and the first 16GB or so read
>>>> fine. dmesg shows no errors.
>>>>
>>>>
>>>> Also tried with debugfs.ocfs2:
>>>>
>>>>
>>>> [root at ro02xsrv003 ~]# debugfs.ocfs2
>>>> /dev/mapper/volgr1-lvol0
>>>> debugfs.ocfs2 1.6.3
>>>> debugfs: ls
>>>> ls: Bad magic number in inode '.'
>>>> debugfs: slotmap
>>>> slotmap: Bad magic number in inode while reading
>>>> slotmap system file
>>>> debugfs: stats
>>>> Revision: 0.90
>>>> Mount Count: 0 Max Mount Count: 20
>>>> State: 0 Errors: 0
>>>> Check Interval: 0 Last Check: Fri Nov 9
>>>> 14:35:53 2012
>>>> Creator OS: 0
>>>> Feature Compat: 3 backup-super strict-journal-super
>>>> Feature Incompat: 16208 sparse extended-slotmap
>>>> inline-data metaecc xattr indexed-dirs refcount
>>>> discontig-bg
>>>> Tunefs Incomplete: 0
>>>> Feature RO compat: 7 unwritten usrquota grpquota
>>>> Root Blknum: 129 System Dir Blknum: 130
>>>> First Cluster Group Blknum: 64
>>>> Block Size Bits: 12 Cluster Size Bits: 18
>>>> Max Node Slots: 10
>>>> Extended Attributes Inline Size: 256
>>>> Label: SAN
>>>> UUID: B4CF8D4667AF43118F3324567B90A987
>>>> Hash: 3698209293 (0xdc6e320d)
>>>> DX Seed[0]: 0x9f4a2bb7
>>>> DX Seed[1]: 0x501ddac0
>>>> DX Seed[2]: 0x6034bfe8
>>>> Cluster stack: classic o2cb
>>>> Inode: 2 Mode: 00 Generation: 1093568923
>>>> (0x412e899b)
>>>> FS Generation: 1093568923 (0x412e899b)
>>>> CRC32: 46f2d360 ECC: 04d4
>>>> Type: Unknown Attr: 0x0 Flags: Valid System
>>>> Superblock
>>>> Dynamic Features: (0x0)
>>>> User: 0 (root) Group: 0 (root) Size: 0
>>>> Links: 0 Clusters: 45340448
>>>> ctime: 0x4ee67f67 -- Tue Dec 13 00:25:43 2011
>>>> atime: 0x0 -- Thu Jan 1 02:00:00 1970
>>>> mtime: 0x4ee67f67 -- Tue Dec 13 00:25:43 2011
>>>> dtime: 0x0 -- Thu Jan 1 02:00:00 1970
>>>> ctime_nsec: 0x00000000 -- 0
>>>> atime_nsec: 0x00000000 -- 0
>>>> mtime_nsec: 0x00000000 -- 0
>>>> Refcount Block: 0
>>>> Last Extblk: 0 Orphan Slot: 0
>>>> Sub Alloc Slot: Global Sub Alloc Bit: 65535
>>>>
>>>>
>>>>
>>>>
>>>> Marian
>>>>
>>>>
>>>> _______________________________________________
>>>> Ocfs2-users mailing list
>>>> Ocfs2-users at oss.oracle.com
>>>> https://oss.oracle.com/mailman/listinfo/ocfs2-users
>>>>
>>>>
>>>
>>
>>
>
>
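
One detail in the strace output quoted above can be checked with simple arithmetic. The last pread before the error (presumably on the device) reads 4096 bytes at offset 532480. With "Block Size Bits: 12" that offset is block 130, which the stats output lists as "System Dir Blknum". So tunefs.ocfs2 seems to give up right after reading the system directory inode, one block after the root inode (block 129) that debugfs cannot read. A small sketch of the conversion, with made-up function names:

```python
def block_to_offset(blkno, block_size_bits):
    """Byte offset of the start of a block."""
    return blkno << block_size_bits


def offset_to_block(offset, block_size_bits):
    """Block number for a block-aligned byte offset."""
    blkno, remainder = divmod(offset, 1 << block_size_bits)
    if remainder:
        raise ValueError("offset %d is not block-aligned" % offset)
    return blkno


BLOCK_SIZE_BITS = 12      # "Block Size Bits: 12" from the stats output
ROOT_BLKNO = 129          # "Root Blknum: 129"
PREAD_OFFSET = 532480     # offset of the last pread in the strace output

print(offset_to_block(PREAD_OFFSET, BLOCK_SIZE_BITS))   # 130
print(block_to_offset(ROOT_BLKNO, BLOCK_SIZE_BITS))     # 528384
```

Both blocks sit within the first megabyte of the device, so a dd that reads the first 16GB without I/O errors only shows that those blocks are readable, not that their contents are intact.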