[Ocfs2-users] Huge Problem ocfs2

Sunil Mushran sunil.mushran at gmail.com
Fri Nov 9 18:24:30 PST 2012


Yes, that should be enough to bypass the ECC checks. But it won't help if
the real problem is device related.

What does debugfs.ocfs2 -R "ls -l /" return? If that errors, it means the
root dir is gone, and it may be best to look into your backups.
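
For a sense of scale on the "device related" possibility: the stats output quoted further down reports 45340448 clusters with Cluster Size Bits: 18, while the dd test only covered roughly the first 16 GB. A rough sketch of that arithmetic follows. The function and variable names are made up for illustration, and reading "16GB" as 16 * 10**9 bytes is an assumption.

```python
# Values copied from the "stats" output quoted later in this thread.
CLUSTERS = 45340448
CLUSTER_SIZE_BITS = 18

# Assumption: "first 16GB or so" is read as 16 * 10**9 bytes.
DD_BYTES_READ = 16 * 10**9


def volume_bytes(clusters: int, cluster_size_bits: int) -> int:
    """Bytes covered by the given number of clusters."""
    return clusters << cluster_size_bits


def fraction_read(read_bytes: int, total_bytes: int) -> float:
    """Share of the volume that a sequential read of read_bytes covered."""
    return read_bytes / total_bytes


total = volume_bytes(CLUSTERS, CLUSTER_SIZE_BITS)
print(f"volume size: {total} bytes ({total / 2**40:.2f} TiB)")
print(f"dd coverage: {fraction_read(DD_BYTES_READ, total):.2%}")
```

The volume works out to a bit under 11 TiB, so a clean read of the first 16 GB covers well under 1% of it and says little about the rest of the device.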


On Fri, Nov 9, 2012 at 6:01 PM, Marian Serban <marian at easic.ro> wrote:

>  Nope, rdump doesn't work either.
>
> debugfs: rdump -v / /tmp
> Copying to /tmp/
> rdump: Bad magic number in inode while reading inode 129
> rdump: Bad magic number in inode while recursively dumping inode 129
>
>
> Could you please confirm that it's enough to just force the return value
> of 0 at "ocfs2_validate_meta_ecc" in order to bypass the ECC checks?
>
>
>
>
> On 10.11.2012 03:55, Sunil Mushran wrote:
>
> If the global bitmap is gone, then the fs is unusable. But you can extract
> data using the rdump command in debugfs.ocfs2. The success depends on how
> much of the device is still usable.
>
>
> On Fri, Nov 9, 2012 at 5:50 PM, Marian Serban <marian at easic.ro> wrote:
>
>> I tried hacking the fsck.ocfs2 source code to ignore the metaecc flag.
>> Then I ran into:
>>
>> journal recovery: Bad magic number in inode while looking up the journal
>> inode for slot 0
>>
>> fsck encountered unrecoverable errors while replaying the journals and
>> will not continue
>>
>> After bypassing the journal replay function, I got:
>>
>> Pass 0a: Checking cluster allocation chains
>> pass0: Bad magic number in inode while looking up the global bitmap inode
>> fsck.ocfs2: Bad magic number in inode while performing pass 0
>>
>>
>> Does it mean the filesystem is destroyed completely?
>>
>>
>>
>>
>> On 10.11.2012 02:54, Marian Serban wrote:
>>
>> That's the kernel:
>>
>> Linux ro02xsrv003.bv.easic.ro 2.6.39.4 #6 SMP Mon Dec 12 12:09:49 EET
>> 2011 x86_64 x86_64 x86_64 GNU/Linux
>>
>> Anyway, I tried disabling the metaecc feature, no luck.
>>
>> [root at ro02xsrv003 ~]# tunefs.ocfs2 --fs-features=nometaecc
>> /dev/mapper/volgr1-lvol0
>> tunefs.ocfs2: I/O error on channel while opening device
>> "/dev/mapper/volgr1-lvol0"
>>
>> These are the last lines of strace corresponding to the tunefs.ocfs2
>> command:
>>
>>
>>
>> open("/sys/fs/ocfs2/cluster_stack", O_RDONLY) = 4
>> fstat(4, {st_mode=S_IFREG|0644, st_size=4096, ...}) = 0
>> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
>> = 0x7f54aad05000
>> read(4, "o2cb\n", 4096)                 = 5
>> close(4)                                = 0
>> munmap(0x7f54aad05000, 4096)            = 0
>> open("/sys/fs/o2cb/interface_revision", O_RDONLY) = 4
>> read(4, "5\n", 15)                      = 2
>> read(4, "", 13)                         = 0
>> close(4)                                = 0
>> stat("/sys/kernel/config", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
>> statfs("/sys/kernel/config", {f_type=0x62656570, f_bsize=4096,
>> f_blocks=0, f_bfree=0, f_bavail=0, f_files=0, f_ffree=0, f_fsid={0, 0},
>> f_namelen=255, f_frsize=4096}) = 0
>> open("/dev/mapper/volgr1-lvol0", O_RDONLY) = 4
>> ioctl(4, BLKSSZGET, 0x7fffce711454)     = 0
>> close(4)                                = 0
>> pread(3, "\0\0\v\25\37\1\200\200\202@\21\2\30\26\0\0\0,\17\272\241\4\340\210\311\377\17\300\327\332\373\17"...,
>> 4096, 532480) = 4096
>> close(3)                                = 0
>> write(2, "tunefs.ocfs2", 12)            = 12
>> write(2, ": ", 2)                       = 2
>> write(2, "I/O error on channel", 20)    = 20
>> write(2, " ", 1)                        = 1
>> write(2, "while opening device \"/dev/mappe"..., 47) = 47
>> write(2, "\r\n", 2
>>
>>
>>
>>
>>
>> On 10.11.2012 02:06, Sunil Mushran wrote:
>>
>> It's either that or a checksum problem. Disable metaecc. Not sure which
>> kernel you are running. We fixed a few problems in this area a few years
>> ago. If your kernel is older, then it could be a known issue.
>>
>>
>> On Fri, Nov 9, 2012 at 12:50 PM, Marian Serban <marian at easic.ro> wrote:
>>
>>> Hi Sunil,
>>>
>>> Thank you for answering. Unfortunately, it doesn't seem to be a hardware
>>> problem. There's no way a cable can be loose because it's an iSCSI over
>>> 1G Ethernet (copper wires) environment. I also ran "dd if=/dev/....
>>> of=/dev/null", and the first 16GB or so are fine. dmesg shows no
>>> errors.
>>>
>>>
>>> Also tried with debugfs.ocfs2:
>>>
>>>
>>> [root at ro02xsrv003 ~]# debugfs.ocfs2  /dev/mapper/volgr1-lvol0
>>> debugfs.ocfs2 1.6.3
>>> debugfs: ls
>>> ls: Bad magic number in inode '.'
>>> debugfs: slotmap
>>> slotmap: Bad magic number in inode while reading slotmap system file
>>> debugfs: stats
>>>         Revision: 0.90
>>>         Mount Count: 0   Max Mount Count: 20
>>>         State: 0   Errors: 0
>>>         Check Interval: 0   Last Check: Fri Nov  9 14:35:53 2012
>>>         Creator OS: 0
>>>         Feature Compat: 3 backup-super strict-journal-super
>>>         Feature Incompat: 16208 sparse extended-slotmap inline-data
>>> metaecc xattr indexed-dirs refcount discontig-bg
>>>         Tunefs Incomplete: 0
>>>         Feature RO compat: 7 unwritten usrquota grpquota
>>>         Root Blknum: 129   System Dir Blknum: 130
>>>         First Cluster Group Blknum: 64
>>>         Block Size Bits: 12   Cluster Size Bits: 18
>>>         Max Node Slots: 10
>>>         Extended Attributes Inline Size: 256
>>>         Label: SAN
>>>         UUID: B4CF8D4667AF43118F3324567B90A987
>>>         Hash: 3698209293 (0xdc6e320d)
>>>         DX Seed[0]: 0x9f4a2bb7
>>>         DX Seed[1]: 0x501ddac0
>>>         DX Seed[2]: 0x6034bfe8
>>>         Cluster stack: classic o2cb
>>>         Inode: 2   Mode: 00   Generation: 1093568923 (0x412e899b)
>>>         FS Generation: 1093568923 (0x412e899b)
>>>         CRC32: 46f2d360   ECC: 04d4
>>>         Type: Unknown   Attr: 0x0   Flags: Valid System Superblock
>>>         Dynamic Features: (0x0)
>>>         User: 0 (root)   Group: 0 (root)   Size: 0
>>>         Links: 0   Clusters: 45340448
>>>         ctime: 0x4ee67f67 -- Tue Dec 13 00:25:43 2011
>>>         atime: 0x0 -- Thu Jan  1 02:00:00 1970
>>>         mtime: 0x4ee67f67 -- Tue Dec 13 00:25:43 2011
>>>         dtime: 0x0 -- Thu Jan  1 02:00:00 1970
>>>         ctime_nsec: 0x00000000 -- 0
>>>         atime_nsec: 0x00000000 -- 0
>>>         mtime_nsec: 0x00000000 -- 0
>>>         Refcount Block: 0
>>>         Last Extblk: 0   Orphan Slot: 0
>>>         Sub Alloc Slot: Global   Sub Alloc Bit: 65535
>>>
>>>
>>>
>>>
>>> Marian
>>>
>>>
>>> _______________________________________________
>>> Ocfs2-users mailing list
>>> Ocfs2-users at oss.oracle.com
>>> https://oss.oracle.com/mailman/listinfo/ocfs2-users
>>>
>>
>>
>>
>>
>
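
The strace and stats output quoted above can be cross-checked with a little arithmetic. The last pread() in the strace reads 4096 bytes at offset 532480, and stats reports Block Size Bits: 12 and System Dir Blknum: 130. A minimal sketch of that check follows. The helper name is made up, and the "INODE01" signature is how I recall the on-disk inode magic, so treat it as an assumption to verify against the ocfs2 headers.

```python
BLOCK_SIZE_BITS = 12        # "Block Size Bits: 12" in the stats output
SYSTEM_DIR_BLKNUM = 130     # "System Dir Blknum: 130" in the stats output
PREAD_OFFSET = 532480       # offset of the last pread() in the strace

# First bytes of that pread() buffer, copied from the strace octal escapes.
PREAD_PREFIX = b"\0\0\v\25\37\1\200\200"

# Assumption: an ocfs2 inode block starts with this signature string.
INODE_SIGNATURE = b"INODE01"


def block_of(offset: int, block_size_bits: int) -> int:
    """Convert a byte offset into a block number, rejecting unaligned offsets."""
    block, remainder = divmod(offset, 1 << block_size_bits)
    if remainder:
        raise ValueError(f"offset {offset} is not block aligned")
    return block


block = block_of(PREAD_OFFSET, BLOCK_SIZE_BITS)
print("block read by tunefs.ocfs2:", block)
print("is the system dir block   :", block == SYSTEM_DIR_BLKNUM)
print("starts with inode magic   :", PREAD_PREFIX.startswith(INODE_SIGNATURE))
```

If the signature assumption holds, the block where the system directory inode should be does not look like an inode at all. That would be consistent with the "Bad magic number" errors for block 129 reported by debugfs, but it does not say what overwrote those blocks.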
>
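
The "Feature Incompat: 16208" value in the quoted stats output is a bitmask, and metaecc is one bit in it. Below is a small sketch that decodes it. The bit values are as I recall them from the kernel's ocfs2_fs.h and should be treated as an assumption, although they do add up to exactly the eight names debugfs printed.

```python
# Assumption: incompat bit values recalled from the kernel's ocfs2_fs.h.
# Only the bits relevant to this superblock are listed.
INCOMPAT_FLAGS = {
    0x0010: "sparse",
    0x0040: "inline-data",
    0x0100: "extended-slotmap",
    0x0200: "xattr",
    0x0400: "indexed-dirs",
    0x0800: "metaecc",
    0x1000: "refcount",
    0x2000: "discontig-bg",
}

FEATURE_INCOMPAT = 16208  # from "Feature Incompat: 16208" in the stats output


def decode_incompat(value: int) -> tuple[list[str], int]:
    """Return the known flag names set in value, plus any leftover unknown bits."""
    names = []
    leftover = value
    for bit in sorted(INCOMPAT_FLAGS):
        if value & bit:
            names.append(INCOMPAT_FLAGS[bit])
            leftover &= ~bit
    return names, leftover


names, leftover = decode_incompat(FEATURE_INCOMPAT)
print("flags set   :", " ".join(names))
print("unknown bits:", hex(leftover))
```

This only illustrates how the number maps to the feature names. It is not a way to turn metaecc off: that still has to go through tunefs.ocfs2, which needs to be able to open the device first.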