[Ocfs2-users] Huge Problem ocfs2

Laurentiu Gosu lg at easic.ro
Sun Nov 11 15:24:30 PST 2012


Hi,
We managed to track down the problem: the inodes that hold the root 
directory and the system directory (and probably others, such as the 
heartbeat file) were somehow overwritten (!?).
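
For anyone who wants to check such blocks by hand, here is a minimal sketch. It assumes 4096-byte blocks (Block Size Bits: 12 in the stats output quoted below) and that a healthy OCFS2 inode block begins with the signature "INODE01". The helper names are made up for illustration and are not part of ocfs2-tools.

```python
# Sketch only: locate a block on the device and test whether it still
# looks like an OCFS2 inode. Assumptions: 4096-byte blocks and the
# "INODE01" signature at the very start of an inode block.

BLOCK_SIZE_BITS = 12          # from "Block Size Bits: 12" in the stats output
INODE_SIGNATURE = b"INODE01"  # assumed on-disk inode signature


def block_offset(blkno, block_size_bits=BLOCK_SIZE_BITS):
    """Byte offset of filesystem block `blkno` on the device."""
    return blkno << block_size_bits


def looks_like_inode(block):
    """True if the raw block begins with the assumed inode signature."""
    return block.startswith(INODE_SIGNATURE)


def read_block(path, blkno, block_size_bits=BLOCK_SIZE_BITS):
    """Read one block from a device or image file, opened read-only."""
    with open(path, "rb") as dev:
        dev.seek(block_offset(blkno, block_size_bits))
        return dev.read(1 << block_size_bits)


if __name__ == "__main__":
    # Root Blknum: 129 and System Dir Blknum: 130 come from the stats output.
    for name, blkno in (("root dir", 129), ("system dir", 130)):
        print(name, "block", blkno, "starts at byte", block_offset(blkno))
```

Block 130 works out to byte 532480, which is the offset of the pread in the strace quoted further down, and the data returned there does not start with that signature.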
Using debugfs and a lot of detective work, Marian found the inode number 
of one of the subfolders; from there we used cd .. to climb to the 
topmost reachable folder and then used rdump to recover the data.
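
Roughly, the recovery can be sketched as below. This is not necessarily the exact sequence Marian typed. It only builds the command lines and does not run them. The `<inode#>` way of naming a directory by inode number is my understanding of the debugfs.ocfs2 filespec syntax, and the inode number and destination directory are made-up placeholders.

```python
import shlex

# Sketch only: build (not run) debugfs.ocfs2 command lines for walking up
# from a known inode and dumping the tree. The "<inode#>" filespec form is
# an assumption about debugfs.ocfs2; the inode number is hypothetical.


def debugfs_cmd(device, request):
    """Command line for a single non-interactive debugfs.ocfs2 request."""
    return ["debugfs.ocfs2", "-R", request, device]


def list_dir(device, inode):
    """List a directory by inode number; its '..' entry names the parent."""
    return debugfs_cmd(device, "ls -l <%d>" % inode)


def dump_tree(device, inode, dest):
    """Recursively copy the directory at `inode` into the local dir `dest`."""
    return debugfs_cmd(device, "rdump -v <%d> %s" % (inode, dest))


if __name__ == "__main__":
    device = "/dev/mapper/volgr1-lvol0"
    top_inode = 1234567  # hypothetical: the topmost reachable directory
    for cmd in (list_dir(device, top_inode),
                dump_tree(device, top_inode, "/tmp/recovered")):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The idea is to repeat the listing step on each parent until the parent itself fails with a bad magic number, and then run rdump on the last directory that could still be read.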
Now the question is why the critical blocks were overwritten. Maybe you 
can help track this down and fix it (if it needs fixing). Here are some 
facts from 2 days ago:
1. The ocfs2 cluster started becoming unresponsive (we could not ls some 
folders).
2. We unmounted the device from all nodes and ran fsck.ocfs2 -y on it (we 
did this successfully a few months ago).
3. After the fsck finished successfully, I remounted the device on all 5 
nodes.
4. After 1 hour, all nodes started reporting something like this in syslog:
Nov  9 15:40:17 ro02xsrv003 kernel: 
(o2hb-B4CF8D4667,6098,9):o2hb_check_last_timestamp:576 ERROR: Another 
node is heartbeating on device (dm-5): expected(2:0xdfd1f518e3333501, 
0x509d07bf), ondisk(1:0xd81cb80a00020069, 0xac1bf00000001db8)
Nov  9 15:40:17 ro02xsrv003 kernel: 
(o2hb-B4CF8D4667,6098,9):o2hb_check_slot:802 ERROR: Node 0 has written a 
bad crc to dm-5
Nov  9 15:40:17 ro02xsrv003 kernel: 
(o2hb-B4CF8D4667,6098,9):o2hb_dump_slot:526 ERROR: Dump slot 
information: seq = 0x2c2527fa66646f6d, node = 37, cksum = 0xda52, 
generation 0xf7a004a5a8c00000
Nov  9 15:40:17 ro02xsrv003 kernel: 
(o2hb-B4CF8D4667,6098,9):o2hb_check_slot:802 ERROR: Node 3 has written a 
bad crc to dm-5
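
If it helps with going through a large syslog, here is a small sketch that pulls the fields out of these messages. The regular expressions are assumptions based only on the lines shown above (unwrapped onto one line each), so other kernels may format them differently.

```python
import re

# Sketch only: extract fields from the o2hb error lines shown above.
# The patterns are guessed from these few samples, not from the kernel source.

DUMP_SLOT_RE = re.compile(
    r"seq = (0x[0-9a-f]+), node = (\d+), cksum = (0x[0-9a-f]+),"
    r"\s+generation (0x[0-9a-f]+)"
)
BAD_CRC_RE = re.compile(r"Node (\d+) has written a bad crc to ([\w-]+)")

SAMPLE_DUMP = ("(o2hb-B4CF8D4667,6098,9):o2hb_dump_slot:526 ERROR: Dump slot "
               "information: seq = 0x2c2527fa66646f6d, node = 37, "
               "cksum = 0xda52, generation 0xf7a004a5a8c00000")
SAMPLE_CRC = ("(o2hb-B4CF8D4667,6098,9):o2hb_check_slot:802 ERROR: "
              "Node 0 has written a bad crc to dm-5")


def parse_dump_slot(line):
    """Return the dumped heartbeat slot fields as integers, or None."""
    match = DUMP_SLOT_RE.search(line)
    if match is None:
        return None
    seq, node, cksum, generation = match.groups()
    return {"seq": int(seq, 16), "node": int(node),
            "cksum": int(cksum, 16), "generation": int(generation, 16)}


def parse_bad_crc(line):
    """Return (node number, device name) for a bad-crc line, or None."""
    match = BAD_CRC_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


if __name__ == "__main__":
    print(parse_dump_slot(SAMPLE_DUMP))
    print(parse_bad_crc(SAMPLE_CRC))
```

Counting the bad-crc lines per node and device over time would show whether every heartbeat slot was affected or only some of them.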

So I believe the fsck somehow marked the metadata blocks as writable, 
and when they were used... kaboom.
I hope this helps somebody find the root cause. If additional information 
is needed for debugging, let me know.
Thanks,
Laurentiu.


On 11/10/2012 04:25, Marian Serban wrote:
> debugfs: ls /
> ls: Bad magic number in inode while checking directory at block 129
>
>
>
> On 10.11.2012 04:24, Sunil Mushran wrote:
>> Yes, that should be enough for that. But that won't help if the real 
>> problem is device related.
>>
>> What does debugfs.ocfs2 -R "ls -l /" return? If that errors, it means 
>> the root dir is gone. It may be best to look into your backups.
>>
>>
>> On Fri, Nov 9, 2012 at 6:01 PM, Marian Serban <marian at easic.ro>
>> wrote:
>>
>>     Nope, rdump doesn't work either.
>>
>>     debugfs: rdump -v / /tmp
>>     Copying to /tmp/
>>     rdump: Bad magic number in inode while reading inode 129
>>     rdump: Bad magic number in inode while recursively dumping inode 129
>>
>>
>>     Could you please confirm that it's enough to just force the
>>     return value of 0 at "ocfs2_validate_meta_ecc" in order to bypass
>>     the ECC checks?
>>
>>
>>
>>
>>     On 10.11.2012 03:55, Sunil Mushran wrote:
>>>     If the global bitmap is gone, then the fs is unusable. But you
>>>     can extract data using the rdump command in debugfs.ocfs2. The
>>>     success depends on how much of the device is still usable.
>>>
>>>
>>>     On Fri, Nov 9, 2012 at 5:50 PM, Marian Serban <marian at easic.ro>
>>>     wrote:
>>>
>>>         I tried hacking the fsck.ocfs2 source code so that it ignores
>>>         the metaecc flag. Then I ran into
>>>
>>>         journal recovery: Bad magic number in inode while looking up
>>>         the journal inode for slot 0
>>>
>>>         fsck encountered unrecoverable errors while replaying the
>>>         journals and will not continue
>>>
>>>         After bypassing the journal replay function, I got
>>>
>>>         Pass 0a: Checking cluster allocation chains
>>>         pass0: Bad magic number in inode while looking up the global
>>>         bitmap inode
>>>         fsck.ocfs2: Bad magic number in inode while performing pass 0
>>>
>>>
>>>         Does it mean the filesystem is destroyed completely?
>>>
>>>
>>>
>>>
>>>         On 10.11.2012 02:54, Marian Serban wrote:
>>>>         That's the kernel:
>>>>
>>>>         Linux ro02xsrv003.bv.easic.ro 2.6.39.4 #6 SMP Mon Dec 12
>>>>         12:09:49 EET 2011 x86_64 x86_64 x86_64 GNU/Linux
>>>>
>>>>         Anyway, I tried disabling the metaecc feature, no luck.
>>>>
>>>>         [root at ro02xsrv003 ~]# tunefs.ocfs2 --fs-features=nometaecc /dev/mapper/volgr1-lvol0
>>>>         tunefs.ocfs2: I/O error on channel while opening device "/dev/mapper/volgr1-lvol0"
>>>>
>>>>         These are the last lines of strace corresponding to the
>>>>         tunefs.ocfs2 command:
>>>>
>>>>
>>>>
>>>>         open("/sys/fs/ocfs2/cluster_stack", O_RDONLY) = 4
>>>>         fstat(4, {st_mode=S_IFREG|0644, st_size=4096, ...}) = 0
>>>>         mmap(NULL, 4096, PROT_READ|PROT_WRITE,
>>>>         MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f54aad05000
>>>>         read(4, "o2cb\n", 4096)                 = 5
>>>>         close(4) = 0
>>>>         munmap(0x7f54aad05000, 4096)            = 0
>>>>         open("/sys/fs/o2cb/interface_revision", O_RDONLY) = 4
>>>>         read(4, "5\n", 15)                      = 2
>>>>         read(4, "", 13)                         = 0
>>>>         close(4) = 0
>>>>         stat("/sys/kernel/config", {st_mode=S_IFDIR|0755,
>>>>         st_size=0, ...}) = 0
>>>>         statfs("/sys/kernel/config", {f_type=0x62656570,
>>>>         f_bsize=4096, f_blocks=0, f_bfree=0, f_bavail=0, f_files=0,
>>>>         f_ffree=0, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
>>>>         open("/dev/mapper/volgr1-lvol0", O_RDONLY) = 4
>>>>         ioctl(4, BLKSSZGET, 0x7fffce711454)     = 0
>>>>         close(4) = 0
>>>>         pread(3,
>>>>         "\0\0\v\25\37\1\200\200\202@\21\2\30\26\0\0\0,\17\272\241\4\340\210\311\377\17\300\327\332\373\17"...,
>>>>         4096, 532480) = 4096
>>>>         close(3) = 0
>>>>         write(2, "tunefs.ocfs2", 12)            = 12
>>>>         write(2, ": ", 2)                       = 2
>>>>         write(2, "I/O error on channel", 20)    = 20
>>>>         write(2, " ", 1)                        = 1
>>>>         write(2, "while opening device \"/dev/mappe"..., 47) = 47
>>>>         write(2, "\r\n", 2
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>         On 10.11.2012 02:06, Sunil Mushran wrote:
>>>>>         It's either that or a checksum problem. Disable metaecc.
>>>>>         Not sure which kernel you are running.
>>>>>         We fixed a few problems around this a few years ago. If
>>>>>         your kernel is older, then it could be a known issue.
>>>>>
>>>>>
>>>>>         On Fri, Nov 9, 2012 at 12:50 PM, Marian Serban
>>>>>         <marian at easic.ro> wrote:
>>>>>
>>>>>             Hi Sunil,
>>>>>
>>>>>             Thank you for answering. Unfortunately, it doesn't
>>>>>             seem like it's a hardware problem. There's no way a
>>>>>             cable can be loose because it's an iSCSI over 1G
>>>>>             Ethernet (copper wires) environment. Also, I ran "dd
>>>>>             if=/dev/.... of=/dev/null" and the first 16GB or so
>>>>>             are fine. "dmesg" shows no errors.
>>>>>
>>>>>
>>>>>             Also tried with debugfs.ocfs2:
>>>>>
>>>>>
>>>>>             [root at ro02xsrv003 ~]# debugfs.ocfs2 /dev/mapper/volgr1-lvol0
>>>>>             debugfs.ocfs2 1.6.3
>>>>>             debugfs: ls
>>>>>             ls: Bad magic number in inode '.'
>>>>>             debugfs: slotmap
>>>>>             slotmap: Bad magic number in inode while reading slotmap system file
>>>>>             debugfs: stats
>>>>>                     Revision: 0.90
>>>>>                     Mount Count: 0   Max Mount Count: 20
>>>>>                     State: 0   Errors: 0
>>>>>                     Check Interval: 0 Last Check: Fri Nov  9 14:35:53 2012
>>>>>                     Creator OS: 0
>>>>>                     Feature Compat: 3 backup-super strict-journal-super
>>>>>                     Feature Incompat: 16208 sparse extended-slotmap inline-data metaecc xattr indexed-dirs refcount discontig-bg
>>>>>                     Tunefs Incomplete: 0
>>>>>                     Feature RO compat: 7 unwritten usrquota grpquota
>>>>>                     Root Blknum: 129 System Dir Blknum: 130
>>>>>                     First Cluster Group Blknum: 64
>>>>>                     Block Size Bits: 12   Cluster Size Bits: 18
>>>>>                     Max Node Slots: 10
>>>>>                     Extended Attributes Inline Size: 256
>>>>>                     Label: SAN
>>>>>                     UUID: B4CF8D4667AF43118F3324567B90A987
>>>>>                     Hash: 3698209293 (0xdc6e320d)
>>>>>                     DX Seed[0]: 0x9f4a2bb7
>>>>>                     DX Seed[1]: 0x501ddac0
>>>>>                     DX Seed[2]: 0x6034bfe8
>>>>>                     Cluster stack: classic o2cb
>>>>>                     Inode: 2   Mode: 00   Generation: 1093568923 (0x412e899b)
>>>>>                     FS Generation: 1093568923 (0x412e899b)
>>>>>                     CRC32: 46f2d360 ECC: 04d4
>>>>>                     Type: Unknown Attr: 0x0   Flags: Valid System Superblock
>>>>>                     Dynamic Features: (0x0)
>>>>>                     User: 0 (root) Group: 0 (root)   Size: 0
>>>>>                     Links: 0   Clusters: 45340448
>>>>>                     ctime: 0x4ee67f67 -- Tue Dec 13 00:25:43 2011
>>>>>                     atime: 0x0 -- Thu Jan  1 02:00:00 1970
>>>>>                     mtime: 0x4ee67f67 -- Tue Dec 13 00:25:43 2011
>>>>>                     dtime: 0x0 -- Thu Jan  1 02:00:00 1970
>>>>>                     ctime_nsec: 0x00000000 -- 0
>>>>>                     atime_nsec: 0x00000000 -- 0
>>>>>                     mtime_nsec: 0x00000000 -- 0
>>>>>                     Refcount Block: 0
>>>>>                     Last Extblk: 0 Orphan Slot: 0
>>>>>                     Sub Alloc Slot: Global   Sub Alloc Bit: 65535
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>             Marian
>>>>>
>>>>>
>>>>>             _______________________________________________
>>>>>             Ocfs2-users mailing list
>>>>>             Ocfs2-users at oss.oracle.com
>>>>>             https://oss.oracle.com/mailman/listinfo/ocfs2-users
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>>
>
>
>
