[Ocfs2-users] Problems with volumes coming from RHEL5 going to OEL6 (slightly OT)

Herbert van den Bergh herbert.van.den.bergh at oracle.com
Wed Jul 10 09:54:07 PDT 2013


It's possible that the 1.8.0 tag was never created in the ocfs2-tools git 
repository.  But it would not be of any use anyway.  If you check the 
changelog of the ocfs2-tools rpm, you'll see that there have been many 
patches since 1.8.0, so the 1.8.0-10 version that Ulf is using would be 
very different from a 1.8.0 tag in git.
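
For anyone who wants to check both points themselves, here is a minimal sketch. It assumes rpm is available and that the second part is run from inside a local clone of ocfs2-tools.git. The helper has_tag and the tag name in TAG_NAME are illustrative guesses, not something taken from the repository, so adjust the name to whatever `git tag -l` actually lists.

```shell
#!/bin/sh
# has_tag TAG: read tag names on stdin, succeed only if TAG matches a whole line.
# (Hypothetical helper, defined here for illustration.)
has_tag() {
    grep -Fxq -- "$1"
}

# Show the start of the packaged changelog, if rpm is present.
if command -v rpm >/dev/null 2>&1; then
    rpm -q --changelog ocfs2-tools 2>/dev/null | head -n 20
fi

# Inside a clone of ocfs2-tools.git: check whether a tag with this name exists.
if command -v git >/dev/null 2>&1 && git rev-parse --git-dir >/dev/null 2>&1; then
    # Assumed naming scheme; compare with the output of `git tag -l`.
    TAG_NAME="ocfs2-tools-1.8.0"
    if git tag -l | has_tag "$TAG_NAME"; then
        echo "tag $TAG_NAME exists"
    else
        echo "tag $TAG_NAME not found"
    fi
fi
```

The whole-line match matters here: a package release string such as 1.8.0-10 should not be mistaken for a 1.8.0 tag.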

Ulf, I suggest you create an o2image of the "bad" filesystem, and see if 
the problem can be reproduced with that image.  If it can, then you may 
want to make that o2image available to the OCFS2 developers so they can 
debug ocfs2-tools to see what is causing the malloc/free error.  You may 
also want to include the exact steps to take to reproduce this, starting 
from the mkfs up to the failure, indicating exactly what versions of 
kernel and tools were used along the way.
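
As a rough sketch of what that could look like on the OEL6 box: the script below records the kernel and tools versions and then creates the metadata image. DEVICE defaults to the path from Ulf's report, IMAGE is a made-up output location, and report_versions is a hypothetical helper, so change all of these to fit. How the tools are then pointed at the image to reproduce the failure is left out on purpose.

```shell
#!/bin/sh
# DEVICE comes from the original report; IMAGE is a placeholder output path.
DEVICE="${DEVICE:-/dev/mapper/aucp_data_bk_2_x}"
IMAGE="${IMAGE:-/tmp/aucp_data_bk_2_x.o2image}"

# report_versions: print kernel and ocfs2-tools versions for the bug report.
# (Hypothetical helper, defined here for illustration.)
report_versions() {
    echo "kernel: $(uname -r)"
    if command -v rpm >/dev/null 2>&1; then
        echo "tools: $(rpm -q ocfs2-tools 2>&1)"
    else
        echo "tools: rpm not available"
    fi
}

report_versions

# o2image copies the filesystem metadata from the device into an image file.
# Only attempt it when the tool is installed and the device is present.
if command -v o2image >/dev/null 2>&1 && [ -b "$DEVICE" ]; then
    o2image "$DEVICE" "$IMAGE"
fi
```

Running the same version report on the RHEL5 and RHEL4 machines as well would cover the "along the way" part of the request.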

Thanks,
Herbert.


On 7/10/13 7:55 AM, Mihail Daskalov wrote:
>
> Hi Sunil,
>
> Regarding the ocfs2-tools version 1.8.0, you should know best what it 
> was meant to be (maybe not true for 1.8.0-10 in OEL6U3).
>
> Is it possible that the tag for 1.8.0 disappeared from the git 
> repository? Or was there never a tag for 1.8.0?
>
> Below is the link to the commit under the 1.8.2 tag that brings the 
> version to 1.8.0
>
> https://oss.oracle.com/git/?p=ocfs2-tools.git;a=commitdiff;h=2480a215a600050d2bf923044dffac91439d982a;hp=8b5f4ad727e019cb557c4b516ab401c15c5c317e
>
> and later on another commit that brings the version to 1.8.2
>
> https://oss.oracle.com/git/?p=ocfs2-tools.git;a=commitdiff;h=560a1e60936fe868b00cfc9cad5def726e10828e
>
> I am sorry that I am not actually helping with Ulf's problem.
>
> Ulf, maybe you can follow the head version and try to find an 
> explanation of the error message there.
>
> Anyway, I think it would be best to open an SR with Oracle if you have 
> a Linux support contract.
>
> Does anyone know how to find the git repository for at least some 
> packages in Oracle Linux? I know the source for each package is 
> available as a .src.rpm, but how could I see the changes, or the tag 
> from which each version was built?
>
> I remember Wim talking about something like that a while ago (saying 
> Oracle does not mangle changelogs the way Red Hat does), but I can't 
> find the article right now.
>
> If you find out what is behind ocfs2-tools 1.8.0-10 it would be easier 
> to track the problem.
>
> Regards,
>
> Mihail Daskalov
>
> *From:*ocfs2-users-bounces at oss.oracle.com 
> [mailto:ocfs2-users-bounces at oss.oracle.com] *On Behalf Of *Sunil Mushran
> *Sent:* Wednesday, July 10, 2013 2:11 AM
> *To:* Ulf Zimmermann
> *Cc:* ocfs2-users at oss.oracle.com
> *Subject:* Re: [Ocfs2-users] Problems with volumes coming from RHEL5 
> going to OEL6
>
> The error does not make sense. Also I don't know what 1.8.0 tools 
> means. I cannot see that label in the src tree.
> https://oss.oracle.com/git/?p=ocfs2-tools.git;a=summary
>
> One option is to build the tools from the head.
>
> On Tue, Jul 9, 2013 at 2:25 PM, Ulf Zimmermann <ulf at openlane.com> wrote:
>
> Sunil, any suggestions on this?
>
> *From:*ocfs2-users-bounces at oss.oracle.com 
> [mailto:ocfs2-users-bounces at oss.oracle.com] *On Behalf Of *Ulf Zimmermann
> *Sent:* Saturday, June 22, 2013 15:20
> *To:* Sunil Mushran
> *Cc:* ocfs2-users at oss.oracle.com
> *Subject:* Re: [Ocfs2-users] Problems with volumes coming from RHEL5 
> going to OEL6
>
> [root at co-db03 ulf]# debugfs.ocfs2 -R "stats" /dev/mapper/aucp_data_bk_2_x
>         Revision: 0.90
>         Mount Count: 0   Max Mount Count: 20
>         State: 0   Errors: 0
>         Check Interval: 0   Last Check: Sun Sep 25 05:32:29 2011
>         Creator OS: 0
>         Feature Compat: 0
>         Feature Incompat: 0
>         Tunefs Incomplete: 0
>         Feature RO compat: 0
>         Root Blknum: 513   System Dir Blknum: 514
>         First Cluster Group Blknum: 256
>         Block Size Bits: 12   Cluster Size Bits: 20
>         Max Node Slots: 10
>         Extended Attributes Inline Size: 0
>         Label: /export/backuprecovery.AUCP
>         UUID: 5F9C2727159743529200CE9C5E155562
>         Hash: 0 (0x0)
>         DX Seeds: 0 0 0 (0x00000000 0x00000000 0x00000000)
>         Cluster stack: classic o2cb
>         Cluster flags: 0
>         Inode: 2   Mode: 00   Generation: 3147295185 (0xbb97e9d1)
>         FS Generation: 3147295185 (0xbb97e9d1)
>         CRC32: 00000000   ECC: 0000
>         Type: Unknown   Attr: 0x0   Flags: Valid System Superblock
>         Dynamic Features: (0x0)
>         User: 0 (root)   Group: 0 (root)   Size: 0
>         Links: 0   Clusters: 1572864
>         ctime: 0x4e7f1f5d 0x0 -- Sun Sep 25 05:32:29.0 2011
>         atime: 0x0 0x0 -- Wed Dec 31 16:00:00.0 1969
>         mtime: 0x4e7f1f5d 0x0 -- Sun Sep 25 05:32:29.0 2011
>         dtime: 0x0 -- Wed Dec 31 16:00:00 1969
>         Refcount Block: 0
>         Last Extblk: 0   Orphan Slot: 0
>         Sub Alloc Slot: Global   Sub Alloc Bit: 65535
>
> *From:*Sunil Mushran [mailto:sunil.mushran at gmail.com]
> *Sent:* Friday, June 21, 2013 11:11
> *To:* Ulf Zimmermann
> *Cc:* ocfs2-users at oss.oracle.com
> *Subject:* Re: [Ocfs2-users] Problems with volumes coming from RHEL5 
> going to OEL6
>
> Can you dump the following using the 1.8 binary.
> debugfs.ocfs2 -R "stats" /dev/mapper/.....
>
> On Fri, Jun 21, 2013 at 6:17 AM, Ulf Zimmermann <ulf at openlane.com> wrote:
>
> We have a production cluster of 6 nodes, which are currently running 
> RHEL 5.8 with OCFS2 1.4.10. We snapclone these volumes to multiple 
> destinations; one of them is a RHEL4 machine with OCFS2 1.2.9. Because 
> of that, the volumes are set so that we can read them there.
>
> We are now trying to bring up a new server running OEL 6.3, which 
> comes with OCFS2 1.8.0 and tools 1.8.0-10. I can use 
> tunefs.ocfs2 --cloned-volume to reset the UUID, but when I try to 
> change the label I get:
>
> [root at co-db03 ulf]# tunefs.ocfs2 -L /export/backuprecovery.AUCP /dev/mapper/aucp_data_bk_2_x
>
> tunefs.ocfs2: Invalid name for a cluster while opening device "/dev/mapper/aucp_data_bk_2_x"
>
> fsck.ocfs2 dumps core with the following; I also filed a bug in 
> Bugzilla for that:
>
> [root at co-db03 ulf]# fsck.ocfs2 /dev/mapper/aucp_data_bk_2_x
> fsck.ocfs2 1.8.0
> *** glibc detected *** fsck.ocfs2: double free or corruption (fasttop): 0x000000000197f320 ***
> ======= Backtrace: =========
> /lib64/libc.so.6[0x3656475366]
> fsck.ocfs2[0x434c31]
> fsck.ocfs2[0x403bc2]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x365641ecdd]
> fsck.ocfs2[0x402879]
>
> ======= Memory map: ========
> 00400000-00450000 r-xp 00000000 fc:00 12489 /sbin/fsck.ocfs2
> 0064f000-00651000 rw-p 0004f000 fc:00 12489 /sbin/fsck.ocfs2
> 00651000-00652000 rw-p 00000000 00:00 0
> 00850000-00851000 rw-p 00050000 fc:00 12489 /sbin/fsck.ocfs2
> 0197e000-0199f000 rw-p 00000000 00:00 0     [heap]
> 3655c00000-3655c20000 r-xp 00000000 fc:00 8797 /lib64/ld-2.12.so
> 3655e1f000-3655e20000 r--p 0001f000 fc:00 8797 /lib64/ld-2.12.so
> 3655e20000-3655e21000 rw-p 00020000 fc:00 8797 /lib64/ld-2.12.so
> 3655e21000-3655e22000 rw-p 00000000 00:00 0
> 3656400000-3656589000 r-xp 00000000 fc:00 8798 /lib64/libc-2.12.so
> 3656589000-3656788000 ---p 00189000 fc:00 8798 /lib64/libc-2.12.so
> 3656788000-365678c000 r--p 00188000 fc:00 8798 /lib64/libc-2.12.so
> 365678c000-365678d000 rw-p 0018c000 fc:00 8798 /lib64/libc-2.12.so
> 365678d000-3656792000 rw-p 00000000 00:00 0
> 3659c00000-3659c16000 r-xp 00000000 fc:00 8802 /lib64/libgcc_s-4.4.6-20120305.so.1
> 3659c16000-3659e15000 ---p 00016000 fc:00 8802 /lib64/libgcc_s-4.4.6-20120305.so.1
> 3659e15000-3659e16000 rw-p 00015000 fc:00 8802 /lib64/libgcc_s-4.4.6-20120305.so.1
> 3d3e800000-3d3e817000 r-xp 00000000 fc:00 12028 /lib64/libpthread-2.12.so
> 3d3e817000-3d3ea17000 ---p 00017000 fc:00 12028 /lib64/libpthread-2.12.so
> 3d3ea17000-3d3ea18000 r--p 00017000 fc:00 12028 /lib64/libpthread-2.12.so
> 3d3ea18000-3d3ea19000 rw-p 00018000 fc:00 12028 /lib64/libpthread-2.12.so
> 3d3ea19000-3d3ea1d000 rw-p 00000000 00:00 0
> 3e26600000-3e26603000 r-xp 00000000 fc:00 426 /lib64/libcom_err.so.2.1
> 3e26603000-3e26802000 ---p 00003000 fc:00 426 /lib64/libcom_err.so.2.1
> 3e26802000-3e26803000 r--p 00002000 fc:00 426 /lib64/libcom_err.so.2.1
> 3e26803000-3e26804000 rw-p 00003000 fc:00 426 /lib64/libcom_err.so.2.1
> 7fb063711000-7fb063714000 rw-p 00000000 00:00 0
> 7fb06371d000-7fb063720000 rw-p 00000000 00:00 0
> 7fffd5b95000-7fffd5bb6000 rw-p 00000000 00:00 0                          [stack]
> 7fffd5bc5000-7fffd5bc6000 r-xp 00000000 00:00 0                          [vdso]
> ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
>
> Abort (core dumped)
>
> I think one of the main questions is what "Invalid name for a 
> cluster while trying to join the group" or "Invalid name for a cluster 
> while opening device" means. I am pretty sure that /etc/sysconfig/o2cb 
> and /etc/ocfs2/cluster.conf are correct.
>
> Ulf.
>
>
> _______________________________________________
> Ocfs2-users mailing list
> Ocfs2-users at oss.oracle.com <mailto:Ocfs2-users at oss.oracle.com>
> https://oss.oracle.com/mailman/listinfo/ocfs2-users
