[Ocfs2-users] Problems with volumes coming from RHEL5 going to OEL6 (slighly OT)
Srinivas Eeda
srinivas.eeda at oracle.com
Wed Jul 10 10:55:35 PDT 2013
On 07/10/2013 10:24 AM, Ulf Zimmermann wrote:
>
> I will see what I can do. How large would a o2image be?
>
o2image only captures the ocfs2 metadata, so the image should be small:
o2image -r <dev> - | gzip > <dev>.img.gz
>
> To just reiterate, these are not new file systems. They were created
> with ocfs2-2.6.9-55.ELsmp-1.2.9-1.el4 and ocfs2-tools-1.2.7-1.el4
> under RHEL 4. The primary user of these volumes is a cluster of
> 6 nodes running RHEL 5.8 with ocfs2-2.6.18-308.11.1.el5-1.4.10-1 and
> ocfs2-tools-1.6.3-2.el5. Another machine, which still runs the same
> EL4 binaries, is mounting these snap cloned volumes daily, doing
> operations on the DB files and then copying the data off.
>
> *From:*Herbert van den Bergh [mailto:herbert.van.den.bergh at oracle.com]
> *Sent:* Wednesday, July 10, 2013 09:54
> *To:* Mihail Daskalov
> *Cc:* Sunil Mushran; Ulf Zimmermann; ocfs2-users at oss.oracle.com
> *Subject:* Re: [Ocfs2-users] Problems with volumes coming from RHEL5
> going to OEL6 (slighly OT)
>
> It's possible that the 1.8.0 tag was never created in the ocfs2-tools
> git repository. But it's not of any use anyway. If you check the
> changelog of the ocfs2-tools rpm, you'll see that there were many
> patches since 1.8.0, so the 1.8.0-10 version that Ulf is using would
> be very different from a 1.8.0 tag in git.
>
> Ulf, I suggest you create an o2image of the "bad" filesystem, and see
> if the problem can be reproduced with that image. If it can, then you
> may want to make that o2image available to the OCFS2 developers so
> they can debug ocfs2-tools to see what is causing the malloc/free
> error. You may also want to include the exact steps to take to
> reproduce this, starting from the mkfs up to the failure, indicating
> exactly what versions of kernel and tools were used along the way.
>
> Thanks,
> Herbert.
>
> On 7/10/13 7:55 AM, Mihail Daskalov wrote:
>
> Hi Sunil,
>
> Regarding ocfs2-tools version 1.8.0, you should know best what
> it was meant to be (maybe not true for 1.8.0-10 in OEL6U3).
>
> Is it possible that the tag for 1.8.0 disappeared from the git
> repository? Or there was never a tag for 1.8.0 ?
>
> Below is the link to the commit in the 1.8.2 tag that brings the
> version to 1.8.0:
>
> https://oss.oracle.com/git/?p=ocfs2-tools.git;a=commitdiff;h=2480a215a600050d2bf923044dffac91439d982a;hp=8b5f4ad727e019cb557c4b516ab401c15c5c317e
>
> and later on another commit that brings the version to 1.8.2:
>
> https://oss.oracle.com/git/?p=ocfs2-tools.git;a=commitdiff;h=560a1e60936fe868b00cfc9cad5def726e10828e
>
> I am sorry I am not actually helping with Ulf's problem.
>
> Ulf, maybe you could follow the head version and try to find
> an explanation of the error message there.
>
> Anyway, I think it would be best to open an SR with Oracle if you
> have a Linux support contract.
>
> Does anyone know how to find the git repository, at least for
> some packages in Oracle Linux? I know the source for each package
> is available as .src.rpm, but how could I see the changes, or the
> tag from which each version was built?
>
> I remember Wim talking about something like that a while ago (saying
> Oracle is not like Red Hat, mangling changelogs), but I can't find
> the article right now.
>
> If you find out what is behind ocfs2-tools 1.8.0-10, it would be
> easier to track down the problem.
>
> Regards,
>
> Mihail Daskalov
>
> *From:*ocfs2-users-bounces at oss.oracle.com
> [mailto:ocfs2-users-bounces at oss.oracle.com] *On Behalf Of *Sunil
> Mushran
> *Sent:* Wednesday, July 10, 2013 2:11 AM
> *To:* Ulf Zimmermann
> *Cc:* ocfs2-users at oss.oracle.com
> *Subject:* Re: [Ocfs2-users] Problems with volumes coming from
> RHEL5 going to OEL6
>
> The error does not make sense. Also I don't know what 1.8.0 tools
> means. I cannot see that label in the src tree.
> https://oss.oracle.com/git/?p=ocfs2-tools.git;a=summary
>
> One option is to build the tools from the head.
>
> On Tue, Jul 9, 2013 at 2:25 PM, Ulf Zimmermann <ulf at openlane.com>
> wrote:
>
> Sunil, any suggestions on this?
>
> *From:*ocfs2-users-bounces at oss.oracle.com
> [mailto:ocfs2-users-bounces at oss.oracle.com] *On Behalf Of *Ulf
> Zimmermann
> *Sent:* Saturday, June 22, 2013 15:20
> *To:* Sunil Mushran
> *Cc:* ocfs2-users at oss.oracle.com
> *Subject:* Re: [Ocfs2-users] Problems with volumes coming from
> RHEL5 going to OEL6
>
> [root@co-db03 ulf]# debugfs.ocfs2 -R "stats" /dev/mapper/aucp_data_bk_2_x
> Revision: 0.90
> Mount Count: 0 Max Mount Count: 20
> State: 0 Errors: 0
> Check Interval: 0 Last Check: Sun Sep 25 05:32:29 2011
> Creator OS: 0
> Feature Compat: 0
> Feature Incompat: 0
> Tunefs Incomplete: 0
> Feature RO compat: 0
> Root Blknum: 513 System Dir Blknum: 514
> First Cluster Group Blknum: 256
> Block Size Bits: 12 Cluster Size Bits: 20
> Max Node Slots: 10
> Extended Attributes Inline Size: 0
> Label: /export/backuprecovery.AUCP
> UUID: 5F9C2727159743529200CE9C5E155562
> Hash: 0 (0x0)
> DX Seeds: 0 0 0 (0x00000000 0x00000000 0x00000000)
> Cluster stack: classic o2cb
> Cluster flags: 0
> Inode: 2 Mode: 00 Generation: 3147295185 (0xbb97e9d1)
> FS Generation: 3147295185 (0xbb97e9d1)
> CRC32: 00000000 ECC: 0000
> Type: Unknown Attr: 0x0 Flags: Valid System Superblock
> Dynamic Features: (0x0)
> User: 0 (root) Group: 0 (root) Size: 0
> Links: 0 Clusters: 1572864
> ctime: 0x4e7f1f5d 0x0 -- Sun Sep 25 05:32:29.0 2011
> atime: 0x0 0x0 -- Wed Dec 31 16:00:00.0 1969
> mtime: 0x4e7f1f5d 0x0 -- Sun Sep 25 05:32:29.0 2011
> dtime: 0x0 -- Wed Dec 31 16:00:00 1969
> Refcount Block: 0
> Last Extblk: 0 Orphan Slot: 0
> Sub Alloc Slot: Global Sub Alloc Bit: 65535
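
As a cross-check of the superblock dump above, the geometry and timestamp fields can be decoded by hand. What follows is only a minimal Python sketch, not anything shipped with ocfs2-tools: the function names are made up for illustration, and it assumes that "Clusters" counts units of 2^(Cluster Size Bits) bytes and that the first ctime word is seconds since the Unix epoch.

```python
from datetime import datetime, timezone

# Values copied from the "stats" output above.
BLOCK_SIZE_BITS = 12
CLUSTER_SIZE_BITS = 20
CLUSTERS = 1572864
CTIME_HEX = "0x4e7f1f5d"


def size_from_bits(bits):
    """Return the size in bytes encoded as a power-of-two exponent."""
    return 1 << bits


def volume_bytes(clusters, cluster_size_bits):
    """Total bytes covered by `clusters` clusters (assumed interpretation)."""
    return clusters * size_from_bits(cluster_size_bits)


def decode_epoch(hex_seconds):
    """Interpret a hex string as seconds since the Unix epoch, in UTC."""
    return datetime.fromtimestamp(int(hex_seconds, 16), tz=timezone.utc)


if __name__ == "__main__":
    print("block size:  ", size_from_bits(BLOCK_SIZE_BITS), "bytes")
    print("cluster size:", size_from_bits(CLUSTER_SIZE_BITS), "bytes")
    print("volume size: ", volume_bytes(CLUSTERS, CLUSTER_SIZE_BITS) / 2**40, "TiB")
    print("ctime (UTC): ", decode_epoch(CTIME_HEX).isoformat())
```

Under those assumptions the block size is 4096 bytes, the cluster size is 1 MiB, and the volume covers 1.5 TiB. The ctime decodes to 12:32:29 UTC on 25 Sep 2011, which agrees with the 05:32:29 shown in the dump if that machine was on US Pacific time (also an assumption).
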
>
> *From:*Sunil Mushran [mailto:sunil.mushran at gmail.com]
> *Sent:* Friday, June 21, 2013 11:11
> *To:* Ulf Zimmermann
> *Cc:* ocfs2-users at oss.oracle.com
> *Subject:* Re: [Ocfs2-users] Problems with volumes coming from
> RHEL5 going to OEL6
>
> Can you dump the following using the 1.8 binary?
> debugfs.ocfs2 -R "stats" /dev/mapper/.....
>
> On Fri, Jun 21, 2013 at 6:17 AM, Ulf Zimmermann <ulf at openlane.com>
> wrote:
>
> We have a production cluster of 6 nodes, currently running RHEL 5.8
> with OCFS2 1.4.10. We snapclone these volumes to multiple
> destinations; one of them is a RHEL4 machine with OCFS2 1.2.9.
> Because of that, the volumes are set up so that we can read them
> there.
>
> We are now trying to bring up a new server. It has OEL 6.3 on it,
> which comes with OCFS2 1.8.0 and tools 1.8.0-10. I can use
> tunefs.ocfs2 --cloned-volume to reset the UUID, but when I try to
> change the label I get:
>
> [root@co-db03 ulf]# tunefs.ocfs2 -L /export/backuprecovery.AUCP /dev/mapper/aucp_data_bk_2_x
>
> tunefs.ocfs2: Invalid name for a cluster while opening device "/dev/mapper/aucp_data_bk_2_x"
>
> fsck.ocfs2 core dumps with the following; I also filed a bug in
> Bugzilla for that:
>
> [root@co-db03 ulf]# fsck.ocfs2 /dev/mapper/aucp_data_bk_2_x
> fsck.ocfs2 1.8.0
> *** glibc detected *** fsck.ocfs2: double free or corruption (fasttop): 0x000000000197f320 ***
> ======= Backtrace: =========
> /lib64/libc.so.6[0x3656475366]
> fsck.ocfs2[0x434c31]
> fsck.ocfs2[0x403bc2]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x365641ecdd]
> fsck.ocfs2[0x402879]
> ======= Memory map: ========
> 00400000-00450000 r-xp 00000000 fc:00 12489 /sbin/fsck.ocfs2
> 0064f000-00651000 rw-p 0004f000 fc:00 12489 /sbin/fsck.ocfs2
> 00651000-00652000 rw-p 00000000 00:00 0
> 00850000-00851000 rw-p 00050000 fc:00 12489 /sbin/fsck.ocfs2
> 0197e000-0199f000 rw-p 00000000 00:00 0 [heap]
> 3655c00000-3655c20000 r-xp 00000000 fc:00 8797 /lib64/ld-2.12.so
> 3655e1f000-3655e20000 r--p 0001f000 fc:00 8797 /lib64/ld-2.12.so
> 3655e20000-3655e21000 rw-p 00020000 fc:00 8797 /lib64/ld-2.12.so
> 3655e21000-3655e22000 rw-p 00000000 00:00 0
> 3656400000-3656589000 r-xp 00000000 fc:00 8798 /lib64/libc-2.12.so
> 3656589000-3656788000 ---p 00189000 fc:00 8798 /lib64/libc-2.12.so
> 3656788000-365678c000 r--p 00188000 fc:00 8798 /lib64/libc-2.12.so
> 365678c000-365678d000 rw-p 0018c000 fc:00 8798 /lib64/libc-2.12.so
> 365678d000-3656792000 rw-p 00000000 00:00 0
> 3659c00000-3659c16000 r-xp 00000000 fc:00 8802 /lib64/libgcc_s-4.4.6-20120305.so.1
> 3659c16000-3659e15000 ---p 00016000 fc:00 8802 /lib64/libgcc_s-4.4.6-20120305.so.1
> 3659e15000-3659e16000 rw-p 00015000 fc:00 8802 /lib64/libgcc_s-4.4.6-20120305.so.1
> 3d3e800000-3d3e817000 r-xp 00000000 fc:00 12028 /lib64/libpthread-2.12.so
> 3d3e817000-3d3ea17000 ---p 00017000 fc:00 12028 /lib64/libpthread-2.12.so
> 3d3ea17000-3d3ea18000 r--p 00017000 fc:00 12028 /lib64/libpthread-2.12.so
> 3d3ea18000-3d3ea19000 rw-p 00018000 fc:00 12028 /lib64/libpthread-2.12.so
> 3d3ea19000-3d3ea1d000 rw-p 00000000 00:00 0
> 3e26600000-3e26603000 r-xp 00000000 fc:00 426 /lib64/libcom_err.so.2.1
> 3e26603000-3e26802000 ---p 00003000 fc:00 426 /lib64/libcom_err.so.2.1
> 3e26802000-3e26803000 r--p 00002000 fc:00 426 /lib64/libcom_err.so.2.1
> 3e26803000-3e26804000 rw-p 00003000 fc:00 426 /lib64/libcom_err.so.2.1
> 7fb063711000-7fb063714000 rw-p 00000000 00:00 0
> 7fb06371d000-7fb063720000 rw-p 00000000 00:00 0
> 7fffd5b95000-7fffd5bb6000 rw-p 00000000 00:00 0 [stack]
> 7fffd5bc5000-7fffd5bc6000 r-xp 00000000 00:00 0 [vdso]
> ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
> Abort (core dumped)
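
One way to read the glibc report above is to match each address against the memory map and see which mapping it falls in. The following is a rough Python sketch for illustration only; `parse_maps` and `locate` are hypothetical helper names, and only three of the map lines are copied in.

```python
# A few lines copied from the "Memory map" section above.
MAPS = """\
00400000-00450000 r-xp 00000000 fc:00 12489 /sbin/fsck.ocfs2
0197e000-0199f000 rw-p 00000000 00:00 0 [heap]
3656400000-3656589000 r-xp 00000000 fc:00 8798 /lib64/libc-2.12.so
"""


def parse_maps(text):
    """Parse /proc/<pid>/maps style lines into (start, end, perms, name)."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # not a map line
        start, end = (int(x, 16) for x in fields[0].split("-"))
        name = fields[5] if len(fields) > 5 else "[anonymous]"
        regions.append((start, end, fields[1], name))
    return regions


def locate(addr, regions):
    """Return (name, offset into region) for the region holding addr, or None."""
    for start, end, _perms, name in regions:
        if start <= addr < end:
            return name, addr - start
    return None


if __name__ == "__main__":
    regions = parse_maps(MAPS)
    # The pointer glibc complained about, then the first two backtrace frames.
    for addr in (0x197F320, 0x3656475366, 0x434C31):
        name, offset = locate(addr, regions)
        print(f"{addr:#x} -> {name} + {offset:#x}")
```

With the lines above, the pointer in the "double free or corruption" message falls inside the [heap] mapping, the first frame is in libc, and the next frame is in the text segment of /sbin/fsck.ocfs2. The backtrace prints no symbol names for the fsck.ocfs2 frames, so turning those offsets into function names would need a matching build with debug symbols, which is not part of this thread.
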
>
> I think one of the main questions is what "Invalid name for a
> cluster while trying to join the group" or "Invalid name for a
> cluster while opening device" means. I am pretty sure that
> /etc/sysconfig/o2cb and /etc/ocfs2/cluster.conf are correct.
>
> Ulf.
>
>
> _______________________________________________
> Ocfs2-users mailing list
> Ocfs2-users at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-users