[Ocfs-users] Server crashed with Common/ocfsgencreate.c, Common/ocfsgenvote.c

Sunil Mushran Sunil.Mushran at oracle.com
Wed Dec 21 11:34:22 CST 2005


Yes, this issue has been resolved.

1.0.13 does not require a kernel upgrade. 1.0.14 does.
The only difference between the two is that aio is supported
in the latter. So go with 1.0.13. But yes, you will need to
shut down the db, as a rolling upgrade is not recommended
from 1.0.9-x to 1.0.10+.
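
As a rough illustration of the rule above, the check for whether an upgrade crosses the 1.0.9-x to 1.0.10+ boundary (and so needs the db shut down rather than a rolling upgrade) could be sketched as follows. The function names are hypothetical and not part of any ocfs tool, and treating releases older than 1.0.9 the same way as 1.0.9-x is an assumption.

```python
def parse_version(version):
    """Turn a string such as '1.0.9-9' into (1, 0, 9).

    The package release after the '-' is ignored for this comparison.
    """
    base = version.split("-", 1)[0]
    return tuple(int(part) for part in base.split("."))


def crosses_rolling_upgrade_boundary(current, target):
    """Return True when the upgrade goes from before 1.0.10 to 1.0.10 or later.

    Assumption: anything older than 1.0.10 is handled like 1.0.9-x.
    """
    boundary = (1, 0, 10)
    return parse_version(current) < boundary <= parse_version(target)


if __name__ == "__main__":
    # The case discussed in this thread: 1.0.9-9 to 1.0.13.
    print(crosses_rolling_upgrade_boundary("1.0.9-9", "1.0.13"))
```

Comparing tuples of integers rather than raw strings avoids ordering "1.0.9" after "1.0.10".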

Ivan Wong wrote:

>Hi Sunil,
>
>Thanks for responding.
>
>We would really love to upgrade the OCFS version. However, to get to
>1.0.13/14, the kernel version will have to be upgraded as well. We are
>running a 7x24 system, so considering the downtime and risk, we will
>upgrade only as a last resort.
>
>So far we could still live with this version... until this happened. We
>want to confirm that it is an OCFS bug before proceeding with the upgrade.
>Will the upgrade (with its risk and downtime) definitely fix the problem?
>
>Thanks / regards,
> 
>Ivan Wong
>Database Administrator
> 
>e2Open Inc. (www.e2open.com) 
>Suite 34.03, Level 34, 
>Menara Citibank,
>156, Jalan Ampang,
>50450 Kuala Lumpur, Malaysia
>DID: +603 2776 6397 
>Tel: +603 2776 6300 
>Fax: +603 2712 9112
>
>
>-----Original Message-----
>From: Sunil Mushran [mailto:Sunil.Mushran at oracle.com] 
>Sent: Wednesday, December 21, 2005 1:15 AM
>To: Ivan Wong
>Cc: ocfs-users at oss.oracle.com
>Subject: Re: [Ocfs-users] Server crashed with Common/ocfsgencreate.c,
>Common/ocfsgenvote.c
>
>
>1.0.9-9? You are running a very, very old version of ocfs.
>Please upgrade to at least 1.0.13, if not 1.0.14. The README
>has the list of bugs fixed.
>
>The disk format has not changed, which means you can just
>upgrade the rpm and start. However, you will have to umount the
>volumes on all nodes before installing. The README lists this
>requirement when upgrading from 1.0.9-x to 1.0.10 or later.
>
>Ivan Wong wrote:
>
>  
>
>>Sorry for the hard-to-read email. This week we had another crash;
>>the errors are below:
>>
>>Dec 18 14:16:26 x335-149 kernel: (5694) ERROR: status = -17,
>>Common/ocfsgencreate.c, 1671
>>Dec 18 14:16:26 x335-149 kernel: (5694) ERROR: status = -17,
>>Common/ocfsgencreate.c, 1827
>>Dec 18 14:16:26 x335-149 kernel: (5694) ERROR: status = -17,
>>Linux/ocfsmain.c, 2090
>>Dec 18 14:16:26 x335-149 kernel: (5694) ERROR: status = -17,
>>Linux/ocfsmain.c, 2409
>>Dec 18 14:16:28 x335-149 kernel: (5695) ERROR: status = -2,
>>Common/ocfsgendlm.c, 986
>>Dec 18 14:16:28 x335-149 kernel: (5695) ERROR: status = -2,
>>Common/ocfsgendlm.c, 1163
>>Dec 18 14:16:28 x335-149 kernel: (5695) ERROR: status = -2,
>>Common/ocfsgencreate.c, 479
>>Dec 18 14:16:28 x335-149 kernel: (5695) ERROR: status = -2,
>>Linux/ocfsmain.c, 2030
>>Dec 18 16:33:10 x335-149 kernel: (10) ERROR: lockres=null,
>>Linux/ocfsmain.c, 3541
>>Dec 18 18:20:38 x335-149 kernel: (10) ERROR: lockres=null,
>>Linux/ocfsmain.c, 3541
>>Dec 19 00:02:30 x335-149 last message repeated 3 times
>>Dec 19 00:26:24 x335-149 sshd(pam_unix)[17895]: session opened for user
>>oracle by (uid=0)
>>Dec 19 00:26:49 x335-149 sshd(pam_unix)[17895]: session closed for user
>>oracle
>>Dec 19 04:02:05 x335-149 syslogd 1.4.1: restart.
>>Dec 19 05:45:03 x335-149 kernel: (26188) ERROR: status = -17,
>>Common/ocfsgendirnode.c, 1507
>>Dec 19 05:45:03 x335-149 kernel: (26188) ERROR: status = -17,
>>Common/ocfsgencreate.c, 1671
>>Dec 19 05:45:03 x335-149 kernel: (26188) ERROR: status = -17,
>>Common/ocfsgencreate.c, 1827
>>Dec 19 05:45:03 x335-149 kernel: (26188) ERROR: status = -17,
>>Linux/ocfsmain.c, 2090
>>Dec 19 05:45:03 x335-149 kernel: (26188) ERROR: status = -17,
>>Linux/ocfsmain.c, 2409
>>Dec 19 06:48:08 x335-149 kernel: (10) ERROR: lockres=null,
>>Linux/ocfsmain.c, 3541
>>Dec 19 07:15:06 x335-149 kernel: Unable to handle kernel NULL pointer
>>dereference<3>(3) ERROR: oin has no matching inode!!!!,
>>Common/ocfsgencreate.c, 81
>>Dec 19 09:00:03 x335-149 syslogd 1.4.1: restart.
>>Dec 19 09:00:03 x335-149 syslog: syslogd startup succeeded
>>Dec 19 09:00:04 x335-149 kernel: klogd 1.4.1, log source = /proc/kmsg
>>started.
>>Dec 19 09:00:04 x335-149 syslog: klogd startup succeeded
>>
>>
>>Note the part that says "oin has no matching inode", which looks
>>specific to OCFS. Wim? Sunil? Could anyone please advise?
>>
>>
>>Thanks / regards,
>>
>>Ivan Wong
>>Database Administrator
>>
>>e2Open Inc. (www.e2open.com)
>>Suite 34.03, Level 34, 
>>Menara Citibank,
>>156, Jalan Ampang,
>>50450 Kuala Lumpur, Malaysia
>>DID: +603 2776 6397 
>>Tel: +603 2776 6300 
>>Fax: +603 2712 9112
>>
>>
>>-----Original Message-----
>>From: ocfs-users-bounces at oss.oracle.com 
>>[mailto:ocfs-users-bounces at oss.oracle.com] On Behalf Of Ivan Wong
>>Sent: Tuesday, December 13, 2005 3:56 PM
>>To: ocfs-users at oss.oracle.com
>>Subject: [Ocfs-users] Server crashed with 
>>Common/ocfsgencreate.c,Common/ocfsgenvote.c
>>
>>
>>Hi Experts,
>>
>>We have a 4-node RAC running, and recently one node went down due to a
>>hardware (fibre optics card) failure. Since we started running as a
>>3-node RAC, the surviving server just keeps crashing. We cannot figure
>>out why this is happening, but checking /var/log/messages we see these
>>errors (notice the messages before the crash at 8:32):
>>
>>Dec 12 08:30:45 x335-142 kernel: (2) ERROR: file entry name did not match inode,
>>Common/ocfsgencreate.c, 97
>>Dec 12 08:30:45 x335-142 kernel: (2) ERROR: status = -2,
>>Common/ocfsgenvote.c, 121
>>Dec 12 08:32:28 x335-142 kernel: (2) ERROR: file entry name did not match inode,
>>Common/ocfsgencreate.c, 97
>>Dec 12 08:32:28 x335-142 kernel: (2) ERROR: status = -2,
>>Common/ocfsgenvote.c, 121
>>Dec 12 08:46:35 x335-142 sshd(pam_unix)[9468]: session opened for user
>>oracle by (uid=0)
>>Dec 12 08:55:30 x335-142 xinetd[11044]: warning: can't get client
>>address: Connection reset by peer
>>Dec 12 09:59:11 x335-142 sshd(pam_unix)[9468]: session closed for user
>>oracle
>>Dec 12 15:15:48 x335-142 kernel: (4) ERROR: file entry name did not match inode,
>>Common/ocfsgencreate.c, 97
>>Dec 12 15:15:48 x335-142 kernel: (4) ERROR: status = -2,
>>Common/ocfsgenvote.c, 121
>>Dec 12 16:16:10 x335-142 kernel: (3) ERROR: file entry name did not match inode,
>>Common/ocfsgencreate.c, 97
>>Dec 12 16:16:10 x335-142 kernel: (3) ERROR: status = -2,
>>Common/ocfsgenvote.c, 121
>>Dec 12 16:16:20 x335-142 kernel: (3) ERROR: file entry name did not match inode,
>>Common/ocfsgencreate.c, 97
>>Dec 12 16:16:20 x335-142 kernel: (3) ERROR: status = -2,
>>Common/ocfsgenvote.c, 121
>>Dec 12 16:16:30 x335-142 kernel: (3) ERROR: file entry name did not match inode,
>>Common/ocfsgencreate.c, 97
>>Dec 12 16:16:30 x335-142 kernel: (3) ERROR: status = -2,
>>Common/ocfsgenvote.c, 121
>>Dec 12 16:16:35 x335-142 kernel: (3) ERROR: file entry name did not match inode,
>>Common/ocfsgencreate.c, 97
>>
>>
>>Power cycling the box allows us to continue, start the db, etc. But
>>this is the 4th time in two weeks. Since the only errors found are from
>>ocfs, I am wondering if anyone has seen this, or whether it is OCFS related.
>>
>>Our environment is:
>>
>>x335-142:slr142:/e2open/home/oracle: 1000>rpm -qa | grep ocfs
>>ocfs-2.4.9-e-smp-1.0.9-9
>>ocfs-support-1.0.9-9
>>ocfs-tools-1.0.9-9
>>x335-142:slr142:/e2open/home/oracle: 1001>uname -a
>>Linux x335-142 2.4.9-e.25smp #1 SMP Fri Jun 6 18:11:40 EDT 2003 i686 unknown
>>
>>Appreciate any feedback.
>>
>>
>>
>>Thanks / regards,
>>
>>Ivan Wong
>>Database Administrator
>>
>>e2Open Inc. (www.e2open.com)
>>Suite 34.03, Level 34, Menara Citibank
>>156, Jalan Ampang,
>>50450 Kuala Lumpur, Malaysia
>>DID: +603 2776 6397 Tel: +603 2776 6300 Fax: +603 2712 9112
>>_______________________________________________
>>Ocfs-users mailing list
>>Ocfs-users at oss.oracle.com
>>http://oss.oracle.com/mailman/listinfo/ocfs-users
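
For anyone triaging similar messages, the kernel entries quoted in this thread can be tallied by source file, line, and status code. The sketch below is only an illustration: the function names are made up, and reading the status values as negated Linux errno codes (-2 as ENOENT, -17 as EEXIST) is an assumption that nothing in this thread confirms.

```python
import errno
import re
from collections import Counter

# Matches entries such as "ERROR: status = -17, Common/ocfsgencreate.c, 1671".
ENTRY = re.compile(r"ERROR: status = (-?\d+),\s*(\S+\.c),\s*(\d+)")


def summarize_ocfs_errors(text):
    """Count 'ERROR: status = N' entries per (source file, line, status).

    Leading '>' quote markers are stripped and whitespace is collapsed so
    that entries wrapped across lines by a mail client still match.
    """
    lines = [re.sub(r"^[>\s]+", "", line) for line in text.splitlines()]
    flat = " ".join(" ".join(lines).split())
    counts = Counter()
    for status, source, line_no in ENTRY.findall(flat):
        counts[(source, int(line_no), int(status))] += 1
    return counts


def errno_name(status):
    """Map a status to an errno name, ASSUMING it is a negated errno code."""
    return errno.errorcode.get(abs(status), "unknown")


if __name__ == "__main__":
    sample = (
        ">>Dec 18 14:16:26 x335-149 kernel: (5694) ERROR: status = -17,\n"
        ">>Linux/ocfsmain.c, 2090\n"
        ">>Dec 18 14:16:28 x335-149 kernel: (5695) ERROR: status = -2,\n"
        ">>Common/ocfsgendlm.c, 986\n"
    )
    for (source, line_no, status), n in summarize_ocfs_errors(sample).items():
        print(source, line_no, status, errno_name(status), n)
```

Messages without a status field, such as "lockres=null" or "oin has no matching inode", are deliberately skipped by this pattern and would need their own handling.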

