[Ocfs2-users] OCFS2 Caused RAC server to crash

Wed Jun 17 08:43:25 PDT 2009

Since you are using Oracle RAC with OCFS2, you can request support for 
OCFS2 via Metalink using your Oracle CSI. 

Thanks,
Herbert

Sunil Mushran wrote:
> Please file a bugzilla and _attach_ this oops trace. Also mention all 
> the version numbers.
>
>
> On Jun 17, 2009, at 2:30 AM, "McDonald, Stuart" 
> <smcdonald at uk.sopragroup.com> wrote:
>
>> Hi
>>
>> We have a two-node RAC cluster, which uses ASM for the database 
>> storage, but is using OCFS2 to mount a couple of file systems for a) 
>> the RMAN backups, and b) Oracle Data files, i.e. files read from or 
>> written to DBA Directories.
>>
>> One of the servers in the cluster crashed, and checks revealed the 
>> following error messages:
>>
>> Jun 5 17:00:27 cadbbe2 kernel: Unable to handle kernel NULL pointer 
>> dereference at virtual address 00000bc4
>> Jun 5 17:00:27 cadbbe2 kernel: printing eip:
>> Jun 5 17:00:27 cadbbe2 kernel: f9125eb2
>> Jun 5 17:00:27 cadbbe2 kernel: *pde = 09ac4001
>> Jun 5 17:00:27 cadbbe2 kernel: Oops: 0002 [#1]
>> Jun 5 17:00:27 cadbbe2 kernel: SMP
>> Jun 5 17:00:27 cadbbe2 kernel: Modules linked in: hangcheck_timer 
>> parport_pc lp parport oracleasm(U) autofs4 i2c_dev i2c_core ocfs$
>>
>> Jun 5 17:00:27 cadbbe2 kernel: CPU: 3
>> Jun 5 17:00:27 cadbbe2 kernel: EIP: 0060:[<f9125eb2>] Tainted: P VLI
>> Jun 5 17:00:27 cadbbe2 kernel: EFLAGS: 00010246 (2.6.9-42.ELsmp)
>> Jun 5 17:00:27 cadbbe2 kernel: EIP is at 
>> ocfs2_free_suballoc_bits+0x4ca/0x766 [ocfs2]
>> Jun 5 17:00:27 cadbbe2 kernel: eax: 00000000 ebx: 00000000 ecx: 
>> 0000000b edx: 00000000
>> Jun 5 17:00:27 cadbbe2 kernel: esi: 00005c28 edi: 000000ff ebp: 
>> d828b000 esp: d643cdb0
>> Jun 5 17:00:27 cadbbe2 kernel: ds: 007b es: 007b ss: 0068
>> Jun 5 17:00:27 cadbbe2 kernel: Process rm (pid: 27647, 
>> threadinfo=d643c000 task=d520f6b0)
>> Jun 5 17:00:27 cadbbe2 kernel: Stack: 0000000b 00000000 00000000 
>> 00000001 f19e9d48 f2fa80c0 f2fa8000 d721489c
>> Jun 5 17:00:27 cadbbe2 kernel: f301a728 f6e2fa40 f19e9d48 f3dd5880 
>> 0000000c 00000000 f3dd5880 f912646f
>> Jun 5 17:00:27 cadbbe2 kernel: 00005b29 0d478a00 00000000 00000100 
>> 0d47e529 f6d60c00 d721489c f301a728
>> Jun 5 17:00:27 cadbbe2 kernel: Call Trace:
>> Jun 5 17:00:27 cadbbe2 kernel: [<f912646f>] 
>> ocfs2_free_clusters+0x2ad/0x37a [ocfs2]
>> Jun 5 17:00:27 cadbbe2 kernel: [<f90f872d>] 
>> ocfs2_replay_truncate_records+0x2e4/0x3c7 [ocfs2]
>> Jun 5 17:00:27 cadbbe2 kernel: [<f90f8b80>] 
>> __ocfs2_flush_truncate_log+0x370/0x45b [ocfs2]
>> Jun 5 17:00:27 cadbbe2 kernel: [<f90fad8d>] 
>> ocfs2_commit_truncate+0x508/0x8a3 [ocfs2]
>> Jun 5 17:00:27 cadbbe2 kernel: [<f910f347>] 
>> ocfs2_truncate_for_delete+0x255/0x312 [ocfs2]
>> Jun 5 17:00:27 cadbbe2 kernel: [<f910fb54>] 
>> ocfs2_wipe_inode+0x1aa/0x2ee [ocfs2]
>> Jun 5 17:00:27 cadbbe2 kernel: [<f9110555>] 
>> ocfs2_delete_inode+0x2d6/0x450 [ocfs2]
>> Jun 5 17:00:27 cadbbe2 kernel: [<f911027f>] 
>> ocfs2_delete_inode+0x0/0x450 [ocfs2]
>> Jun 5 17:00:27 cadbbe2 kernel: [<c0171954>] 
>> generic_delete_inode+0xa2/0x104
>> Jun 5 17:00:27 cadbbe2 kernel: [<f9111407>] 
>> ocfs2_drop_inode+0xe6/0x12a [ocfs2]
>> Jun 5 17:00:27 cadbbe2 kernel: [<c0171b30>] iput+0x5f/0x61
>> Jun 5 17:00:27 cadbbe2 kernel: [<c0168e46>] sys_unlink+0xd7/0x132
>> Jun 5 17:00:27 cadbbe2 kernel: [<c015bcbc>] fget+0x3b/0x42
>> Jun 5 17:00:27 cadbbe2 kernel: [<c016add6>] sys_ioctl+0x227/0x269
>> Jun 5 17:00:27 cadbbe2 kernel: [<c016ae0c>] sys_ioctl+0x25d/0x269
>> Jun 5 17:00:27 cadbbe2 kernel: [<c02d4703>] syscall_call+0x7/0xb
>> Jun 5 17:00:27 cadbbe2 kernel: Code: 00 8b 44 24 20 89 14 24 89 4c 24 
>> 04 8b 88 d8 fd ff ff 8b 98 dc fd ff ff 8b 54 24 04 8b 04 24 $
>>
>> Jun 5 17:00:27 cadbbe2 kernel: <0>Fatal exception: panic in 5 seconds
>>
>> The server was manually rebooted, and the filesystems mounted 
>> automatically. No issues have been experienced since this date.
>>
>> The only thing that would have been running at this time was an RMAN 
>> deletion of old backups from disk.
>>
>> Has anybody seen this issue before, or is there any advice on how to 
>> troubleshoot this?
>>
>> This is a production system, so I am limited in the tests that I can 
>> actually carry out.
>>
>> Thanks in advance
>> Stuart
>>
>>
>> _______________________________________________
>> Ocfs2-users mailing list
>> Ocfs2-users at oss.oracle.com
>> http://oss.oracle.com/mailman/listinfo/ocfs2-users
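
For the bug report requested in this thread (the oops trace plus version numbers), a minimal Python sketch along the following lines might help pull the key facts out of a mail-wrapped syslog excerpt. It assumes the 2.6-era i386 oops layout quoted above; the function name summarize_oops and the regular expressions are illustrative assumptions, not part of any OCFS2 or kernel tooling.

```python
import re

# One call-trace frame, e.g. "[<f912646f>] ocfs2_free_clusters+0x2ad/0x37a [ocfs2]".
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)
# The faulting instruction, e.g. "EIP is at ocfs2_free_suballoc_bits+0x4ca/0x766 [ocfs2]".
EIP_RE = re.compile(
    r"EIP is at\s+(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)
# The kernel release is printed in parentheses on the EFLAGS line.
KERNEL_RE = re.compile(r"EFLAGS:\s+[0-9a-f]+\s+\((?P<kernel>[^)]+)\)")
# Leading mail quote markers such as ">> ".
QUOTE_RE = re.compile(r"^\s*(?:>\s?)+")


def summarize_oops(text):
    """Return kernel release, faulting function and call-trace frames.

    Hypothetical helper: it strips mail quote markers and rejoins
    wrapped lines before matching, so it works on text pasted from a
    mail archive as well as on raw syslog output.
    """
    lines = [QUOTE_RE.sub("", line) for line in text.splitlines()]
    # Collapse all whitespace so frames split across lines match again.
    flat = " ".join(" ".join(lines).split())

    kernel = KERNEL_RE.search(flat)
    eip = EIP_RE.search(flat)
    return {
        "kernel": kernel.group("kernel") if kernel else None,
        "faulting": (eip.group("func"), eip.group("module")) if eip else None,
        "frames": [
            (m.group("func"), m.group("module"))
            for m in FRAME_RE.finditer(flat)
        ],
    }
```

Calling summarize_oops() on the saved log text returns a dictionary whose "frames" list preserves the order of the call trace, with None as the module for core kernel functions. The OCFS2 and tools package versions still have to be collected separately on the affected node.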