[Ocfs2-users] OCFS2 Caused RAC server to crash

McDonald, Stuart smcdonald at uk.sopragroup.com
Wed Jun 17 02:30:52 PDT 2009


Hi

We have a two-node RAC cluster that uses ASM for the database storage
and OCFS2 for a couple of file systems holding a) the RMAN backups,
and b) Oracle data files, i.e. files read from or written to
DBA directories.

One of the servers in the cluster crashed, and checks revealed the
following error messages:

Jun 5 17:00:27 cadbbe2 kernel: Unable to handle kernel NULL pointer
dereference at virtual address 00000bc4
Jun 5 17:00:27 cadbbe2 kernel: printing eip:
Jun 5 17:00:27 cadbbe2 kernel: f9125eb2
Jun 5 17:00:27 cadbbe2 kernel: *pde = 09ac4001
Jun 5 17:00:27 cadbbe2 kernel: Oops: 0002 [#1]
Jun 5 17:00:27 cadbbe2 kernel: SMP
Jun 5 17:00:27 cadbbe2 kernel: Modules linked in: hangcheck_timer
parport_pc lp parport oracleasm(U) autofs4 i2c_dev i2c_core ocfs$
Jun 5 17:00:27 cadbbe2 kernel: CPU: 3
Jun 5 17:00:27 cadbbe2 kernel: EIP: 0060:[<f9125eb2>] Tainted: P VLI
Jun 5 17:00:27 cadbbe2 kernel: EFLAGS: 00010246 (2.6.9-42.ELsmp)
Jun 5 17:00:27 cadbbe2 kernel: EIP is at
ocfs2_free_suballoc_bits+0x4ca/0x766 [ocfs2]
Jun 5 17:00:27 cadbbe2 kernel: eax: 00000000 ebx: 00000000 ecx: 0000000b
edx: 00000000
Jun 5 17:00:27 cadbbe2 kernel: esi: 00005c28 edi: 000000ff ebp: d828b000
esp: d643cdb0
Jun 5 17:00:27 cadbbe2 kernel: ds: 007b es: 007b ss: 0068
Jun 5 17:00:27 cadbbe2 kernel: Process rm (pid: 27647,
threadinfo=d643c000 task=d520f6b0)
Jun 5 17:00:27 cadbbe2 kernel: Stack: 0000000b 00000000 00000000
00000001 f19e9d48 f2fa80c0 f2fa8000 d721489c
Jun 5 17:00:27 cadbbe2 kernel: f301a728 f6e2fa40 f19e9d48 f3dd5880
0000000c 00000000 f3dd5880 f912646f
Jun 5 17:00:27 cadbbe2 kernel: 00005b29 0d478a00 00000000 00000100
0d47e529 f6d60c00 d721489c f301a728
Jun 5 17:00:27 cadbbe2 kernel: Call Trace:
Jun 5 17:00:27 cadbbe2 kernel: [<f912646f>]
ocfs2_free_clusters+0x2ad/0x37a [ocfs2]
Jun 5 17:00:27 cadbbe2 kernel: [<f90f872d>]
ocfs2_replay_truncate_records+0x2e4/0x3c7 [ocfs2]
Jun 5 17:00:27 cadbbe2 kernel: [<f90f8b80>]
__ocfs2_flush_truncate_log+0x370/0x45b [ocfs2]
Jun 5 17:00:27 cadbbe2 kernel: [<f90fad8d>]
ocfs2_commit_truncate+0x508/0x8a3 [ocfs2]
Jun 5 17:00:27 cadbbe2 kernel: [<f910f347>]
ocfs2_truncate_for_delete+0x255/0x312 [ocfs2]
Jun 5 17:00:27 cadbbe2 kernel: [<f910fb54>] ocfs2_wipe_inode+0x1aa/0x2ee
[ocfs2]
Jun 5 17:00:27 cadbbe2 kernel: [<f9110555>]
ocfs2_delete_inode+0x2d6/0x450 [ocfs2]
Jun 5 17:00:27 cadbbe2 kernel: [<f911027f>] ocfs2_delete_inode+0x0/0x450
[ocfs2]
Jun 5 17:00:27 cadbbe2 kernel: [<c0171954>]
generic_delete_inode+0xa2/0x104
Jun 5 17:00:27 cadbbe2 kernel: [<f9111407>] ocfs2_drop_inode+0xe6/0x12a
[ocfs2]
Jun 5 17:00:27 cadbbe2 kernel: [<c0171b30>] iput+0x5f/0x61
Jun 5 17:00:27 cadbbe2 kernel: [<c0168e46>] sys_unlink+0xd7/0x132
Jun 5 17:00:27 cadbbe2 kernel: [<c015bcbc>] fget+0x3b/0x42
Jun 5 17:00:27 cadbbe2 kernel: [<c016add6>] sys_ioctl+0x227/0x269
Jun 5 17:00:27 cadbbe2 kernel: [<c016ae0c>] sys_ioctl+0x25d/0x269
Jun 5 17:00:27 cadbbe2 kernel: [<c02d4703>] syscall_call+0x7/0xb
Jun 5 17:00:27 cadbbe2 kernel: Code: 00 8b 44 24 20 89 14 24 89 4c 24 04
8b 88 d8 fd ff ff 8b 98 dc fd ff ff 8b 54 24 04 8b 04 24 $
Jun 5 17:00:27 cadbbe2 kernel: <0>Fatal exception: panic in 5 seconds

The server was manually rebooted, and the file systems were mounted
automatically. We have not experienced any issues since then.

The only thing that would have been running at this time was an RMAN
deletion of old backups from disk.

Has anybody seen this issue before, or is there any advice on how to
troubleshoot it?

This is a production system, so I am limited in the tests that I can
actually carry out.
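
In case it helps anyone searching on the trace above, here is a minimal
sketch (Python) that pulls the function names out of wrapped syslog lines
like the ones quoted, so they can be compared against other reports. The
regular expressions are my assumption about the line layout, based only on
this excerpt; parse_frames is a hypothetical helper, not part of any OCFS2
or kernel tooling.

```python
import re

# Assumed syslog prefix, as in the excerpt: "Jun 5 17:00:27 cadbbe2 kernel: "
PREFIX = re.compile(r"^\w{3}\s+\d+ \d\d:\d\d:\d\d \S+ kernel: ")

# Assumed frame layout: [<address>] symbol+0xoffset/0xsize, optional [module]
FRAME = re.compile(
    r"\[<([0-9a-f]+)>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)(?:\s+\[(\w+)\])?"
)


def parse_frames(text):
    """Return (symbol, module) pairs for every stack frame found in text.

    Lines are stripped of the syslog prefix and joined first, because the
    mail client wrapped some frames across two lines.
    """
    body = " ".join(PREFIX.sub("", line.strip()) for line in text.splitlines())
    return [(m.group(2), m.group(5)) for m in FRAME.finditer(body)]


# A few frames copied from the call trace above, wrapping included.
SAMPLE = """\
Jun 5 17:00:27 cadbbe2 kernel: [<f912646f>]
ocfs2_free_clusters+0x2ad/0x37a [ocfs2]
Jun 5 17:00:27 cadbbe2 kernel: [<c0171954>]
generic_delete_inode+0xa2/0x104
Jun 5 17:00:27 cadbbe2 kernel: [<f9111407>] ocfs2_drop_inode+0xe6/0x12a
[ocfs2]
Jun 5 17:00:27 cadbbe2 kernel: [<c0171b30>] iput+0x5f/0x61
"""

if __name__ == "__main__":
    for symbol, module in parse_frames(SAMPLE):
        print(symbol, module or "(core kernel)")
```

Frames without a trailing [module] tag come back with None as the module,
which separates the ocfs2 frames from the core kernel ones.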

Thanks in advance
Stuart



