[Ocfs-users] Oracle crash

Kendall, Kim Kim_Kendall at inter-tel.com
Wed Jul 25 13:32:33 PDT 2007


Had a pretty serious crash yesterday. 

* 4-node cluster
* SuSE SLES 9 SP3 - 2.6.5-7.282-smp - x86_64
* ocfs2 1.2.1-4.2

Saw this in the log on the first node to go down (the first domino that
fell). Is this a real bug in dlmmaster, and is there a fix?

----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at dlmmaster:1661
invalid operand: 0000 [1] SMP
CPU 6
Pid: 15101, comm: mlragent Tainted: P   U   (2.6.5-7.282-smp SLES9_SP3_BRANCH-20060829104040)
RIP: 0010:[<ffffffffa0508463>] <ffffffffa0508463>{:ocfs2_dlm:dlm_do_assert_master+803}
RSP: 0018:0000010825d37748  EFLAGS: 00010212
RAX: 0000000000000000 RBX: 0000010825d37798 RCX: 000000000003ffff
RDX: 0000000000000000 RSI: 000000000000c83c RDI: 0000010825a5be50
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 00000000ffffffff R11: 0000000000000000 R12: 0000000000000001
R13: 000001082c4a9400 R14: 000000000000001f R15: 0000010828037d00
FS:  0000000000000000(0000) GS:ffffffff8057e880(005b) knlGS:000000005afb5bb0
CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 00000000009a6008 CR3: 00000005fff85000 CR4: 00000000000006e0
Process mlragent (pid: 15101, threadinfo 0000010825d36000, task 000001082c50d420)
Stack: 0000010828037d00 0000000000000001 00000108ffffffea 000001082c50d420
       00000000000493e0 0000000028037ce0 0000000000000000 0000010825a5be88
       0000010825a5be00 ffffffea8010efe4
Call Trace:<ffffffffa050bd35>{:ocfs2_dlm:dlm_wait_for_lock_mastery+3077}
       <ffffffffa04c7da0>{:ocfs2_nodemanager:o2net_send_message+32}
       <ffffffffa050ac15>{:ocfs2_dlm:dlm_do_master_request+181}
       <ffffffffa0505d81>{:ocfs2_dlm:dlm_init_mle+417} <ffffffffa050cd7b>{:ocfs2_dlm:dlm_get_lock_resource+2939}
       <ffffffffa051077e>{:ocfs2_dlm:dlm_new_lock+270} <ffffffffa05aeeb0>{:ocfs2:ocfs2_inode_ast_func+0}
       <ffffffffa05118da>{:ocfs2_dlm:dlmlock+1946} <ffffffffa0053571>{:reiserfs:inode2sd+257}
       <ffffffffa0067578>{:reiserfs:pathrelse+40} <ffffffff8013d430>{autoremove_wake_function+0}
       <ffffffff8013d430>{autoremove_wake_function+0} <ffffffffa05b07f8>{:ocfs2:ocfs2_lock_create+328}
       <ffffffffa05adfc0>{:ocfs2:ocfs2_inode_bast_func+0}
       <ffffffffa05b12bd>{:ocfs2:ocfs2_cluster_lock+557} <ffffffff80164819>{unlock_page+9}
       <ffffffffa05acfc0>{:ocfs2:ocfs2_status_completion_cb+0}
       <ffffffffa05b2b7f>{:ocfs2:ocfs2_meta_lock_full+591}
       <ffffffffa005a0af>{:reiserfs:reiserfs_file_write+1919}
       <ffffffff801a8791>{dput+33} <ffffffff8019cd4e>{follow_mount+62}
       <ffffffff801a8791>{dput+33} <ffffffffa05bb133>{:ocfs2:ocfs2_inode_revalidate+355}
       <ffffffffa05b6fc8>{:ocfs2:ocfs2_getattr+136} <ffffffff80197849>{vfs_getattr_it+137}
       <ffffffff80197f4d>{vfs_lstat+189} <ffffffff80122a68>{do_page_fault+536}
       <ffffffff8012864f>{sys32_lstat64+31} <ffffffff8018d9a4>{sys_write+180}
       <ffffffff80125029>{sysenter_do_call+27}

Code: 0f 0b 28 d6 51 a0 ff ff ff ff 7d 06 eb 6f 83 f8 0b 75 6a 48
RIP <ffffffffa0508463>{:ocfs2_dlm:dlm_do_assert_master+803} RSP <0000010825d37748>
 <3>hde: Invalid capacity for disk in drive
hde: 763233kB, 19177/255/255 CHS, 1392 kBps, 65535 sector size, 0 rpm
hde: Invalid capacity for disk in drive
(15747,3):dlm_do_assert_master:1651 ERROR: during assert master of M0000000000000000000207579b978e to 1, got -22.
(15747,3):dlm_print_one_mle:173 M0000000000000000000207579b978e: MAS refs=  3 mas=  2 new=255 evt=Y inuse=1 maybe=[ 2 ], vote=[ 1 ], resp=[ 1 ], node=[ 1 ],

----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at dlmmaster:1661
invalid operand: 0000 [2] SMP
CPU 3
Pid: 15747, comm: df Tainted: P   U   (2.6.5-7.282-smp SLES9_SP3_BRANCH-20060829104040)
RIP: 0010:[<ffffffffa0508463>] <ffffffffa0508463>{:ocfs2_dlm:dlm_do_assert_master+803}
RSP: 0018:00000108259316e8  EFLAGS: 00010212
RAX: 0000000000000000 RBX: 0000010825931738 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 000000000000d57f RDI: 00000108218f7e50
RBP: 0000000000000000 R08: 00000108386b8000 R09: 0000000000000000
R10: 00000000ffffffff R11: 0000000000000000 R12: 0000000000000001
R13: 000001082c4a9400 R14: 000000000000001f R15: 0000010828746640
FS:  0000002a95894b00(0000) GS:ffffffff8057e700(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000002a96af6940 CR3: 00000004fffdc000 CR4: 00000000000006e0
Process df (pid: 15747, threadinfo 0000010825930000, task 00000108374f0c60)
Stack: 0000010828746640 0000000000000001 00000108ffffffea 00000108374f0c60
       00000000000493e0 0000000028746660 0000000000000000 00000108218f7e88
       00000108218f7e00 ffffffea8013861d
Call Trace:<ffffffffa050bd35>{:ocfs2_dlm:dlm_wait_for_lock_mastery+3077}
       <ffffffffa04c7da0>{:ocfs2_nodemanager:o2net_send_message+32}
       <ffffffffa050ac15>{:ocfs2_dlm:dlm_do_master_request+181}
       <ffffffffa0505d81>{:ocfs2_dlm:dlm_init_mle+417} <ffffffffa050cd7b>{:ocfs2_dlm:dlm_get_lock_resource+2939}
       <ffffffffa051077e>{:ocfs2_dlm:dlm_new_lock+270} <ffffffffa05aeeb0>{:ocfs2:ocfs2_inode_ast_func+0}
       <ffffffffa05118da>{:ocfs2_dlm:dlmlock+1946} <ffffffff8013456a>{recalc_task_prio+938}
       <ffffffffa05b07f8>{:ocfs2:ocfs2_lock_create+328} <ffffffffa05adfc0>{:ocfs2:ocfs2_inode_bast_func+0}
       <ffffffffa05b12bd>{:ocfs2:ocfs2_cluster_lock+557} <ffffffff80265d38>{n_tty_receive_buf+4824}
       <ffffffffa05acfc0>{:ocfs2:ocfs2_status_completion_cb+0}
       <ffffffffa05b2b7f>{:ocfs2:ocfs2_meta_lock_full+591}
       <ffffffff80164290>{file_read_actor+0} <ffffffff801a99b3>{igrab+35}
       <ffffffffa05d93d7>{:ocfs2:ocfs2_get_system_file_inode+71}
       <ffffffff80267ac1>{pty_write+305} <ffffffffa05d753f>{:ocfs2:ocfs2_statfs+287}
       <ffffffff8018a421>{vfs_statfs+113} <ffffffff8018c392>{vfs_statfs_native+34}
       <ffffffff8018c592>{sys_statfs+178} <ffffffff80265da0>{write_chan+0}
       <ffffffff80260251>{tty_write+625} <ffffffff8018d75d>{vfs_write+285}
       <ffffffff8018d98d>{sys_write+157} <ffffffff80110f79>{error_exit+0}
       <ffffffff801106b4>{system_call+124}

Code: 0f 0b 28 d6 51 a0 ff ff ff ff 7d 06 eb 6f 83 f8 0b 75 6a 48
RIP <ffffffffa0508463>{:ocfs2_dlm:dlm_do_assert_master+803} RSP <00000108259316e8>
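
A note for anyone reading the log above: the "got -22" in the
dlm_do_assert_master error line looks like a negated errno, which is the
usual way kernel code reports failures. Below is a minimal Python sketch for
decoding such a value. The helper name decode_kernel_status is made up for
illustration and is not part of ocfs2 or any of its tools, and the sketch
assumes the number really is a standard errno.

```python
import errno
import os
from typing import Tuple


def decode_kernel_status(status: int) -> Tuple[str, str]:
    """Map a negated errno (as printed in kernel logs) to its name and text.

    Hypothetical helper for illustration only; it assumes the value is a
    standard errno and knows nothing about ocfs2 internals.
    """
    code = abs(status)
    # errno.errorcode maps numeric codes to symbolic names such as "EINVAL".
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror returns the platform's human-readable message for the code.
    return name, os.strerror(code)


if __name__ == "__main__":
    name, message = decode_kernel_status(-22)
    print(f"-22 -> {name}: {message}")
```

Under that assumption, -22 decodes to EINVAL (invalid argument). That only
names the error returned during the assert master; it does not by itself
explain why the request was rejected.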
 