[Ocfs-users] Oracle crash
Kendall, Kim
Kim_Kendall at inter-tel.com
Wed Jul 25 13:32:33 PDT 2007
Had a pretty serious crash yesterday.
* 4 node cluster
* SuSE 9.3 - 2.6.5-7.282-smp - x86_64
* ocfs2 1.2.1-4.2
Saw this in the log on the first node that went down (the first domino
to fall). Is this a real dlmmaster bug, and if so, is there a fix?
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at dlmmaster:1661
invalid operand: 0000 [1] SMP
CPU 6
Pid: 15101, comm: mlragent Tainted: P U (2.6.5-7.282-smp SLES9_SP3_BRANCH-20060829104040)
RIP: 0010:[<ffffffffa0508463>] <ffffffffa0508463>{:ocfs2_dlm:dlm_do_assert_master+803}
RSP: 0018:0000010825d37748 EFLAGS: 00010212
RAX: 0000000000000000 RBX: 0000010825d37798 RCX: 000000000003ffff
RDX: 0000000000000000 RSI: 000000000000c83c RDI: 0000010825a5be50
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 00000000ffffffff R11: 0000000000000000 R12: 0000000000000001
R13: 000001082c4a9400 R14: 000000000000001f R15: 0000010828037d00
FS: 0000000000000000(0000) GS:ffffffff8057e880(005b)
knlGS:000000005afb5bb0
CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 00000000009a6008 CR3: 00000005fff85000 CR4: 00000000000006e0
Process mlragent (pid: 15101, threadinfo 0000010825d36000, task
000001082c50d420)
Stack: 0000010828037d00 0000000000000001 00000108ffffffea
000001082c50d420
00000000000493e0 0000000028037ce0 0000000000000000
0000010825a5be88
0000010825a5be00 ffffffea8010efe4
Call Trace:<ffffffffa050bd35>{:ocfs2_dlm:dlm_wait_for_lock_mastery+3077}
<ffffffffa04c7da0>{:ocfs2_nodemanager:o2net_send_message+32}
<ffffffffa050ac15>{:ocfs2_dlm:dlm_do_master_request+181}
<ffffffffa0505d81>{:ocfs2_dlm:dlm_init_mle+417}
<ffffffffa050cd7b>{:ocfs2_dlm:dlm_get_lock_resource+2939}
<ffffffffa051077e>{:ocfs2_dlm:dlm_new_lock+270}
<ffffffffa05aeeb0>{:ocfs2:ocfs2_inode_ast_func+0}
<ffffffffa05118da>{:ocfs2_dlm:dlmlock+1946}
<ffffffffa0053571>{:reiserfs:inode2sd+257}
<ffffffffa0067578>{:reiserfs:pathrelse+40}
<ffffffff8013d430>{autoremove_wake_function+0}
<ffffffff8013d430>{autoremove_wake_function+0}
<ffffffffa05b07f8>{:ocfs2:ocfs2_lock_create+328}
<ffffffffa05adfc0>{:ocfs2:ocfs2_inode_bast_func+0}
<ffffffffa05b12bd>{:ocfs2:ocfs2_cluster_lock+557}
<ffffffff80164819>{unlock_page+9}
<ffffffffa05acfc0>{:ocfs2:ocfs2_status_completion_cb+0}
<ffffffffa05b2b7f>{:ocfs2:ocfs2_meta_lock_full+591}
<ffffffffa005a0af>{:reiserfs:reiserfs_file_write+1919}
<ffffffff801a8791>{dput+33} <ffffffff8019cd4e>{follow_mount+62}
<ffffffff801a8791>{dput+33}
<ffffffffa05bb133>{:ocfs2:ocfs2_inode_revalidate+355}
<ffffffffa05b6fc8>{:ocfs2:ocfs2_getattr+136}
<ffffffff80197849>{vfs_getattr_it+137}
<ffffffff80197f4d>{vfs_lstat+189}
<ffffffff80122a68>{do_page_fault+536}
<ffffffff8012864f>{sys32_lstat64+31}
<ffffffff8018d9a4>{sys_write+180}
<ffffffff80125029>{sysenter_do_call+27}
Code: 0f 0b 28 d6 51 a0 ff ff ff ff 7d 06 eb 6f 83 f8 0b 75 6a 48
RIP <ffffffffa0508463>{:ocfs2_dlm:dlm_do_assert_master+803} RSP <0000010825d37748>
<3>hde: Invalid capacity for disk in drive
hde: 763233kB, 19177/255/255 CHS, 1392 kBps, 65535 sector size, 0 rpm
hde: Invalid capacity for disk in drive
(15747,3):dlm_do_assert_master:1651 ERROR: during assert master of M0000000000000000000207579b978e to 1, got -22.
(15747,3):dlm_print_one_mle:173 M0000000000000000000207579b978e: MAS refs= 3 mas= 2 new=255 evt=Y inuse=1 maybe=[ 2 ], vote=[ 1 ], resp=[ 1 ], node=[ 1 ],
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at dlmmaster:1661
invalid operand: 0000 [2] SMP
CPU 3
Pid: 15747, comm: df Tainted: P U (2.6.5-7.282-smp SLES9_SP3_BRANCH-20060829104040)
RIP: 0010:[<ffffffffa0508463>] <ffffffffa0508463>{:ocfs2_dlm:dlm_do_assert_master+803}
RSP: 0018:00000108259316e8 EFLAGS: 00010212
RAX: 0000000000000000 RBX: 0000010825931738 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 000000000000d57f RDI: 00000108218f7e50
RBP: 0000000000000000 R08: 00000108386b8000 R09: 0000000000000000
R10: 00000000ffffffff R11: 0000000000000000 R12: 0000000000000001
R13: 000001082c4a9400 R14: 000000000000001f R15: 0000010828746640
FS: 0000002a95894b00(0000) GS:ffffffff8057e700(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000002a96af6940 CR3: 00000004fffdc000 CR4: 00000000000006e0
Process df (pid: 15747, threadinfo 0000010825930000, task
00000108374f0c60)
Stack: 0000010828746640 0000000000000001 00000108ffffffea
00000108374f0c60
00000000000493e0 0000000028746660 0000000000000000
00000108218f7e88
00000108218f7e00 ffffffea8013861d
Call Trace:<ffffffffa050bd35>{:ocfs2_dlm:dlm_wait_for_lock_mastery+3077}
<ffffffffa04c7da0>{:ocfs2_nodemanager:o2net_send_message+32}
<ffffffffa050ac15>{:ocfs2_dlm:dlm_do_master_request+181}
<ffffffffa0505d81>{:ocfs2_dlm:dlm_init_mle+417}
<ffffffffa050cd7b>{:ocfs2_dlm:dlm_get_lock_resource+2939}
<ffffffffa051077e>{:ocfs2_dlm:dlm_new_lock+270}
<ffffffffa05aeeb0>{:ocfs2:ocfs2_inode_ast_func+0}
<ffffffffa05118da>{:ocfs2_dlm:dlmlock+1946}
<ffffffff8013456a>{recalc_task_prio+938}
<ffffffffa05b07f8>{:ocfs2:ocfs2_lock_create+328}
<ffffffffa05adfc0>{:ocfs2:ocfs2_inode_bast_func+0}
<ffffffffa05b12bd>{:ocfs2:ocfs2_cluster_lock+557}
<ffffffff80265d38>{n_tty_receive_buf+4824}
<ffffffffa05acfc0>{:ocfs2:ocfs2_status_completion_cb+0}
<ffffffffa05b2b7f>{:ocfs2:ocfs2_meta_lock_full+591}
<ffffffff80164290>{file_read_actor+0}
<ffffffff801a99b3>{igrab+35}
<ffffffffa05d93d7>{:ocfs2:ocfs2_get_system_file_inode+71}
<ffffffff80267ac1>{pty_write+305}
<ffffffffa05d753f>{:ocfs2:ocfs2_statfs+287}
<ffffffff8018a421>{vfs_statfs+113}
<ffffffff8018c392>{vfs_statfs_native+34}
<ffffffff8018c592>{sys_statfs+178}
<ffffffff80265da0>{write_chan+0}
<ffffffff80260251>{tty_write+625}
<ffffffff8018d75d>{vfs_write+285}
<ffffffff8018d98d>{sys_write+157}
<ffffffff80110f79>{error_exit+0}
<ffffffff801106b4>{system_call+124}
Code: 0f 0b 28 d6 51 a0 ff ff ff ff 7d 06 eb 6f 83 f8 0b 75 6a 48
RIP <ffffffffa0508463>{:ocfs2_dlm:dlm_do_assert_master+803} RSP <00000108259316e8>
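For anyone reading the log: the "got -22" in the dlm_do_assert_master error line above looks like a negated errno value, and errno 22 is EINVAL on Linux. A minimal sketch for decoding such values, assuming the usual kernel convention of returning negative errno codes (the helper name decode_kernel_status is made up for illustration and is not part of ocfs2):

```python
import errno
import os


def decode_kernel_status(status: int) -> str:
    """Map a negative kernel-style return value to its errno name and message.

    Assumes the value follows the "-errno" convention; this is an
    illustrative helper, not an ocfs2 or kernel API.
    """
    code = -status
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({os.strerror(code)})"


# The status reported in the assert-master error line of the log.
print(decode_kernel_status(-22))
```

This only names the error code; it does not say why the assert master message was rejected by node 1.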