<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<META content="MSHTML 6.00.2800.1528" name=GENERATOR></HEAD>
<BODY>
<DIV><FONT face="Microsoft Sans Serif" size=2>Hi,</FONT></DIV>
<DIV> </DIV>
<DIV><FONT face="Microsoft Sans Serif" size=2>Here is the configuration on both
hosts:<BR>Oracle: 10.2.0.1<BR>Oracle home: OCFS2 shared<BR>Oracle data files:
OCFS2 shared</FONT></DIV>
<DIV> </DIV>
<DIV><FONT face="Microsoft Sans Serif" size=2># cat redhat-release<BR>Red Hat
Enterprise Linux ES release 4 (Nahant Update 2)</FONT></DIV>
<DIV> </DIV>
<DIV><FONT face="Microsoft Sans Serif" size=2># rpm -qa | grep -i
device<BR>device-mapper-1.01.04-1.0.RHEL4<BR>device-mapper-1.01.04-1.0.RHEL4</FONT></DIV>
<DIV> </DIV>
<DIV><FONT face="Microsoft Sans Serif" size=2># rpm -qa | grep -i
ocfs<BR>ocfs2-tools-1.2.0-1<BR>ocfs2console-1.2.0-1<BR>ocfs2-2.6.9-22.ELsmp-1.2.0-1</FONT></DIV>
<DIV> </DIV><FONT face="Microsoft Sans Serif" size=2>
<DIV><BR>We are testing OCFS2 with Linux multipathing.<BR>When a path is
removed, both cluster nodes fence (panic) after failing <BR>to receive a
heartbeat event.</DIV>
<DIV> </DIV>
<DIV>After we remove the path, we see the I/O move to the other path on the storage
array, <BR>but after a minute or so the cluster fences and the nodes panic.<BR>We
also raised the timeout threshold to 601, but the problem still persists.
<BR>We also tried the deadline I/O scheduler, and the problem persists there too.</DIV>
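<DIV> </DIV>
<DIV>For reference, here is a rough sketch of how a threshold value would map to a
heartbeat timeout. It assumes that the timeout threshold mentioned above is the
o2cb O2CB_HEARTBEAT_THRESHOLD setting and that the disk heartbeat interval is 2
seconds, so that timeout = (threshold - 1) * 2 seconds, as described in the
OCFS2 FAQ. The function names are only illustrative and are not part of any
OCFS2 tool.</DIV>

```python
# Sketch only: relation between the o2cb heartbeat threshold and the disk
# heartbeat timeout. Assumptions: the "timeout threshold" in this mail is
# O2CB_HEARTBEAT_THRESHOLD, and the heartbeat interval is 2 seconds.

HEARTBEAT_INTERVAL_MS = 2000  # assumed o2hb disk heartbeat interval


def timeout_ms_for_threshold(threshold: int) -> int:
    """Timeout implied by a threshold: (threshold - 1) * interval."""
    if threshold < 2:
        raise ValueError("threshold must be at least 2")
    return (threshold - 1) * HEARTBEAT_INTERVAL_MS


def threshold_for_timeout_ms(timeout_ms: int) -> int:
    """Smallest threshold whose implied timeout is at least timeout_ms."""
    if timeout_ms <= 0:
        raise ValueError("timeout_ms must be positive")
    # Ceiling division: number of whole intervals needed to cover timeout_ms.
    intervals = -(-timeout_ms // HEARTBEAT_INTERVAL_MS)
    return intervals + 1


if __name__ == "__main__":
    # Timeout we would expect if a threshold of 601 were in effect.
    print(timeout_ms_for_threshold(601))
    # Threshold that would correspond to a 90000 ms write timeout.
    print(threshold_for_timeout_ms(90000))
```

<DIV>Under these assumptions, a threshold of 601 would allow roughly 20 minutes,
while the 90000 milliseconds reported in the Host 2 console output would
correspond to a threshold of 46. If the formula applies to this release, the
value of 601 may not have been in effect when these panics were captured.</DIV>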
<DIV> </DIV>
<DIV>Console messages from the
hosts<BR>-------------------------------------------------</DIV>
<DIV> </DIV>
<DIV>Host 1<BR>=============<BR>Kernel BUG at panic:74<BR>invalid operand: 0000
[1] SMP <BR>CPU 0 <BR>Modules linked in: md5 ipv6 parport_pc lp parport autofs4
i2c_dev i2c_core ocfs2<BR>(U) debugfs(U) ocfs2_dlmfs(U) ocfs2_dlm(U)
ocfs2_nodemanager(U) configfs(U) <BR>sunrpc ds yenta_socket pcmcia_core
dm_mirror dm_mod hw_random egenera_nmi(U) <BR>egenera_veth(U) sd_mod
egenera_vscsi(U) scsi_mod egenera_vmdump(U) <BR>egenera_dumpdev(U)
egenera_ipmi(U) egenera_base(U) egenera_virtual_bus(U) <BR>egenera_fs(U) ext3
jbd<BR>Pid: 6, comm: events/0 Tainted: PF
2.6.9-22.ELsmp<BR>RIP: 0010:[<ffffffff801368c2>]
<ffffffff801368c2>{panic+211}<BR>RSP: 0018:000001020fd81d88 EFLAGS:
00010282<BR>RAX: 000000000000005a RBX: ffffffffa01d1778 RCX:
0000000000000246<BR>RDX: 000000000000445b RSI: 0000000000000246 RDI:
ffffffff803d7960<BR>RBP: 000001020e6ffce0 R08: 0000000000000246 R09:
ffffffffa01d1778<BR>R10: 0000000000000046 R11: 0000000000000000 R12:
000001000c03ed40<BR>R13: 0000000000000216 R14: 000001020e6ffc00 R15:
ffffffffa01c6042<BR>FS: 0000002a9589fb00(0000) GS:ffffffff804d3100(0000)
knlGS:00000000f7fdf6c0<BR>CS: 0010 DS: 0018 ES: 0018 CR0:
000000008005003b<BR>CR2: 0000007fbffff816 CR3: 0000000000101000 CR4:
00000000000006e0<BR>Process events/0 (pid: 6, threadinfo 000001020fd80000, task
00000100efefd7f0)<BR>Stack: 0000003000000008 000001020fd81e68 000001020fd81da8
0000000000000006 <BR> 0000000000000000
0000000000000246 ffffffffa01dd1b0 ffffffffa01dd160
<BR> ffffffff803d7948 000001020e6ffcd8
<BR>Call Trace:<ffffffffa01c8c2a>{:ocfs2_nodemanager:o2hb_stop_all_regions+95}
<BR><ffffffffa01ca4f4>{:ocfs2_nodemanager:o2quo_disk_timeout+0}
<BR><ffffffff801464f2>{worker_thread+419} <ffffffff80132e8d>{default_wake_function+0}
<BR><ffffffff80132ede>{__wake_up_common+67} <ffffffff80132e8d>{default_wake_function+0}
<BR><ffffffff8014634f>{worker_thread+0} <ffffffff8014a167>{kthread+200}
<BR><ffffffff80110ca3>{child_rip+8} <ffffffff8014a09f>{kthread+0}
<BR><ffffffff80110c9b>{child_rip+0}</DIV>
<DIV> </DIV>
<DIV> </DIV>
<DIV> </DIV>
<DIV>Code: 0f 0b 3a 71 31 80 ff ff ff ff 4a 00 31 ff e8 d7 c4 fe ff e8 <BR>RIP
<ffffffff801368c2>{panic+211} RSP <000001020fd81d88><BR>Dumping to
/dev/egenera_dump_dev_ifca...<BR>Writing dump header ...<BR><6>dumpdev:
file (/crash_dumps/ap7.1147734852.dmp) opened<BR>Writing dump pages
................<BR>Dump complete.<BR>rebooting.</DIV>
<DIV> </DIV>
<DIV> </DIV>
<DIV> </DIV>
<DIV> </DIV>
<DIV> </DIV>
<DIV>Host 2<BR>============</DIV>
<DIV> </DIV>
<DIV>[root@eg09 ~]# (6,0):o2hb_write_timeout:164 ERROR: Heartbeat write timeout
to <BR>device sdc1 after 90000 milliseconds<BR>(6,0):o2hb_stop_all_regions:1727
ERROR: stopping heartbeat on all active <BR>regions.<BR>Kernel panic - not
syncing: ocfs2 is very sorry to be fencing this system by <BR>panicing</DIV>
<DIV> </DIV>
<DIV> </DIV>
<DIV> </DIV>
<DIV>----------- [cut here ] --------- [please bite here ] ---------<BR>Kernel
BUG at panic:74<BR>invalid operand: 0000 [1] SMP <BR>CPU 0 <BR>Modules linked
in: md5 ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core ocfs2<BR>(U)
debugfs(U) ocfs2_dlmfs(U) ocfs2_dlm(U) ocfs2_nodemanager(U) configfs(U)
<BR>sunrpc ds yenta_socket pcmcia_core dm_mirror dm_mod hw_random egenera_nmi(U)
<BR>egenera_veth(U) sd_mod egenera_vscsi(U) scsi_mod egenera_vmdump(U)
<BR>egenera_dumpdev(U) egenera_ipmi(U) egenera_base(U) egenera_virtual_bus(U)
<BR>egenera_fs(U) ext3 jbd<BR>Pid: 6, comm: events/0 Tainted:
PF 2.6.9-22.ELsmp<BR>RIP:
0010:[<ffffffff801368c2>] <ffffffff801368c2>{panic+211}<BR>RSP:
0018:000001020fd81d88 EFLAGS: 00010282<BR>RAX: 000000000000005a RBX:
ffffffffa01d1778 RCX: 0000000000000246<BR>RDX: 0000000000004345 RSI:
0000000000000246 RDI: ffffffff803d7960<BR>RBP: 000001010c043ce0 R08:
0000000000000246 R09: ffffffffa01d1778<BR>R10: 0000000000000046 R11:
0000000000000000 R12: 000001000c03ed40<BR>R13: 0000000000000216 R14:
000001010c043c00 R15: ffffffffa01c6042<BR>FS: 0000002a9589fb00(0000)
GS:ffffffff804d3100(0000) knlGS:00000000f7fdf6c0<BR>CS: 0010 DS: 0018 ES:
0018 CR0: 000000008005003b<BR>CR2: 000000332988ed20 CR3: 0000000000101000 CR4:
00000000000006e0<BR>Process events/0 (pid: 6, threadinfo 000001020fd80000, task
00000100efefd7f0)<BR>Stack: 0000003000000008 000001020fd81e68 000001020fd81da8
0000000000000006 <BR> 0000000000000000
0000000000000246 ffffffffa01dd1b0 ffffffffa01dd160
<BR> ffffffff803d7948 000001010c043cd8
<BR>Call Trace:<ffffffffa01c8c2a>{:ocfs2_nodemanager:o2hb_stop_all_regions+95}
<BR><ffffffffa01ca4f4>{:ocfs2_nodemanager:o2quo_disk_timeout+0}
<BR><ffffffff801464f2>{worker_thread+419} <ffffffff80132e8d>{default_wake_function+0}
<BR><ffffffff80132ede>{__wake_up_common+67} <ffffffff80132e8d>{default_wake_function+0}
<BR><ffffffff8014634f>{worker_thread+0} <ffffffff8014a167>{kthread+200}
<BR><ffffffff80110ca3>{child_rip+8} <ffffffff8014a09f>{kthread+0}
<BR><ffffffff80110c9b>{child_rip+0}</DIV>
<DIV> </DIV>
<DIV> </DIV>
<DIV> </DIV>
<DIV>Code: 0f 0b 3a 71 31 80 ff ff ff ff 4a 00 31 ff e8 d7 c4 fe ff e8 <BR>RIP
<ffffffff801368c2>{panic+211} RSP <000001020fd81d88><BR>Dumping to
/dev/egenera_dump_dev_ifca...<BR>Writing dump header ...<BR><6>dumpdev:
file (/crash_dumps/ap8.1147734852.dmp) opened<BR>Writing dump pages
.............<BR>Dump complete.</DIV>
<DIV> </DIV>
<DIV> </DIV>
<DIV> </DIV>
<DIV>Thanks in advance,<BR>Roger<BR></FONT></DIV></BODY></HTML>