<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<META content="MSHTML 6.00.2800.1498" name=GENERATOR>
<STYLE></STYLE>
</HEAD>
<BODY bgColor=#ffffff>
<DIV><FONT face=Arial size=2>IT IS NOT NORMAL. Something is wrong with your
storage, FC switch, or cards. Why, when you shut down one node, does the second
node experience I/O errors?</FONT></DIV>
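<DIV><FONT face=Arial size=2>One way to narrow this down from the quoted log is
to look for the moment a multipath map runs out of paths. In the log where node
1 fences, multipathd reports "mpath2: remaining active paths: 0" at 21:28:01,
the same second the first o2hb "IO Error -5" appears; in the log where node 1
survives, every map keeps at least one path. Below is a minimal Python sketch
that pulls those events out of syslog. It assumes the multipathd line format
shown in the quoted message; the function name and the regular expression are
my own and not part of any multipath or ocfs2 tool.</FONT></DIV>
<PRE>
```python
import re

# Assumed line format (taken from the quoted log), for example:
#   Sep 21 21:28:01 bbflgrid11 multipathd: mpath2: remaining active paths: 0
PATHS_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+multipathd:\s+(\S+):"
    r"\s+remaining active paths:\s+(\d+)"
)


def find_path_outages(log_lines):
    """Return (timestamp, map_name) for every report of zero active paths."""
    outages = []
    for line in log_lines:
        match = PATHS_RE.match(line.strip())
        if match and int(match.group(3)) == 0:
            outages.append((match.group(1), match.group(2)))
    return outages


# Lines copied from the second (fencing) log in the quoted message.
sample = [
    "Sep 21 21:28:00 bbflgrid11 multipathd: mpath2: remaining active paths: 1",
    "Sep 21 21:28:00 bbflgrid11 multipathd: mpath1: remaining active paths: 1",
    "Sep 21 21:28:01 bbflgrid11 multipathd: mpath2: remaining active paths: 0",
    "Sep 21 21:28:01 bbflgrid11 kernel: (4912,1):o2hb_bio_end_io:331 ERROR: IO Error -5",
    "Sep 21 21:28:01 bbflgrid11 multipathd: mpath1: remaining active paths: 2",
]

for timestamp, map_name in find_path_outages(sample):
    print(f"{timestamp}: {map_name} has no active paths")
```
</PRE>
<DIV><FONT face=Arial size=2>With the sample lines it prints "Sep 21 21:28:01:
mpath2 has no active paths". Any map that shows up here while the other node is
merely rebooting points at the storage side rather than at ocfs2.</FONT></DIV>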
<DIV><FONT face=Arial size=2></FONT> </DIV>
<BLOCKQUOTE
style="PADDING-RIGHT: 0px; PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #000000 2px solid; MARGIN-RIGHT: 0px">
<DIV style="FONT: 10pt arial">----- Original Message ----- </DIV>
<DIV
style="BACKGROUND: #e4e4e4; FONT: 10pt arial; font-color: black"><B>From:</B>
<A title=SRuff@fiberlink.com
href="mailto:SRuff@fiberlink.com">SRuff@fiberlink.com</A> </DIV>
<DIV style="FONT: 10pt arial"><B>To:</B> <A title=ocfs2-users@oss.oracle.com
href="mailto:ocfs2-users@oss.oracle.com">ocfs2-users@oss.oracle.com</A> </DIV>
<DIV style="FONT: 10pt arial"><B>Sent:</B> Thursday, September 21, 2006 2:56
PM</DIV>
<DIV style="FONT: 10pt arial"><B>Subject:</B> [Ocfs2-users] ocfs2 fencing on
reboot of 2nd node</DIV>
<DIV><BR></DIV><BR><FONT face=sans-serif size=2>I'm performing some testing
with ocfs2 on 2 nodes with Red Hat AS4 Update 4 (x86_64), using the multipath
included in the 2.6 kernel, and am running into some issues when cleanly
rebooting the 2nd node, while the 1st node is still up.</FONT> <BR><BR><FONT
face=sans-serif size=2>So if I do the following on the 2nd node, the 1st node
does not fence itself:</FONT> <BR><BR><FONT face=sans-serif
size=2>/etc/init.d/ocfs2 stop</FONT> <BR><FONT face=sans-serif
size=2>/etc/init.d/o2cb stop</FONT> <BR><FONT face=sans-serif size=2>wait more
than 60 seconds</FONT> <BR><FONT face=sans-serif size=2>init 6</FONT>
<BR><BR><FONT face=sans-serif size=2>I get the following on the 1st node, but
everything is fine:</FONT> <BR><BR><FONT color=#ff0000><FONT face=sans-serif
size=2>Sep 21 21:44:49 bbflgrid11 kernel: SCSI error : <0 0 0 12> return
code = 0x20000</FONT> <BR><FONT face=sans-serif size=2>Sep 21 21:44:49
bbflgrid11 kernel: end_request: I/O error, dev sdm, sector 192785</FONT>
<BR><FONT face=sans-serif size=2>Sep 21 21:44:49 bbflgrid11 kernel:
device-mapper: dm-multipath: Failing path 8:192.</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:44:49 bbflgrid11 kernel: SCSI error : <0 0
0 14> return code = 0x20000</FONT> <BR><FONT face=sans-serif size=2>Sep 21
21:44:49 bbflgrid11 kernel: end_request: I/O error, dev sdo, sector
193297</FONT> <BR><FONT face=sans-serif size=2>Sep 21 21:44:49 bbflgrid11
kernel: device-mapper: dm-multipath: Failing path 8:224.</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:44:49 bbflgrid11 kernel: SCSI error : <0 0
0 13> return code = 0x20000</FONT> <BR><FONT face=sans-serif size=2>Sep 21
21:44:49 bbflgrid11 kernel: end_request: I/O error, dev sdn, sector
192785</FONT> <BR><FONT face=sans-serif size=2>Sep 21 21:44:49 bbflgrid11
kernel: device-mapper: dm-multipath: Failing path 8:208.</FONT>
<BR></FONT><FONT face=sans-serif size=2>Sep 21 21:44:49 bbflgrid11 multipathd:
8:192: mark as failed</FONT> <BR><FONT face=sans-serif size=2>Sep 21 21:44:49
bbflgrid11 multipathd: mpath1: remaining active paths: 1</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:44:49 bbflgrid11 multipathd: 8:224: mark as
failed</FONT> <BR><FONT face=sans-serif size=2>Sep 21 21:44:49 bbflgrid11
multipathd: mpath3: remaining active paths: 1</FONT> <BR><FONT face=sans-serif
size=2>Sep 21 21:44:49 bbflgrid11 multipathd: 8:208: mark as failed</FONT>
<BR><FONT face=sans-serif size=2>Sep 21 21:44:49 bbflgrid11 multipathd:
mpath2: remaining active paths: 1</FONT> <BR><FONT face=sans-serif size=2>Sep
21 21:44:58 bbflgrid11 multipathd: 8:192: readsector0 checker reports path is
up</FONT> <BR><FONT face=sans-serif size=2>Sep 21 21:44:58 bbflgrid11
multipathd: 8:192: reinstated</FONT> <BR><FONT face=sans-serif size=2>Sep 21
21:44:58 bbflgrid11 multipathd: mpath1: remaining active paths: 2</FONT>
<BR><FONT face=sans-serif size=2>Sep 21 21:44:58 bbflgrid11 multipathd: 8:208:
readsector0 checker reports path is up</FONT> <BR><FONT face=sans-serif
size=2>Sep 21 21:44:58 bbflgrid11 multipathd: 8:208: reinstated</FONT>
<BR><FONT face=sans-serif size=2>Sep 21 21:44:58 bbflgrid11 multipathd:
mpath2: remaining active paths: 2</FONT> <BR><FONT face=sans-serif size=2>Sep
21 21:44:58 bbflgrid11 multipathd: 8:224: readsector0 checker reports path is
up</FONT> <BR><FONT face=sans-serif size=2>Sep 21 21:44:58 bbflgrid11
multipathd: 8:224: reinstated</FONT> <BR><FONT face=sans-serif size=2>Sep 21
21:44:58 bbflgrid11 multipathd: mpath3: remaining active paths: 2</FONT>
<BR><FONT face=sans-serif size=2>Sep 21 21:46:06 bbflgrid11 kernel: SCSI error
: <1 0 0 11> return code = 0x20000</FONT> <BR><FONT face=sans-serif
size=2>Sep 21 21:46:06 bbflgrid11 kernel: end_request: I/O error, dev sdaa,
sector 1920</FONT> <BR><FONT face=sans-serif size=2>Sep 21 21:46:06 bbflgrid11
kernel: device-mapper: dm-multipath: Failing path 65:160.</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:46:06 bbflgrid11 multipathd: 65:160: mark as
failed</FONT> <BR><FONT face=sans-serif size=2>Sep 21 21:46:06 bbflgrid11
multipathd: mpath0: remaining active paths: 1</FONT> <BR><FONT face=sans-serif
size=2>Sep 21 21:46:06 bbflgrid11 multipathd: 65:160: readsector0 checker
reports path is up</FONT> <BR><FONT face=sans-serif size=2>Sep 21 21:46:06
bbflgrid11 multipathd: 65:160: reinstated</FONT> <BR><FONT face=sans-serif
size=2>Sep 21 21:46:06 bbflgrid11 multipathd: mpath0: remaining active paths:
2</FONT> <BR><BR><BR><BR><FONT face=sans-serif size=2>Now if I do the
following on the 2nd node, the 1st node fences itself (same as above, except
don't wait 60 seconds after o2cb stop):</FONT> <BR><BR><FONT face=sans-serif
size=2>/etc/init.d/ocfs2 stop</FONT> <BR><FONT face=sans-serif
size=2>/etc/init.d/o2cb stop</FONT> <BR><FONT face=sans-serif size=2>init
6</FONT> <BR><BR><FONT face=sans-serif size=2>Node 1 logs the following and
fences itself. I have to power cycle the server to get it back; it doesn't
reboot or shut down, it just hangs:</FONT> <BR><BR><FONT face=sans-serif size=2>Sep
21 21:28:00 bbflgrid11 kernel: SCSI error : <0 0 0 13> return code =
0x20000</FONT> <BR><FONT face=sans-serif size=2>Sep 21 21:28:00 bbflgrid11
kernel: end_request: I/O error, dev sdn, sector 192785</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:28:00 bbflgrid11 kernel: device-mapper:
dm-multipath: Failing path 8:208.</FONT> <BR><FONT face=sans-serif size=2>Sep
21 21:28:00 bbflgrid11 multipathd: 8:208: mark as failed</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:28:00 bbflgrid11 multipathd: mpath2:
remaining active paths: 1</FONT> <BR><FONT face=sans-serif size=2>Sep 21
21:28:00 bbflgrid11 kernel: SCSI error : <1 0 0 12> return code =
0x20000</FONT> <BR><FONT face=sans-serif size=2>Sep 21 21:28:00 bbflgrid11
kernel: end_request: I/O error, dev sdab, sector 192784</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:28:00 bbflgrid11 kernel: end_request: I/O
error, dev sdab, sector 192786</FONT> <BR><FONT face=sans-serif size=2>Sep 21
21:28:00 bbflgrid11 kernel: device-mapper: dm-multipath: Failing path
65:176.</FONT> <BR><FONT face=sans-serif size=2>Sep 21 21:28:00 bbflgrid11
kernel: SCSI error : <1 0 0 13> return code = 0x20000</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:28:00 bbflgrid11 kernel: end_request: I/O
error, dev sdac, sector 192785</FONT> <BR><FONT face=sans-serif size=2>Sep 21
21:28:00 bbflgrid11 kernel: device-mapper: dm-multipath: Failing path
65:192.</FONT> <BR><FONT face=sans-serif size=2>Sep 21 21:28:00 bbflgrid11
multipathd: 65:176: mark as failed</FONT> <BR><FONT face=sans-serif size=2>Sep
21 21:28:00 bbflgrid11 multipathd: mpath1: remaining active paths: 1</FONT>
<BR><FONT face=sans-serif size=2>Sep 21 21:28:01 bbflgrid11 multipathd:
65:192: mark as failed</FONT> <BR><FONT face=sans-serif size=2>Sep 21 21:28:01
bbflgrid11 multipathd: mpath2: remaining active paths: 0</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:28:01 bbflgrid11 kernel:
(4912,1):o2hb_bio_end_io:331 ERROR: IO Error -5</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:28:01 bbflgrid11 kernel:
(4912,1):o2hb_do_disk_heartbeat:973 ERROR: status = -5</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:28:01 bbflgrid11 kernel:
(4912,1):o2hb_bio_end_io:331 ERROR: IO Error -5</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:28:01 bbflgrid11 kernel:
(4912,1):o2hb_do_disk_heartbeat:973 ERROR: status = -5</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:28:01 bbflgrid11 multipathd: 65:176:
readsector0 checker reports path is up</FONT> <BR><FONT face=sans-serif
size=2>Sep 21 21:28:01 bbflgrid11 multipathd: 65:176: reinstated</FONT>
<BR><FONT face=sans-serif size=2>Sep 21 21:28:01 bbflgrid11 multipathd:
mpath1: remaining active paths: 2</FONT> <BR><FONT face=sans-serif size=2>Sep
21 21:28:03 bbflgrid11 kernel: (4912,1):o2hb_bio_end_io:331 ERROR: IO Error
-5</FONT> <BR><FONT face=sans-serif size=2>Sep 21 21:28:03 bbflgrid11 kernel:
(4912,1):o2hb_do_disk_heartbeat:973 ERROR: status = -5</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:28:03 bbflgrid11 kernel:
(4912,1):o2hb_bio_end_io:331 ERROR: IO Error -5</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:28:03 bbflgrid11 kernel:
(4912,1):o2hb_do_disk_heartbeat:973 ERROR: status = -5</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:28:05 bbflgrid11 kernel:
(4912,1):o2hb_bio_end_io:331 ERROR: IO Error -5</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:28:05 bbflgrid11 kernel:
(4912,1):o2hb_do_disk_heartbeat:973 ERROR: status = -5</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:28:05 bbflgrid11 kernel:
(4912,1):o2hb_bio_end_io:331 ERROR: IO Error -5</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:28:05 bbflgrid11 kernel:
(4912,1):o2hb_do_disk_heartbeat:973 ERROR: status = -5</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:28:07 bbflgrid11 kernel:
(4912,1):o2hb_bio_end_io:331 ERROR: IO Error -5</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:28:07 bbflgrid11 kernel:
(4912,1):o2hb_do_disk_heartbeat:973 ERROR: status = -5</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:28:07 bbflgrid11 kernel:
(4912,1):o2hb_bio_end_io:331 ERROR: IO Error -5</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:28:07 bbflgrid11 kernel:
(4912,1):o2hb_do_disk_heartbeat:973 ERROR: status = -5</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:28:09 bbflgrid11 kernel:
(4912,1):o2hb_bio_end_io:331 ERROR: IO Error -5</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:28:09 bbflgrid11 kernel:
(4912,1):o2hb_do_disk_heartbeat:973 ERROR: status = -5</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:28:09 bbflgrid11 kernel:
(4912,1):o2hb_bio_end_io:331 ERROR: IO Error -5</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:28:09 bbflgrid11 kernel:
(4912,1):o2hb_do_disk_heartbeat:973 ERROR: status = -5</FONT> <BR><FONT
face=sans-serif size=2>Sep 21 21:28:09 bbflgrid11 multipathd: 8:208:
readsector0 checker reports path is up</FONT> <BR><FONT face=sans-serif
size=2>Sep 21 21:28:09 bbflgrid11 multipathd: 8:208: reinstated</FONT>
<BR><FONT face=sans-serif size=2>Sep 21 21:28:09 bbflgrid11 multipathd:
mpath2: remaining active paths: 1</FONT> <BR><FONT face=sans-serif size=2>Sep
21 21:28:10 bbflgrid11 multipathd: 65:192: readsector0 checker reports path is
up</FONT> <BR><FONT face=sans-serif size=2>Sep 21 21:28:10 bbflgrid11
multipathd: 65:192: reinstated</FONT> <BR><FONT face=sans-serif size=2>Sep 21
21:28:10 bbflgrid11 multipathd: mpath2: remaining active paths: 2</FONT>
<BR><BR><BR><FONT face=sans-serif size=2>...</FONT> <BR><FONT face=sans-serif
size=2>Index 14: took 0 ms to do submit_bio for read</FONT> <BR><FONT
face=sans-serif size=2>Index 15: took 0 ms to do waiting for read
completion</FONT> <BR><FONT face=sans-serif
size=2>(11,1):o2hb_stop_all_regions:1908 ERROR: stopping heartbeat on all
active regions</FONT> <BR><FONT face=sans-serif size=2>Kernel panic - not
syncing: ocfs2 is very sorry to be fencing this system by
panicing</FONT> <BR><BR><BR><FONT face=sans-serif size=2>It seems that if I wait
for node 1 to heartbeat to node 2 with o2cb down before rebooting, it's
fine, but if I reboot before node 1 has had a chance to heartbeat to node 2
with o2cb down, it panics.</FONT> <BR><BR><FONT face=sans-serif
size=2><BR><BR>Shawn E. Ruff<BR>Senior Oracle DBA<BR>Fiberlink
Communications<BR><BR></FONT>
<P>
<HR>
<P></P>_______________________________________________<BR>Ocfs2-users mailing
list<BR>Ocfs2-users@oss.oracle.com<BR>http://oss.oracle.com/mailman/listinfo/ocfs2-users<BR></BLOCKQUOTE></BODY></HTML>