<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">This is due to a race in lock
mastery/purge. I recently fixed this problem but have not yet
submitted the patch to mainline. Please file a service request
with Oracle to get a one-off fix.<br>
<br>
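Separately, on the configuration question: a rough sanity check of
cluster.conf can be sketched in a few lines of Python. This is only an
illustration, not an o2cb tool. The function names parse_cluster_conf
and check_cluster_conf are made up for this example, and it assumes the
usual stanza layout shown in the quoted file ("node:" or "cluster:"
headers followed by "key = value" lines). It checks that node_count
matches the number of node stanzas and that node numbers, names and IP
addresses are unique.<br>
<pre>
```python
from collections import Counter


def parse_cluster_conf(text):
    """Parse stanza headers ("node:", "cluster:") and their
    "key = value" lines into a list of (kind, params) tuples."""
    stanzas = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":") and "=" not in line:
            # A new stanza starts; its parameters follow on later lines.
            stanzas.append((line[:-1], {}))
        elif "=" in line and stanzas:
            key, _, value = line.partition("=")
            stanzas[-1][1][key.strip()] = value.strip()
    return stanzas


def check_cluster_conf(text):
    """Return a list of problems found; an empty list means none."""
    stanzas = parse_cluster_conf(text)
    nodes = [params for kind, params in stanzas if kind == "node"]
    clusters = [params for kind, params in stanzas if kind == "cluster"]
    problems = []
    for cluster in clusters:
        declared = int(cluster.get("node_count", "-1"))
        members = [n for n in nodes if n.get("cluster") == cluster.get("name")]
        if declared != len(members):
            problems.append(
                "node_count is %d but %d node stanzas found"
                % (declared, len(members))
            )
    # Each of these fields is expected to be unique across nodes.
    for field in ("number", "name", "ip_address"):
        counts = Counter(n.get(field) for n in nodes)
        for value, seen in counts.items():
            if seen != 1:
                problems.append("duplicate %s: %s" % (field, value))
    return problems
```
</pre>
Passing the contents of /etc/ocfs2/cluster.conf to check_cluster_conf()
returns a list of messages. An empty list only means these particular
checks passed, not that the configuration is otherwise correct.<br>
<br>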
On 02/25/2013 10:07 PM, Ramappa, Ravi (NSN - IN/Bangalore) wrote:<br>
</div>
<blockquote
cite="mid:EB937501617BFC44BE90EDA24CA6A13A0117EB@SGSIMBX003.nsn-intra.net"
type="cite">
<font face="Calibri" size="2"><span style="font-size:11pt;">
<div>Hi,</div>
<div> </div>
<div>In a 13-node cluster, the first four nodes went into a
kernel panic. /var/log/messages contained the messages
below:</div>
<div> </div>
<div>Feb 25 22:02:46 prod152 kernel:
(o2net,9470,5):dlm_assert_master_handler:1817 ERROR: DIE!
Mastery assert from 4, but current owner is 2!
(O000000000000000c36706200000000)</div>
<div>Feb 25 22:02:46 prod152 kernel: lockres:
O000000000000000c36706200000000, owner=2, state=0</div>
<div>Feb 25 22:02:46 prod152 kernel: last used: 0, refcnt:
3, on purge list: no</div>
<div>Feb 25 22:02:46 prod152 kernel: on dirty list: no, on
reco list: no, migrating pending: no</div>
<div>Feb 25 22:02:46 prod152 kernel: inflight locks: 0, asts
reserved: 0</div>
<div>Feb 25 22:02:46 prod152 kernel: refmap nodes: [ ],
inflight=0</div>
<div>Feb 25 22:02:46 prod152 kernel: granted queue:</div>
<div>Feb 25 22:02:46 prod152 kernel: type=3, conv=-1,
node=2, cookie=2:222205, ref=2, ast=(empty=y,pend=n),
bast=(empty=y,pend=n),
pending=(conv=n,lock=n,cancel=n,unlock=n)</div>
<div>Feb 25 22:02:46 prod152 kernel: converting queue:</div>
<div>Feb 25 22:02:46 prod152 kernel: blocked queue:</div>
<div>Feb 25 22:02:46 prod152 kernel: ----------- [cut here ]
--------- [please bite here ] ---------</div>
<div>Feb 25 22:02:46 prod152 kernel: Kernel BUG at
.../build/BUILD/ocfs2-1.4.10/fs/ocfs2/dlm/dlmmaster.c:1819</div>
<div>Feb 25 22:02:46 prod152 kernel: invalid opcode: 0000 [1]
SMP</div>
<div>Feb 25 22:02:46 prod152 kernel: last sysfs file:
/block/cciss!c0d0/cciss!c0d0p1/stat</div>
<div>Feb 25 22:02:46 prod152 kernel: CPU 5</div>
<div>Feb 26 09:50:27 prod152 syslogd 1.4.1: restart.</div>
<div> </div>
<div>The kernel and OCFS2 RPM versions in use are as follows:</div>
<div> </div>
<div>[root@prod152 ~]# uname -r</div>
<div>2.6.18-308.1.1.el5</div>
<div> </div>
<div>[root@prod152 ~]# rpm -qa| grep ocfs</div>
<div>ocfs2-2.6.18-308.1.1.el5xen-1.4.10-1.el5</div>
<div>ocfs2-tools-devel-1.6.3-2.el5</div>
<div>ocfs2-2.6.18-308.1.1.el5-1.4.10-1.el5</div>
<div>ocfs2-tools-debuginfo-1.6.3-2.el5</div>
<div>ocfs2-tools-1.6.3-2.el5</div>
<div>ocfs2console-1.6.3-2.el5</div>
<div>ocfs2-2.6.18-308.1.1.el5debug-1.4.10-1.el5</div>
<div> </div>
<div>[root@prod152 ~]# cat /etc/ocfs2/cluster.conf</div>
<div>node:</div>
<div> ip_port = 7777</div>
<div> ip_address = 10.10.10.150</div>
<div> number = 0</div>
<div> name = prod150</div>
<div> cluster = ocfs2</div>
<div>node:</div>
<div> ip_port = 7777</div>
<div> ip_address = 10.10.10.151</div>
<div> number = 1</div>
<div> name = prod151</div>
<div> cluster = ocfs2</div>
<div>node:</div>
<div> ip_port = 7777</div>
<div> ip_address = 10.10.10.152</div>
<div> number = 2</div>
<div> name = prod152</div>
<div> cluster = ocfs2</div>
<div>node:</div>
<div> ip_port = 7777</div>
<div> ip_address = 10.10.10.153</div>
<div> number = 3</div>
<div> name = prod153</div>
<div> cluster = ocfs2</div>
<div>node:</div>
<div> ip_port = 7777</div>
<div> ip_address = 10.10.10.106</div>
<div> number = 4</div>
<div> name = prod106</div>
<div> cluster = ocfs2</div>
<div>node:</div>
<div> ip_port = 7777</div>
<div> ip_address = 10.10.10.107</div>
<div> number = 5</div>
<div> name = prod107</div>
<div> cluster = ocfs2</div>
<div>node:</div>
<div> ip_port = 7777</div>
<div> ip_address = 10.10.10.155</div>
<div> number = 6</div>
<div> name = prod155</div>
<div> cluster = ocfs2</div>
<div>node:</div>
<div> ip_port = 7777</div>
<div> ip_address = 10.10.10.156</div>
<div> number = 7</div>
<div> name = prod156</div>
<div> cluster = ocfs2</div>
<div>node:</div>
<div> ip_port = 7777</div>
<div> ip_address = 10.10.10.157</div>
<div> number = 8</div>
<div> name = prod157</div>
<div> cluster = ocfs2</div>
<div>node:</div>
<div> ip_port = 7777</div>
<div> ip_address = 10.10.10.158</div>
<div> number = 9</div>
<div> name = prod158</div>
<div> cluster = ocfs2</div>
<div>node:</div>
<div> ip_port = 7777</div>
<div> ip_address = 10.10.10.51</div>
<div> number = 10</div>
<div> name = prod51</div>
<div> cluster = ocfs2</div>
<div>node:</div>
<div> ip_port = 7777</div>
<div> ip_address = 10.10.10.52</div>
<div> number = 11</div>
<div> name = prod52</div>
<div> cluster = ocfs2</div>
<div>node:</div>
<div> ip_port = 7777</div>
<div> ip_address = 10.10.10.154</div>
<div> number = 12</div>
<div> name = prod154</div>
<div> cluster = ocfs2</div>
<div>cluster:</div>
<div> node_count =13</div>
<div> name = ocfs2</div>
<div>[root@prod152 ~]#</div>
<div> </div>
<div>Is this a known issue? Are there any problems with the configuration?</div>
<div> </div>
<div>Thanks,</div>
<div> </div>
<div>Ravi</div>
</span></font>
<pre wrap="">_______________________________________________
Ocfs2-users mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Ocfs2-users@oss.oracle.com">Ocfs2-users@oss.oracle.com</a>
<a class="moz-txt-link-freetext" href="https://oss.oracle.com/mailman/listinfo/ocfs2-users">https://oss.oracle.com/mailman/listinfo/ocfs2-users</a></pre>
</blockquote>
<br>
</body>
</html>