<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
<tt>On 12/12/2010 11:58 PM, frank wrote:</tt>
<blockquote cite="mid:4D05D219.9010109@si.ct.upc.edu" type="cite">
<tt> After that, all operations on the nodes froze; we could not log in
either.<br>
<br>
Node 0 kept logging messages like these until it stopped
"message" logging at 10:49: </tt> <tt><br>
<br>
<i>Dec 4 10:49:34 heraclito kernel:
(sendmail,19074,6):ocfs2_inode_lock_full:2121 ERROR: status =
-22<br>
Dec 4 10:49:34 heraclito kernel:
(sendmail,19074,6):_ocfs2_statfs:1266 ERROR: status = -22<br>
Dec 4 10:49:34 heraclito kernel:
(sendmail,19074,6):dlm_send_remote_convert_request:393 ERROR:
dlm status = DLM_IVLOCKID<br>
Dec 4 10:49:34 heraclito kernel:
(sendmail,19074,6):dlmconvert_remote:327 ERROR: dlm status =
DLM_IVLOCKID<br>
Dec 4 10:49:34 heraclito kernel:
(sendmail,19074,6):ocfs2_cluster_lock:1258 ERROR: DLM error
DLM_IVLOCKID while calling dlmlock on resource
M000000000000000000000b6f931666: bad lockid</i></tt></blockquote>
<br>
Node 0 is trying to upconvert the lock level.<br>
<br>
<blockquote cite="mid:4D05D219.9010109@si.ct.upc.edu" type="cite"> <tt>Node
1 kept logging messages like these until it stopped "message"
logging at 10:00: </tt> <tt><br>
<br>
<i>Dec 4 10:00:20 parmenides kernel:
(o2net,10545,14):dlm_convert_lock_handler:489 ERROR: did not
find lock to convert on grant queue! cookie=0:6<br>
Dec 4 10:00:20 parmenides kernel: lockres:
M000000000000000000000b6f931666, owner=1, state=0<br>
Dec 4 10:00:20 parmenides kernel: last used: 0, refcnt: 4,
on purge list: no<br>
Dec 4 10:00:20 parmenides kernel: on dirty list: no, on
reco list: no, migrating pending: no<br>
Dec 4 10:00:20 parmenides kernel: inflight locks: 0, asts
reserved: 0<br>
Dec 4 10:00:20 parmenides kernel: refmap nodes: [ 0 ],
inflight=0<br>
Dec 4 10:00:20 parmenides kernel: granted queue:<br>
Dec 4 10:00:20 parmenides kernel: type=5, conv=-1,
node=1, cookie=1:6, ref=2, ast=(empty=y,pend=n),
bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)<br>
Dec 4 10:00:20 parmenides kernel: converting queue:<br>
Dec 4 10:00:20 parmenides kernel: type=0, conv=3, node=0,
cookie=0:6, ref=2, ast=(empty=y,pend=n),
bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)<br>
Dec 4 10:00:20 parmenides kernel: blocked queue:</i></tt> <tt><br>
</tt></blockquote>
<br>
Node 1 does not find that lock (cookie=0:6) on the granted queue
because it is already on the<br>
converting queue. Do you have the very first error messages that
both nodes logged<br>
for this resource?<br>
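<br>
For reading the lockres dump above, here is a small sketch that turns the
numeric type/conv fields into lock mode names. It assumes the standard o2dlm
mode numbering (IV=-1, NL=0, CR=1, CW=2, PR=3, PW=4, EX=5); the function name
and the returned dictionary keys are made up for illustration and are not
part of any OCFS2 tool.<br>
<pre>
```python
# Lock mode numbers, assuming the o2dlm LKM_* numbering; -1 means "no mode".
LOCK_MODES = {-1: "IV", 0: "NL", 1: "CR", 2: "CW", 3: "PR", 4: "PW", 5: "EX"}


def parse_lock_entry(entry):
    """Decode the leading 'key=value' fields of one queue entry.

    Hypothetical helper: 'entry' is the text after 'kernel:' in the dump,
    with fields separated by ', '.
    """
    fields = {}
    for part in entry.split(", "):
        key, sep, value = part.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return {
        "held": LOCK_MODES[int(fields["type"])],       # mode currently held
        "requested": LOCK_MODES[int(fields["conv"])],  # mode being converted to
        "node": int(fields["node"]),
        "cookie": fields["cookie"],
    }


# The two entries from the dump on node 1 (trailing ast/bast fields omitted).
granted = parse_lock_entry("type=5, conv=-1, node=1, cookie=1:6, ref=2")
converting = parse_lock_entry("type=0, conv=3, node=0, cookie=0:6, ref=2")

print("granted:   ", granted)
print("converting:", converting)
```
</pre>
Read this way, node 1 holds the lock at EX, while node 0 holds it at NL and
is asking for PR, which matches the upconvert seen from node 0.<br>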
<br>
Also, this is definitely a system object. Can you list the system
directory?<br>
<tt># debugfs.ocfs2 -R "ls -l //" /dev/sdX</tt><br>
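<br>
As a rough illustration of how the resource name can be read, the sketch
below splits it into its parts. It assumes the usual OCFS2 lock name layout
of one type character, six pad characters, a 16-digit hex block number and an
8-digit hex generation; the function name is hypothetical.<br>
<pre>
```python
def decode_lock_name(name):
    """Split an OCFS2 lock resource name into its fields.

    Assumed layout (31 characters): type char, 6 pad chars,
    16 hex digits of block number, 8 hex digits of generation.
    """
    if len(name) != 31:
        raise ValueError("expected a 31-character lock name")
    return {
        "type": name[0],                     # 'M' is the metadata lock type
        "pad": name[1:7],
        "blkno": int(name[7:23], 16),        # inode block number
        "generation": int(name[23:31], 16),  # inode generation
    }


# Resource name taken from the log messages above.
info = decode_lock_name("M000000000000000000000b6f931666")
print(info["type"], info["blkno"], hex(info["generation"]))
```
</pre>
Under that assumption the lock is a metadata lock on block 11. A block
number that low is consistent with an inode in the system directory, and the
listing from debugfs.ocfs2 should show which one it is.<br>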
<br>
<blockquote cite="mid:4D05D219.9010109@si.ct.upc.edu" type="cite"><tt><br>
We rebooted both nodes at 13:03 and recovered services as
usual, with no further problems.</tt> <tt><br>
</tt></blockquote>
<br>
</body>
</html>