<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
On 13/12/10 20:49, Sunil Mushran wrote:
<blockquote cite="mid:4D0678CC.10506@oracle.com" type="cite">
<tt>On 12/12/2010 11:58 PM, frank wrote:</tt>
<blockquote cite="mid:4D05D219.9010109@si.ct.upc.edu" type="cite">
<tt> After that, all operations on the nodes froze; we could not log in
either.<br>
<br>
Node 0 kept logging this kind of message until "message" logging
stopped at 10:49: </tt> <tt><br>
<br>
<i>Dec 4 10:49:34 heraclito kernel:
(sendmail,19074,6):ocfs2_inode_lock_full:2121 ERROR: status
= -22<br>
Dec 4 10:49:34 heraclito kernel:
(sendmail,19074,6):_ocfs2_statfs:1266 ERROR: status = -22<br>
Dec 4 10:49:34 heraclito kernel:
(sendmail,19074,6):dlm_send_remote_convert_request:393
ERROR: dlm status = DLM_IVLOCKID<br>
Dec 4 10:49:34 heraclito kernel:
(sendmail,19074,6):dlmconvert_remote:327 ERROR: dlm status =
DLM_IVLOCKID<br>
Dec 4 10:49:34 heraclito kernel:
(sendmail,19074,6):ocfs2_cluster_lock:1258 ERROR: DLM error
DLM_IVLOCKID while calling dlmlock on resource
M000000000000000000000b6f931666: bad lockid</i></tt></blockquote>
<br>
Node 0 is trying to upconvert the lock level.<br>
<br>
<blockquote cite="mid:4D05D219.9010109@si.ct.upc.edu" type="cite">
<tt>Node 1 kept logging this kind of message until "message" logging
stopped at 10:00: </tt> <tt><br>
<br>
<i>Dec 4 10:00:20 parmenides kernel:
(o2net,10545,14):dlm_convert_lock_handler:489 ERROR: did not
find lock to convert on grant queue! cookie=0:6<br>
Dec 4 10:00:20 parmenides kernel: lockres:
M000000000000000000000b6f931666, owner=1, state=0<br>
Dec 4 10:00:20 parmenides kernel: last used: 0, refcnt:
4, on purge list: no<br>
Dec 4 10:00:20 parmenides kernel: on dirty list: no, on
reco list: no, migrating pending: no<br>
Dec 4 10:00:20 parmenides kernel: inflight locks: 0, asts
reserved: 0<br>
Dec 4 10:00:20 parmenides kernel: refmap nodes: [ 0 ],
inflight=0<br>
Dec 4 10:00:20 parmenides kernel: granted queue:<br>
Dec 4 10:00:20 parmenides kernel: type=5, conv=-1,
node=1, cookie=1:6, ref=2, ast=(empty=y,pend=n),
bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)<br>
Dec 4 10:00:20 parmenides kernel: converting queue:<br>
Dec 4 10:00:20 parmenides kernel: type=0, conv=3,
node=0, cookie=0:6, ref=2, ast=(empty=y,pend=n),
bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)<br>
Dec 4 10:00:20 parmenides kernel: blocked queue:</i></tt>
<tt><br>
</tt></blockquote>
<br>
Node 1 does not find that lock in the granted queue because that
lock is in the converting queue. Do you have the very first error
message on both nodes relating to this resource?<br>
</blockquote>
Here they are:<br>
<br>
Node 0:<br>
<tt>Dec 4 09:15:06 heraclito kernel: o2net: connection to node
parmenides (num 1) at 192.168.1.2:7777 has been idle for 30.0
seconds, shutting it down.<br>
Dec 4 09:15:06 heraclito kernel:
(swapper,0,7):o2net_idle_timer:1503 here are some times that might
help debug the situation: (tmr 1291450476.228826
now 1291450506.229456 dr 1291450476.228760 adv
1291450476.228842:1291450476.228843 func (de6e01eb:500)
1291450476.228827:1291450476.228829)<br>
Dec 4 09:15:06 heraclito kernel: o2net: no longer connected to
node parmenides (num 1) at 192.168.1.2:7777<br>
Dec 4 09:15:06 heraclito kernel:
(vzlist,22622,7):dlm_send_remote_convert_request:395 ERROR: status
= -112<br>
Dec 4 09:15:06 heraclito kernel:
(snmpd,16452,10):dlm_send_remote_convert_request:395 ERROR: status
= -112<br>
Dec 4 09:15:06 heraclito kernel:
(snmpd,16452,10):dlm_wait_for_node_death:370
0D3E49EB1F614A3EAEC0E2A74A34AFFF: waiting 5000ms for notification
of death of node 1<br>
Dec 4 09:15:06 heraclito kernel:
(httpd,4615,10):dlm_do_master_request:1334 ERROR: link to 1 went
down!<br>
Dec 4 09:15:06 heraclito kernel:
(httpd,4615,10):dlm_get_lock_resource:917 ERROR: status = -112<br>
Dec 4 09:15:06 heraclito kernel:
(python,20750,10):dlm_do_master_request:1334 ERROR: link to 1 went
down!<br>
Dec 4 09:15:06 heraclito kernel:
(python,20750,10):dlm_get_lock_resource:917 ERROR: status = -112<br>
Dec 4 09:15:06 heraclito kernel:
(vzlist,22622,7):dlm_wait_for_node_death:370
0D3E49EB1F614A3EAEC0E2A74A34AFFF: waiting 5000ms for notification
of death of node 1<br>
Dec 4 09:15:06 heraclito kernel: o2net: accepted connection from
node parmenides (num 1) at 192.168.1.2:7777<br>
Dec 4 09:15:11 heraclito kernel:
(snmpd,16452,5):dlm_send_remote_convert_request:393 ERROR: dlm
status = DLM_IVLOCKID<br>
Dec 4 09:15:11 heraclito kernel:
(snmpd,16452,5):dlmconvert_remote:327 ERROR: dlm status =
DLM_IVLOCKID<br>
Dec 4 09:15:11 heraclito kernel:
(snmpd,16452,5):ocfs2_cluster_lock:1258 ERROR: DLM error
DLM_IVLOCKID while calling dlmlock on resource
M000000000000000000000b6f931666: bad lockid</tt><br>
<br>
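As a reading aid for the node 0 log above, here is a minimal sketch that names the negative status codes seen in these logs and checks the idle interval reported by o2net. The errno table is hardcoded from the Linux values, and the helper name <tt>status_name</tt> is made up for illustration. It also assumes that <tt>tmr</tt> is the time the idle timer was last armed and <tt>now</tt> is the time it fired.<br>
<pre>
```python
from decimal import Decimal

# Linux errno numbers, hardcoded so the result does not depend on the host OS.
LINUX_ERRNO = {22: "EINVAL", 107: "ENOTCONN", 112: "EHOSTDOWN"}


def status_name(status):
    """Map a negative kernel status (e.g. -112) to its errno name.

    Hypothetical helper, not part of any OCFS2 tool.
    """
    return LINUX_ERRNO.get(-status, "unknown")


# Timestamps copied from the o2net_idle_timer line logged on node 0.
# Assumption: tmr = timer armed, now = timer fired.
tmr = Decimal("1291450476.228826")
now = Decimal("1291450506.229456")
idle_seconds = now - tmr

for status in (-22, -107, -112):
    print(status, status_name(status))
print("idle for", idle_seconds, "seconds")
```
</pre>
Under those assumptions the interval comes out at just over 30 seconds, which agrees with the "idle for 30.0 seconds" message, and the -112 errors correspond to the link to the other node being reported as down.<br>
<br>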
Node 1:<br>
<tt>Dec 4 09:15:06 parmenides kernel: o2net: connection to node
heraclito (num 0) at 192.168.1.3:7777 has been idle for 30.0
seconds, shutting it down.<br>
Dec 4 09:15:06 parmenides kernel:
(swapper,0,9):o2net_idle_timer:1503 here are some times that might
help debug the situation: (tmr 1291450476.231519
now 1291450506.232462 dr 1291450476.231506 adv
1291450476.231522:1291450476.231522 func (de6e01eb:505)
1291450475.650496:1291450475.650501)<br>
Dec 4 09:15:06 parmenides kernel: o2net: no longer connected to
node heraclito (num 0) at 192.168.1.3:7777<br>
Dec 4 09:15:06 parmenides kernel:
(snmpd,12342,11):dlm_do_master_request:1334 ERROR: link to 0 went
down!<br>
Dec 4 09:15:06 parmenides kernel:
(minilogd,12700,0):dlm_wait_for_lock_mastery:1117 ERROR: status =
-112<br>
Dec 4 09:15:06 parmenides kernel:
(smbd,25555,12):dlm_do_master_request:1334 ERROR: link to 0 went
down!<br>
Dec 4 09:15:06 parmenides kernel:
(python,12439,9):dlm_do_master_request:1334 ERROR: link to 0 went
down!<br>
Dec 4 09:15:06 parmenides kernel:
(python,12439,9):dlm_get_lock_resource:917 ERROR: status = -112<br>
Dec 4 09:15:06 parmenides kernel:
(smbd,25555,12):dlm_get_lock_resource:917 ERROR: status = -112<br>
Dec 4 09:15:06 parmenides kernel:
(minilogd,12700,0):dlm_do_master_request:1334 ERROR: link to 0
went down!<br>
Dec 4 09:15:06 parmenides kernel:
(minilogd,12700,0):dlm_get_lock_resource:917 ERROR: status = -107<br>
Dec 4 09:15:06 parmenides kernel:
(dlm_thread,10627,4):dlm_drop_lockres_ref:2211 ERROR: status =
-112<br>
Dec 4 09:15:06 parmenides kernel:
(dlm_thread,10627,4):dlm_purge_lockres:206 ERROR: status = -112<br>
Dec 4 09:15:06 parmenides kernel: o2net: connected to node
heraclito (num 0) at 192.168.1.3:7777<br>
Dec 4 09:15:06 parmenides kernel:
(snmpd,12342,11):dlm_get_lock_resource:917 ERROR: status = -112<br>
Dec 4 09:15:11 parmenides kernel:
(o2net,10545,6):dlm_convert_lock_handler:489 ERROR: did not find
lock to convert on grant queue! cookie=0:6<br>
Dec 4 09:15:11 parmenides kernel: lockres:
M000000000000000000000b6f931666, owner=1, state=0<br>
Dec 4 09:15:11 parmenides kernel: last used: 0, refcnt: 4, on
purge list: no<br>
Dec 4 09:15:11 parmenides kernel: on dirty list: no, on reco
list: no, migrating pending: no<br>
Dec 4 09:15:11 parmenides kernel: inflight locks: 0, asts
reserved: 0<br>
Dec 4 09:15:11 parmenides kernel: refmap nodes: [ 0 ],
inflight=0<br>
Dec 4 09:15:11 parmenides kernel: granted queue:<br>
Dec 4 09:15:11 parmenides kernel: type=5, conv=-1, node=1,
cookie=1:6, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n),
pending=(conv=n,lock=n,cancel=n,unlock=n)<br>
Dec 4 09:15:11 parmenides kernel: converting queue:<br>
Dec 4 09:15:11 parmenides kernel: type=0, conv=3, node=0,
cookie=0:6, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n),
pending=(conv=n,lock=n,cancel=n,unlock=n)<br>
Dec 4 09:15:11 parmenides kernel: blocked queue:</tt><br>
<br>
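To illustrate the queue dump above, here is a toy model of the lock resource as node 1 printed it. It is only a sketch: the dictionary layout and the functions <tt>handle_convert</tt> and <tt>find_queue</tt> are invented for illustration and are not the kernel's code. It assumes the convert handler looks for the requesting lock on the granted queue only.<br>
<pre>
```python
# Toy model of the lock resource dumped by node 1; not the kernel code.
lockres = {
    "granted": [{"node": 1, "cookie": "1:6", "type": 5, "conv": -1}],
    "converting": [{"node": 0, "cookie": "0:6", "type": 0, "conv": 3}],
    "blocked": [],
}


def handle_convert(res, node, cookie):
    """Assumed behaviour: search the granted queue only."""
    for lock in res["granted"]:
        if lock["node"] == node and lock["cookie"] == cookie:
            return "DLM_NORMAL"
    return "DLM_IVLOCKID"


def find_queue(res, node, cookie):
    """Report which queue, if any, actually holds the lock."""
    for queue_name, locks in res.items():
        for lock in locks:
            if lock["node"] == node and lock["cookie"] == cookie:
                return queue_name
    return None


# Node 0 asks again to convert its lock with cookie 0:6.
status = handle_convert(lockres, 0, "0:6")
where = find_queue(lockres, 0, "0:6")
print(status, where)
```
</pre>
In this model the request from node 0 is rejected with a bad lock id because its lock is already parked on the converting queue, which would match both the "did not find lock to convert on grant queue" message on node 1 and the DLM_IVLOCKID errors on node 0.<br>
<br>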
<blockquote cite="mid:4D0678CC.10506@oracle.com" type="cite"> <br>
Also, this is definitely a system object. Can you list the system
directory?<br>
# debugfs.ocfs2 -R "ls -l //" /dev/sdX<br>
<br>
</blockquote>
<tt># debugfs.ocfs2 -R "ls -l //" /dev/mapper/mpath2<br>
6 drwxr-xr-x 4 0 0 3896
19-Oct-2010 08:42 .<br>
6 drwxr-xr-x 4 0 0 3896
19-Oct-2010 08:42 ..<br>
7 -rw-r--r-- 1 0 0 0
19-Oct-2010 08:42 bad_blocks<br>
8 -rw-r--r-- 1 0 0 831488
19-Oct-2010 08:42 global_inode_alloc<br>
9 -rw-r--r-- 1 0 0 4096
19-Oct-2010 08:47 slot_map<br>
10 -rw-r--r-- 1 0 0 1048576
19-Oct-2010 08:42 heartbeat<br>
11 -rw-r--r-- 1 0 0 2199023255552
19-Oct-2010 08:42 global_bitmap<br>
12 drwxr-xr-x 2 0 0 12288
14-Dec-2010 08:58 orphan_dir:0000<br>
13 drwxr-xr-x 2 0 0 16384
14-Dec-2010 08:50 orphan_dir:0001<br>
14 -rw-r--r-- 1 0 0 1103101952
19-Oct-2010 08:42 extent_alloc:0000<br>
15 -rw-r--r-- 1 0 0 1103101952
19-Oct-2010 08:42 extent_alloc:0001<br>
16 -rw-r--r-- 1 0 0 14109638656
19-Oct-2010 08:42 inode_alloc:0000<br>
17 -rw-r--r-- 1 0 0 6673137664
19-Oct-2010 08:42 inode_alloc:0001<br>
18 -rw-r--r-- 1 0 0 268435456
19-Oct-2010 08:46 journal:0000<br>
19 -rw-r--r-- 1 0 0 268435456
19-Oct-2010 08:47 journal:0001<br>
20 -rw-r--r-- 1 0 0 0
19-Oct-2010 08:42 local_alloc:0000<br>
21 -rw-r--r-- 1 0 0 0
19-Oct-2010 08:42 local_alloc:0001<br>
22 -rw-r--r-- 1 0 0 0
19-Oct-2010 08:42 truncate_log:0000<br>
23 -rw-r--r-- 1 0 0 0
19-Oct-2010 08:42 truncate_log:0001</tt><br>
<br>
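The resource name from the logs can be matched against this listing with a short sketch. It assumes the usual OCFS2 lock-name layout of one type character, six pad zeros, 16 hex digits of block number and 8 hex digits of inode generation; the function name <tt>decode_lock_name</tt> is made up for illustration.<br>
<pre>
```python
def decode_lock_name(name):
    """Split a 31-character OCFS2 lock name into its fields.

    Assumed layout: type (1 char), pad (6 chars),
    block number (16 hex digits), generation (8 hex digits).
    """
    if len(name) != 31:
        raise ValueError("expected a 31-character lock name")
    return {
        "type": name[0],
        "pad": name[1:7],
        "blkno": int(name[7:23], 16),
        "generation": int(name[23:31], 16),
    }


# Resource name taken from the error messages on both nodes.
fields = decode_lock_name("M000000000000000000000b6f931666")
print(fields["type"], fields["blkno"], hex(fields["generation"]))
```
</pre>
Under that assumption the block number decodes to 11, which is the inode number of global_bitmap in the listing above.<br>
<br>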
Thanks once more for your help.<br>
Regards.<br>
<br>
Frank<br>
<br>
<br>
</body>
</html>