[Ocfs2-devel] DLM_IVLOCKID result in file removed failed, issue calls for talks.

Guozhonghua guozhonghua at h3c.com
Sat Oct 25 00:06:07 PDT 2014


Hi, All,

We test the OCFS2 and there is one issue report.
We remove one file or move it with another name in the path mounted with OCFS2 device, one node request the EX type node of inode 531, orphan_dir:0011 file.
But the owner cannot find the lock from its local lock list, so it return DLM_IVLOCKID and the file remove operation failed.

We remove the file on node cvk12, and its debug log recorded in syslog.

Oct 24 19:09:07 cvk12 kernel: [259621.224260] (rm,17362,4):ocfs2_should_refresh_lock_res:2155 status 0
Oct 24 19:09:07 cvk12 kernel: [259621.224280] (rm,17362,4):ocfs2_inode_lock_full_nested:2298 inode 1740044, take EXMODE META lock
Oct 24 19:09:07 cvk12 kernel: [259621.224283] (rm,17362,4):ocfs2_should_refresh_lock_res:2155 status 0
Oct 24 19:09:07 cvk12 kernel: [259621.224290] (rm,17362,4):ocfs2_inode_lock_full_nested:2298 inode 531, take EXMODE META lock
Oct 24 19:09:07 cvk12 kernel: [259621.224293] (rm,17362,4):__ocfs2_cluster_lock:1465 lockres M000000000000000000021320c6ec4e, convert from 3 to 5
Oct 24 19:09:07 cvk12 kernel: [259621.224297] (rm,17362,4):dlmconvert_remote:267 type=3, convert_type=-1, busy=0
Oct 24 19:09:07 cvk12 kernel: [259621.224301] (rm,17362,4):dlm_send_remote_convert_request:353 M000000000000000000021320c6ec4e
Oct 24 19:09:07 cvk12 kernel: [259621.224337] (rm,17362,4):o2net_send_message_vec:1179 [mag 64085 len 80 typ 504 stat 0 sys_stat 0 key eec5710a num 0] sending returned 0
Oct 24 19:09:07 cvk12 kernel: [259621.224834] (swapper/0,0,0):o2net_data_ready:626 [sc ffff88061bc34c00 refs 4 sock ffff880619128a00 node 1 page ffffea0017a4ee00 pg_off 0] data_ready hit
Oct 24 19:09:07 cvk12 kernel: [259621.224842] (swapper/0,0,0):sc_get:427 [sc ffff88061bc34c00 refs 4 sock ffff880619128a00 node 1 page ffffea0017a4ee00 pg_off 0] get
Oct 24 19:09:07 cvk12 kernel: [259621.224877] (kworker/u129:0,43499,1):o2net_advance_rx:1422 [sc ffff88061bc34c00 refs 5 sock ffff880619128a00 node 1 page ffffea0017a4ee00 pg_off 0] receiving
Oct 24 19:09:07 cvk12 kernel: [259621.224895] (kworker/u129:0,43499,1):o2net_advance_rx:1471 [mag 64086 len 0 typ 504 stat 11 sys_stat 0 key eec5710a num 0] at page_off 24
Oct 24 19:09:07 cvk12 kernel: [259621.224898] (kworker/u129:0,43499,1):o2net_process_message:1258 [mag 64086 len 0 typ 504 stat 11 sys_stat 0 key eec5710a num 0] processing message
Oct 24 19:09:07 cvk12 kernel: [259621.224901] (kworker/u129:0,43499,1):sc_put:421 [sc ffff88061bc34c00 refs 5 sock ffff880619128a00 node 1 page ffffea0017a4ee00 pg_off 24] put
Oct 24 19:09:07 cvk12 kernel: [259621.224910] (kworker/u129:0,43499,1):sc_get:427 [sc ffff88061bc34c00 refs 4 sock ffff880619128a00 node 1 page ffffea0017a4ee00 pg_off 24] get
Oct 24 19:09:07 cvk12 kernel: [259621.224919] (kworker/u129:0,43499,1):o2net_advance_rx:1497 [sc ffff88061bc34c00 refs 5 sock ffff880619128a00 node 1 page ffffea0017a4ee00 pg_off 0] ret = 1
Oct 24 19:09:07 cvk12 kernel: [259621.224923] (kworker/u129:0,43499,1):o2net_advance_rx:1422 [sc ffff88061bc34c00 refs 5 sock ffff880619128a00 node 1 page ffffea0017a4ee00 pg_off 0] receiving
Oct 24 19:09:07 cvk12 kernel: [259621.224926] (kworker/u129:0,43499,1):o2net_advance_rx:1497 [sc ffff88061bc34c00 refs 5 sock ffff880619128a00 node 1 page ffffea0017a4ee00 pg_off 0] ret = -11
Oct 24 19:09:07 cvk12 kernel: [259621.224929] (kworker/u129:0,43499,1):sc_put:421 [sc ffff88061bc34c00 refs 5 sock ffff880619128a00 node 1 page ffffea0017a4ee00 pg_off 0] put
Oct 24 19:09:07 cvk12 kernel: [259621.224949] (rm,17362,4):o2net_send_message_vec:1199 woken, returning system status 0, user status 11
Oct 24 19:09:07 cvk12 kernel: [259621.224954] (rm,17362,4):sc_put:421 [sc ffff88061bc34c00 refs 4 sock ffff880619128a00 node 1 page ffffea0017a4ee00 pg_off 0] put
Oct 24 19:09:07 cvk12 kernel: [259621.224957] (rm,17362,4):dlm_send_remote_convert_request:388 ERROR: dlm status = DLM_IVLOCKID
Oct 24 19:09:07 cvk12 kernel: [259621.224965] (rm,17362,4):dlmconvert_remote:322 ERROR: dlm status = DLM_IVLOCKID
Oct 24 19:09:07 cvk12 kernel: [259621.224970] (rm,17362,4):__ocfs2_cluster_lock:1479 ERROR: DLM error -22 while calling ocfs2_dlm_lock on resource M000000000000000000021320c6ec4e
Oct 24 19:09:07 cvk12 kernel: [259621.224975] (rm,17362,4):ocfs2_inode_lock_full_nested:2326 ERROR: status = -22
Oct 24 19:09:07 cvk12 kernel: [259621.224980] (rm,17362,4):ocfs2_lookup_lock_orphan_dir:1890 ERROR: status = -22
Oct 24 19:09:07 cvk12 kernel: [259621.224984] (rm,17362,4):ocfs2_prepare_orphan_dir:1953 ERROR: status = -22
Oct 24 19:09:07 cvk12 kernel: [259621.224996] (rm,17362,4):ocfs2_unlink:870 ERROR: status = -22
Oct 24 19:09:07 cvk12 kernel: [259621.225001] (rm,17362,4):ocfs2_inode_unlock:2475 inode 1740044 drop EXMODE META lock
Oct 24 19:09:07 cvk12 kernel: [259621.225003] (rm,17362,4):ocfs2_inode_unlock:2475 inode 513 drop EXMODE META lock
Oct 24 19:09:07 cvk12 kernel: [259621.225006] (rm,17362,4):ocfs2_unlink:948 ERROR: status = -22

Another node cvknode66 response the reqeust.
It cannot find the lock and return DLM_IVLOCKID, so as removing operation failed.

Oct 24 19:09:01 cvknode66 CRON[4228]: (CRON) info (No MTA installed, discarding output)
Oct 24 19:09:04 cvknode66 kernel: [375654.198219] (kworker/u129:2,55238,17):dlm_convert_lock_handler:485 ERROR: did not find lock to convert on grant queue! cookie=2:14
Oct 24 19:09:04 cvknode66 kernel: [375654.198228] lockres: M000000000000000000021320c6ec4e, owner=1, state=0
Oct 24 19:09:04 cvknode66 kernel: [375654.198242]   last used: 0, refcnt: 26, on purge list: no
Oct 24 19:09:04 cvknode66 kernel: [375654.198243]   on dirty list: no, on reco list: no, migrating pending: no
Oct 24 19:09:04 cvknode66 kernel: [375654.198244]   inflight locks: 0, asts reserved: 0
Oct 24 19:09:04 cvknode66 kernel: [375654.198245]   refmap nodes: [ 2 3 8 11 12 13 14 15 16 17 18 20 21 22 23 26 27 72 109 110 112 115 116 117 118 119 120 ], inflight=0
Oct 24 19:09:04 cvknode66 kernel: [375654.198256]   granted queue:
Oct 24 19:09:04 cvknode66 kernel: [375654.198259]     type=3, conv=-1, node=1, cookie=1:833, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
Oct 24 19:09:04 cvknode66 kernel: [375654.198260] 050000010000000000000000000000001511ffffff9409ffffffc00000001511ffffffe5ffffffe9ffffffce483f601511ffffffe5ffffffe9ffffffce483f600000000000000f3841ffffffed00020000000020ffffffc6ffffffec4e00000000
Oct 24 19:09:04 cvknode66 kernel: [375654.198283]     type=3, conv=-1, node=8, cookie=8:8541, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
Oct 24 19:09:04 cvknode66 kernel: [375654.198286]     type=3, conv=-1, node=17, cookie=17:2421, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
Oct 24 19:09:04 cvknode66 kernel: [375654.198288]     type=3, conv=-1, node=23, cookie=23:3170, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
Oct 24 19:09:04 cvknode66 kernel: [375654.198290]     type=3, conv=-1, node=21, cookie=21:2909, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
Oct 24 19:09:04 cvknode66 kernel: [375654.198292]     type=3, conv=-1, node=11, cookie=11:14, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
Oct 24 19:09:04 cvknode66 kernel: [375654.198294]     type=3, conv=-1, node=20, cookie=20:3086, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
Oct 24 19:09:04 cvknode66 kernel: [375654.198296]     type=3, conv=-1, node=13, cookie=13:2911, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
Oct 24 19:09:04 cvknode66 kernel: [375654.198297]     type=3, conv=-1, node=15, cookie=15:2875, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
Oct 24 19:09:04 cvknode66 kernel: [375654.198299]     type=3, conv=-1, node=26, cookie=26:2357, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
Oct 24 19:09:04 cvknode66 kernel: [375654.198301]     type=3, conv=-1, node=72, cookie=72:993, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
Oct 24 19:09:04 cvknode66 kernel: [375654.198303]     type=3, conv=-1, node=27, cookie=27:3056, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
Oct 24 19:09:04 cvknode66 kernel: [375654.198305]     type=3, conv=-1, node=119, cookie=119:67, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
Oct 24 19:09:04 cvknode66 kernel: [375654.198307]     type=3, conv=-1, node=110, cookie=110:3669, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
Oct 24 19:09:04 cvknode66 kernel: [375654.198309]     type=3, conv=-1, node=115, cookie=115:2585, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
Oct 24 19:09:04 cvknode66 kernel: [375654.198311]     type=3, conv=-1, node=118, cookie=118:2393, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
Oct 24 19:09:04 cvknode66 kernel: [375654.198313]     type=3, conv=-1, node=109, cookie=109:2849, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
Oct 24 19:09:04 cvknode66 kernel: [375654.198315]     type=3, conv=-1, node=22, cookie=22:2820, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
Oct 24 19:09:04 cvknode66 kernel: [375654.198316]     type=3, conv=-1, node=12, cookie=12:2636, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
Oct 24 19:09:04 cvknode66 kernel: [375654.198318]     type=3, conv=-1, node=112, cookie=112:600, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
Oct 24 19:09:04 cvknode66 kernel: [375654.198320] 050000010000000000000000000000001511ffffff9409ffffffc00000001511ffffffe5ffffffe9ffffffce483f601511ffffffe5ffffffe9ffffffce483f600000000000000f3841ffffffed00020000000020ffffffc6ffffffec4e00000000
Oct 24 19:09:04 cvknode66 kernel: [375654.198342]     type=3, conv=-1, node=14, cookie=14:2424, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
Oct 24 19:09:04 cvknode66 kernel: [375654.198343] 050000010000000000000000000000001511ffffff9409ffffffc00000001511ffffffe5ffffffe9ffffffce483f601511ffffffe5ffffffe9ffffffce483f600000000000000f3841ffffffed00020000000020ffffffc6ffffffec4e00000000
Oct 24 19:09:04 cvknode66 kernel: [375654.198365]     type=3, conv=-1, node=120, cookie=120:968, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
Oct 24 19:09:04 cvknode66 kernel: [375654.198366] 050000010000000000000000000000001511ffffff9409ffffffc00000001511ffffffe5ffffffe9ffffffce483f601511ffffffe5ffffffe9ffffffce483f600000000000000f3841ffffffed00020000000020ffffffc6ffffffec4e00000000
Oct 24 19:09:04 cvknode66 kernel: [375654.198388]     type=3, conv=-1, node=3, cookie=3:3395, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
Oct 24 19:09:04 cvknode66 kernel: [375654.198389] 050000010000000000000000000000001511ffffff9409ffffffc00000001511ffffffe5ffffffe9ffffffce483f601511ffffffe5ffffffe9ffffffce483f600000000000000f3841ffffffed00020000000020ffffffc6ffffffec4e00000000
Oct 24 19:09:04 cvknode66 kernel: [375654.198411]     type=3, conv=-1, node=16, cookie=16:2256, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
Oct 24 19:09:04 cvknode66 kernel: [375654.198412] 050000010000000000000000000000001511ffffff9409ffffffc00000001511ffffffe5ffffffe9ffffffce483f601511ffffffe5ffffffe9ffffffce483f600000000000000f3841ffffffed00020000000020ffffffc6ffffffec4e00000000
Oct 24 19:09:04 cvknode66 kernel: [375654.198446]   converting queue:
Oct 24 19:09:04 cvknode66 kernel: [375654.198447]   blocked queue:

We reviewed the code, and think that the DLM message may be lost in the cluster which cause the operation failed.
The reason that lock info lost on the owner, we did not know.
Is there any other way to avoid the issue, or why this scenario created?

Thanks a lot.


-------------------------------------------------------------------------------------------------------------------------------------
本邮件及其附件含有杭州华三通信技术有限公司的保密信息,仅限于发送给上面地址中列出
的个人或群组。禁止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、
或散发)本邮件中的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本
邮件!
This e-mail and its attachments contain confidential information from H3C, which is
intended only for the person or entity whose address is listed above. Any use of the
information contained herein in any way (including, but not limited to, total or partial
disclosure, reproduction, or dissemination) by persons other than the intended
recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender
by phone or email immediately and delete it!


More information about the Ocfs2-devel mailing list