[Ocfs2-users] Dlm question
Charlie Sharkey
charlie.sharkey at bustech.com
Fri Feb 1 08:00:54 PST 2008
Hi,
I'm having some dlm issues on a system. It looks like
the scenario went something like:
19:45:29 --> 19:47:31 16 dlm locks are released using
o2dlm_unlock().
ocfs2 logs an error into /var/log/messages,
but returns
ok to the application
19:45:15 a dlm lock (o2dlm_lock()) is put on P00000
-- ok
19:49:37 lock on P00000 is released -- ok
19:49:40 a lock is attempted P00000. and the lock
fails. Returned error
is "Trylock failed"
Here is the data from /var/log/messages:
Jan 31 19:45:29 N1 kernel: (25033,1):dlmfs_unlink:512 ERROR: unlink
P50005, error -16 from destroy
Jan 31 19:45:43 N1 kernel: (25038,1):dlmfs_unlink:512 ERROR: unlink
P20010, error -16 from destroy
Jan 31 19:45:44 N1 kernel: (25030,1):dlmfs_unlink:512 ERROR: unlink
P20002, error -16 from destroy
Jan 31 19:45:59 N1 kernel: (25034,3):dlmfs_unlink:512 ERROR: unlink
P60006, error -16 from destroy
Jan 31 19:46:07 N1 kernel: (25043,0):dlmfs_unlink:512 ERROR: unlink
P70015, error -16 from destroy
Jan 31 19:46:08 N1 kernel: (25035,0):dlmfs_unlink:512 ERROR: unlink
P70007, error -16 from destroy
Jan 31 19:46:10 N1 kernel: (25041,1):dlmfs_unlink:512 ERROR: unlink
P50013, error -16 from destroy
Jan 31 19:46:25 N1 kernel: (25040,1):dlmfs_unlink:512 ERROR: unlink
P40012, error -16 from destroy
Jan 31 19:46:30 N1 kernel: (25042,1):dlmfs_unlink:512 ERROR: unlink
P60014, error -16 from destroy
Jan 31 19:46:30 N1 kernel: (25032,1):dlmfs_unlink:512 ERROR: unlink
P40004, error -16 from destroy
Jan 31 19:47:07 N1 kernel: (25028,1):dlmfs_unlink:512 ERROR: unlink
P00000, error -16 from destroy
Jan 31 19:47:08 N1 kernel: (25036,1):dlmfs_unlink:512 ERROR: unlink
P00008, error -16 from destroy
Jan 31 19:47:09 N1 kernel: (25029,1):dlmfs_unlink:512 ERROR: unlink
P10001, error -16 from destroy
Jan 31 19:47:19 N1 kernel: (25037,0):dlmfs_unlink:512 ERROR: unlink
P10009, error -16 from destroy
Jan 31 19:47:30 N1 kernel: (25039,1):dlmfs_unlink:512 ERROR: unlink
P30011, error -16 from destroy
Jan 31 19:47:31 N1 kernel: (25031,1):dlmfs_unlink:512 ERROR: unlink
P30003, error -16 from destroy
Here is data from the application dlm log file
01/31/2008 19:42:50 C000: Dlm Lock fd/id 150/P00000, returning: ok
01/31/2008 19:42:50 C001: Dlm Lock fd/id 152/P10001, returning: ok
01/31/2008 19:42:50 C002: Dlm Lock fd/id 154/P20002, returning: ok
01/31/2008 19:42:51 C003: Dlm Lock fd/id 156/P30003, returning: ok
01/31/2008 19:42:51 C004: Dlm Lock fd/id 158/P40004, returning: ok
01/31/2008 19:42:52 C005: Dlm Lock fd/id 160/P50005, returning: ok
01/31/2008 19:42:52 C006: Dlm Lock fd/id 162/P60006, returning: ok
01/31/2008 19:42:52 C007: Dlm Lock fd/id 164/P70007, returning: ok
01/31/2008 19:42:53 C008: Dlm Lock fd/id 166/P00008, returning: ok
01/31/2008 19:42:53 C009: Dlm Lock fd/id 168/P10009, returning: ok
01/31/2008 19:42:53 C00A: Dlm Lock fd/id 170/P20010, returning: ok
01/31/2008 19:42:54 C00B: Dlm Lock fd/id 172/P30011, returning: ok
01/31/2008 19:42:54 C00C: Dlm Lock fd/id 174/P40012, returning: ok
01/31/2008 19:42:54 C00D: Dlm Lock fd/id 178/P50013, returning: ok
01/31/2008 19:42:55 C00E: Dlm Lock fd/id 180/P60014, returning: ok
01/31/2008 19:42:58 C00F: Dlm Lock fd/id 182/P70015, returning: ok
01/31/2008 19:45:29 C005: Dlm UnLock. fd/id 160/P50005, returning ok
01/31/2008 19:45:43 C00A: Dlm UnLock. fd/id 170/P20010, returning ok
01/31/2008 19:45:44 C002: Dlm UnLock. fd/id 154/P20002, returning ok
01/31/2008 19:45:59 C006: Dlm UnLock. fd/id 162/P60006, returning ok
01/31/2008 19:46:07 C00F: Dlm UnLock. fd/id 182/P70015, returning ok
01/31/2008 19:46:08 C007: Dlm UnLock. fd/id 164/P70007, returning ok
01/31/2008 19:46:10 C00D: Dlm UnLock. fd/id 178/P50013, returning ok
01/31/2008 19:46:25 C00C: Dlm UnLock. fd/id 174/P40012, returning ok
01/31/2008 19:46:30 C00E: Dlm UnLock. fd/id 180/P60014, returning ok
01/31/2008 19:46:30 C004: Dlm UnLock. fd/id 158/P40004, returning ok
01/31/2008 19:47:07 C000: Dlm UnLock. fd/id 150/P00000, returning ok
01/31/2008 19:47:08 C008: Dlm UnLock. fd/id 166/P00008, returning ok
01/31/2008 19:47:09 C001: Dlm UnLock. fd/id 152/P10001, returning ok
01/31/2008 19:47:19 C009: Dlm UnLock. fd/id 168/P10009, returning ok
01/31/2008 19:47:30 C00B: Dlm UnLock. fd/id 172/P30011, returning ok
01/31/2008 19:47:31 C003: Dlm UnLock. fd/id 156/P30003, returning ok
01/31/2008 19:49:15 C000: Dlm Lock fd/id 150/P00000, returning: ok
01/31/2008 19:49:37 C000: Dlm UnLock. fd/id 150/P00000, returning ok
01/31/2008 19:49:40 C000: Dlm Lock fd/id 150/P00000, returning: Trylock
failed
I also had a problem with this system the day before this. Here is
The data from that:
SYSTEM MAP: /boot/System.map-2.6.16.46-0.14.PTF.284042.0-smp
DEBUG KERNEL: ../vmlinux.debug (2.6.16.46-0.14.PTF.284042.0-smp)
DUMPFILE: vmcore
CPUS: 4
DATE: Wed Jan 30 17:44:49 2008
UPTIME: 9 days, 00:55:51
LOAD AVERAGE: 1.10, 1.06, 1.01
TASKS: 341
NODENAME: N1
RELEASE: 2.6.16.46-0.14.PTF.284042.0-smp
VERSION: #1 SMP Thu May 17 14:00:09 UTC 2007
MACHINE: i686 (2327 Mhz)
MEMORY: 2 GB
PANIC: "kernel BUG at fs/ocfs2/dlm/dlmmaster.c:2780!"
PID: 31585
COMMAND: "masx"
TASK: dab912d0 [THREAD_INFO: d5840000]
CPU: 0
STATE: TASK_RUNNING (PANIC)
crash> bt
PID: 31585 TASK: dab912d0 CPU: 0 COMMAND: "masx"
#0 [d5841d78] crash_kexec at c013bb1a
#1 [d5841dbc] die at c01055fe
#2 [d5841dec] do_invalid_op at c0105ce2
#3 [d5841e9c] error_code (via invalid_op) at c0104e4d
EAX: 00000051 EBX: ca668280 ECX: 00000000 EDX: 00000296 EBP:
da0c1c00
DS: 007b ESI: da0c1c00 ES: 007b EDI: ca668280
CS: 0060 EIP: fb835e95 ERR: ffffffff EFLAGS: 00010296
#4 [d5841ed0] dlm_empty_lockres at fb835e95
#5 [d5841ee0] dlm_unregister_domain at fb827305
#6 [d5841f18] dlmfs_clear_inode at fb6c2eae
#7 [d5841f24] clear_inode at c0175dfe
#8 [d5841f30] generic_delete_inode at c0175eee
#9 [d5841f3c] iput at c0175838
#10 [d5841f48] dput at c01744e0
#11 [d5841f54] do_rmdir at c016e63d
#12 [d5841fb8] sysenter_entry at c0103bd4
EAX: 00000028 EBX: 08299988 ECX: 00000000 EDX: 08273be4
DS: 007b ESI: 00000000 ES: 007b EDI: 082ebf28
SS: 007b ESP: bf999f7c EBP: bf999fa8
CS: 0073 EIP: ffffe410 ERR: 00000028 EFLAGS: 00000246
This is a Suse Sles10 SP1 system, with a suse nfs patch.
Ocfs2 tools version 1.2.3-0.7
Ocfs2 version 1.2.5-SLES-r2997
I was hoping you would have some ideas on this.
Also, another question. I have been trying to run one of the debugging
Scripts, for example, scanlocks. I keep getting the message 'Module
debugfs
not loaded'. I don't see any debugfs.ko on the system. Isn't it a part
of
The ocfs2 tools ?
Thank you,
Charlie
More information about the Ocfs2-users
mailing list