[Ocfs2-users] Ftp server... single file seems locked
Sunil Mushran
sunil.mushran at oracle.com
Fri Apr 2 10:01:17 PDT 2010
If "fs_locks -B" is empty, then the processes are not waiting on a
cluster lock.
A process pegged at 100% CPU is actively waiting to acquire a
spinlock.
Is the other process running?
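
One way to check this from userspace, sketched here in Python under some assumptions: sample the process's /proc/<pid>/stat entry twice and compare. A task spinning in the kernel would normally stay in state R while its system time (stime) keeps climbing and its user time (utime) stays flat. The function names are illustrative and not part of any OCFS2 tool; the field positions follow the proc(5) layout.

```python
import time


def parse_proc_stat(stat_line):
    """Return (state, utime_ticks, stime_ticks) from a /proc/<pid>/stat line."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces or ')', so split on the last ')' before tokenizing.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return rest[0], int(rest[11]), int(rest[12])


def read_proc_stat(pid):
    """Read and parse /proc/<pid>/stat for a live process."""
    f = open("/proc/%d/stat" % pid)
    try:
        return parse_proc_stat(f.read())
    finally:
        f.close()


def sample(pid, interval=1.0):
    """Return (state, utime_delta, stime_delta) over `interval` seconds."""
    _, utime1, stime1 = read_proc_stat(pid)
    time.sleep(interval)
    state, utime2, stime2 = read_proc_stat(pid)
    return state, utime2 - utime1, stime2 - stime1


# Hypothetical usage on the node running the stuck process:
#   sample(24139)
```

A result in state R with a utime delta near zero and an stime delta close to one second's worth of clock ticks would be consistent with a kernel-side spin. It would also fit the empty strace output and the ignored kill -9, since the task never returns to user mode.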
Unfortunately, in EL5 there is no clean way to get the kernel stack for
a process.
"echo t >/proc/sysrq-trigger" is the only way, but it might take the box
down if there are a lot of processes. If you do that, ensure netconsole
is set up. The stack trace should tell us more.
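
As a guard against dumping task state with nowhere for the output to go, a small Python sketch can check for netconsole before writing to the trigger file. It assumes netconsole is built as a module (so it appears in /proc/modules; a built-in netconsole would not) and that the script runs as root. The function names are made up for illustration.

```python
def netconsole_loaded(modules_text):
    """Return True if a 'netconsole' entry appears in /proc/modules-style text."""
    for line in modules_text.splitlines():
        fields = line.split()
        # The first whitespace-separated field of each line is the module name.
        if fields and fields[0] == "netconsole":
            return True
    return False


def trigger_task_dump(trigger_path="/proc/sysrq-trigger",
                      modules_path="/proc/modules"):
    """Write 't' to the sysrq trigger, but only if netconsole looks loaded."""
    f = open(modules_path)
    try:
        loaded = netconsole_loaded(f.read())
    finally:
        f.close()
    if not loaded:
        raise RuntimeError("netconsole not loaded; refusing to dump task state")
    # Equivalent to: echo t >/proc/sysrq-trigger
    f = open(trigger_path, "w")
    try:
        f.write("t")
    finally:
        f.close()


# Hypothetical usage, as root, once a netconsole receiver is listening:
#   trigger_task_dump()
```

The check only confirms that the module is loaded, not that the remote end is actually receiving messages, so it is worth verifying the receiver separately before triggering the dump.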
Sunil
Jason Price wrote:
> As another note, the process that's trying to read the file is in a
> VERY busy wait state... it's taking all the CPU it can get. strace
> doesn't show any output when I try to attach to the process.
>
> --Jason
>
> On Fri, Apr 2, 2010 at 12:44 PM, Jason Price <japrice at gmail.com> wrote:
>
>> To add further information:
>>
>> Node A:
>> # cat /sys/kernel/debug/o2dlm/6D419D86AE8A4DB1940788EDDA27027B/dlm_state
>> Domain: 6D419D86AE8A4DB1940788EDDA27027B Key: 0xc955c1d5
>> Thread Pid: 3869 Node: 1 State: JOINED
>> Number of Joins: 1 Joining Node: 255
>> Domain Map: 1 2
>> Live Map: 1 2
>> Lock Resources: 70731 (442210)
>> MLEs: 0 (1048380)
>> Blocking: 0 (647669)
>> Mastery: 0 (400711)
>> Migration: 0 (0)
>> Lists: Dirty=Empty Purge=Empty PendingASTs=Empty PendingBASTs=Empty
>> Purge Count: 0 Refs: 70732
>> Dead Node: 255
>> Recovery Pid: 3870 Master: 255 State: INACTIVE
>> Recovery Map:
>> Recovery Node State:
>>
>> Node B:
>> # cat /sys/kernel/debug/o2dlm/6D419D86AE8A4DB1940788EDDA27027B/dlm_state
>> Domain: 6D419D86AE8A4DB1940788EDDA27027B Key: 0xc955c1d5
>> Thread Pid: 3757 Node: 2 State: JOINED
>> Number of Joins: 1 Joining Node: 255
>> Domain Map: 1 2
>> Live Map: 1 2
>> Lock Resources: 48113 (50521)
>> MLEs: 0 (85510)
>> Blocking: 0 (35121)
>> Mastery: 0 (50389)
>> Migration: 0 (0)
>> Lists: Dirty=Empty Purge=Empty PendingASTs=Empty PendingBASTs=Empty
>> Purge Count: 0 Refs: 48114
>> Dead Node: 255
>> Recovery Pid: 3758 Master: 255 State: INACTIVE
>> Recovery Map:
>> Recovery Node State:
>>
>> There are no busy locks apparently, as shown by
>>
>> # debugfs.ocfs2 -R "fs_locks -B" /dev/sda1
>> #
>>
>> I am unable to kill any of these processes, even with kill -9.
>>
>> # cat /etc/ocfs2/cluster.conf
>> cluster:
>> node_count = 2
>> name = ocfs2ftpcluster
>>
>> node:
>> ip_port = 7777
>> ip_address = 192.168.0.1
>> number = 1
>> name = prtftp01
>> cluster = ocfs2ftpcluster
>>
>> node:
>> ip_port = 7777
>> ip_address = 192.168.0.2
>> number = 2
>> name = prtftp02
>> cluster = ocfs2ftpcluster
>>
>> If you'd like the output of:
>>
>> # debugfs.ocfs2 -R "fs_locks" /dev/sda1 | wc -l
>> 768681
>>
>> I can give it, but it's a lot of output.
>>
>> --Jason
>>
>> On Fri, Apr 2, 2010 at 11:38 AM, Jason Price <japrice at gmail.com> wrote:
>>
>>> I'm setting up an HA ftp server (amongst other services).
>>>
>>> When two connections happen simultaneously, and (more specifically) the same user from two IPs attempts to access the same file (one for reading and one for writing), both processes hang, and all subsequent attempts to either read or write the file fail.
>>>
>>> The two processes that seem to have caused the lock:
>>> user 24139 1657 Thu Apr 1 18:25:01 2010 proftpd: cbs - ::ffff:xxx.yyy.0.253: RETR prim_wo_img_dom.obs
>>> user 24142 1657 Thu Apr 1 18:25:01 2010 proftpd: cbs - ::ffff:xxx.yyy.103.208: STOR prim_wo_img_dom.obs
>>>
>>> (There are 49 other processes trying to do the same thing, but these are the first ones.)
>>>
>>> I'm more than happy to provide any information needed on this issue:
>>>
>>> OS:
>>> CentOS release 5.4 (Final)
>>>
>>> uname -a:
>>> Linux prtftp01<omitted> 2.6.18-164.11.1.el5 #1 SMP Wed Jan 20 07:32:21 EST 2010 x86_64 x86_64 x86_64 GNU/Linux
>>>
>>> ocfs2 version 1.4.4
>>>
>>> At the moment, only one host is actively serving FTP at any time. I can fail the services back and forth as needed.
>>>
>>> --Jason
>>>
>
> _______________________________________________
> Ocfs2-users mailing list
> Ocfs2-users at oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>