[Ocfs2-users] Weird lock

Erik Terpstra erik at solidcode.net
Thu Apr 10 03:13:06 PDT 2008


We have a similar situation: our system hangs several times a day.
I still can't figure out exactly what's going wrong.
But on one node of our system (which runs Apache plus a web service
written in Ruby (Mongrel/Camping)), the system load keeps rising until
the node stops responding to anything.
At that time the process list also shows a lot of processes in D state
(uninterruptible sleep).
The weird thing is that we just discovered that rebooting *another* node 
(we have 4 in total) fixes this situation.
Suddenly the system load on the node that initially had the problem 
returns to a normal level and the processes that were in D state 
return to their normal states.
Any idea why rebooting another node fixes this situation? And 
what might be the cause of it?
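
To record which processes are stuck in D state while the load is
climbing, something like the sketch below might help. It assumes a Linux
/proc filesystem and reads the state field from /proc/<pid>/stat; the
function names parse_stat and d_state_processes are made up for this
illustration. It only lists the blocked processes and does not explain
why they are blocked.

```python
import os


def parse_stat(line):
    """Split one /proc/<pid>/stat line into (pid, command, state)."""
    # The command is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the last ')' instead of on whitespace.
    head, _, tail = line.rpartition(")")
    pid, _, comm = head.partition("(")
    return int(pid), comm, tail.split()[0]


def d_state_processes(proc_root="/proc"):
    """Return (pid, command) for every process currently in state D."""
    stuck = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        stat_path = os.path.join(proc_root, entry, "stat")
        try:
            with open(stat_path, errors="replace") as f:
                pid, comm, state = parse_stat(f.read())
        except OSError:
            continue  # the process exited while we were scanning
        if state == "D":
            stuck.append((pid, comm))
    return stuck
```

Calling d_state_processes() every few seconds and logging the result
with a timestamp would show which commands block first and whether the
list empties when the other node is rebooted.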

We are running:

Linux test01 2.6.22-14-server #1 SMP Thu Jan 31 23:57:25 UTC 2008 x86_64 
GNU/Linux

[   77.688875] OCFS2 Node Manager 1.3.3
[   77.703166] OCFS2 DLM 1.3.3
[   77.710731] OCFS2 DLMFS 1.3.3
[   77.710816] OCFS2 User DLM kernel interface loaded
[   85.870956] OCFS2 1.3.3

Kind regards,

Erik.

> Hello,
>
> yes.. when this situation happens there is always a process spinning (running
> at 100% CPU). We can't kill it even with kill -9
>   



