[Ocfs2-users] High Load Average

Jeronimo Bezerra jab at ufba.br
Tue Dec 16 15:42:35 PST 2008


Hi Sunil, thanks for your answer.

I use packages from the Debian apt repository, and there is no newer 
kernel package available :(. Right now I only want to solve this problem 
so I can turn my server on again. What could I do? Is there anything I 
can do at the moment?

Another question: can I upgrade my kernel by just overwriting the current 
image? Is there any chance of crashing my ocfs2 file system? Can I have 
two servers with different kernel versions?

Thanks for your attention,

Jeronimo

Sunil Mushran wrote:
> 2.6.18 is a very old release. I would recommend upgrading to kernel
> 2.6.21 or later.
>
> Jerônimo Bezerra wrote:
>> Hello all,
>>
>> I have a scenario here with two Debian 4.0 servers, kernel 
>> 2.6.18-4-amd64, and ocfs2-tools 1.2.1-1.3.
>> These two servers have 16 CPUs (4 x dual core x HT) and 8 GB of RAM, 
>> with shared storage via qla2340 on an IBM DS4500 storage system.
>>
>> Everything was working fine until yesterday morning, when for some 
>> unknown reason the load average of both servers became very high, 
>> almost 200. CPU utilization on both was 16-18%, memory usage was 
>> 7 GB, and uptime was 22 days. Disk I/O was at least 3 MB/s. Pings 
>> over the crossover interface (heartbeat) were normal, with no packet 
>> loss.
>>
>> I use these servers as mail servers, and nobody could connect to 
>> them because of (I think) the high load average.
>>
>> Well, I rebooted both servers, and after boot, the same thing 
>> happened: within minutes the load average was 150. But one 
>> interesting thing: when I shut down server A, server B worked fine! 
>> If I turned on server A and shut down server B, the load average on 
>> A was high. So, since things went fine after I shut down server A, I 
>> kept server A down for 8 hours. In the afternoon, I turned it on 
>> again and, surprise, high load on both servers when OCFS2 started. I 
>> had to shut down both servers and turn on just server B to get 
>> things stable again. At night, I turned on server A to try to 
>> discover what was going on. I left both servers on all night (server 
>> A with no services and server B working normally), and when I 
>> arrived this morning, another surprise: the load average of server B 
>> was at 1200(!) and server A at 0 (no services running).
>>
>> When I started services on server A and shut down server B, the load 
>> on server A reached 200 in a matter of seconds.
>>
>> I shut down server A again, and after that, turned on server B. 
>> Now everything is working fine, with a load average of 3 on server B.
>>
>> I didn't update the kernel, Debian, storage or anything else. There 
>> are no messages in syslog, dmesg or on screen. There's no process 
>> using more than 2% of CPU or memory. I really don't know what to do 
>> and I have no clues.
>>
>> Please, could someone help me?
>>
>> Thanks a lot
>>
>> Jeronimo
>>
>



