[Ocfs2-users] High Load Average
Sunil Mushran
sunil.mushran at oracle.com
Tue Dec 16 15:56:20 PST 2008
Debian etch is 2.6.24 based.
Jeronimo Bezerra wrote:
> Hi Sunil, thanks for your answer.
>
> I use packages from Debian apt, and there is no newer version of the kernel
> package :(. Right now I only want to solve this problem so I can turn
> my server on again. What could I do? Is there anything I can do
> right now?
>
> Another question: Can I upgrade my kernel just by overwriting the current
> image? Is there any chance of that crashing my ocfs2 file system? Can I have
> two servers with different kernel versions?
>
> Thanks for your attention,
>
> Jeronimo
>
> Sunil Mushran wrote:
>
>> 2.6.18 is a very old release. I would recommend upgrading to kernel
>> 2.6.21 or later.
>>
>> Jerônimo Bezerra wrote:
>>
>>> Hello all,
>>>
>>> I have a scenario here with two Debian 4.0 servers, kernel
>>> 2.6.18-4-amd64, and ocfs2-tools 1.2.1-1.3.
>>> These two servers have 16 CPUs (4 x Dual Core x HT) and 8GB RAM, with
>>> shared storage on an IBM DS4500 via qla2340.
>>>
>>> Everything was working fine until yesterday morning, when for some
>>> unknown reason the load average of both servers became too high,
>>> almost 200. CPU utilization, on both, was 16-18%, memory usage was
>>> 7GB, and uptime was 22 days. Disk I/O was at least 3 MB/s. Pings over the
>>> crossover interface (heartbeat) were normal, with no packet loss.
>>>
>>> I use these servers as mail servers, and nobody could connect to
>>> them because of (I think) the high load average.
>>>
>>> Well, I rebooted both servers, and after boot, same thing: within a
>>> matter of minutes the load average was 150. But one interesting thing:
>>> when I shut down server A, server B worked fine! If I turned on
>>> server A and shut down server B, the load average on A went high. So,
>>> since things went fine with server A shut down, I kept server A
>>> down for 8 hours. In the afternoon I turned it on again and, surprise,
>>> high load on both servers when OCFS2 started. I had to shut down both
>>> servers and turn on just server B to get things stable again. At night, I
>>> turned on server A to try to discover what was going on. I left
>>> both servers on all night (server A with no services and
>>> server B working normally), and when I arrived this morning,
>>> another surprise: the load average of server B was at 1200(!) and
>>> server A's was 0 (no services running).
>>>
>>> When I started services on server A and shut down server B, the load
>>> on server A reached 200 in a matter of seconds.
>>>
>>> I shut down server A again, and after that turned on server B.
>>> Now everything is working fine, with a load average of 3 on server B.
>>>
>>> I didn't update the kernel, Debian, storage or anything else. There are
>>> no messages in syslog, dmesg or on screen. There's no process using more
>>> than 2% of CPU or memory. I really don't know what to do and I have
>>> no clues.
>>>
>>> Please, could someone help me?
>>>
>>> Thanks a lot
>>>
>>> Jeronimo
>>>
>>>
>>>
>>> _______________________________________________
>>> Ocfs2-users mailing list
>>> Ocfs2-users at oss.oracle.com
>>> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>>>
>>>
>
>