[Ocfs2-users] High Load Average
Sunil Mushran
sunil.mushran at oracle.com
Tue Dec 16 13:58:49 PST 2008
2.6.18 is a very old release. I would recommend upgrading to kernel
2.6.21 or later.
Jerônimo Bezerra wrote:
> Hello all,
>
> I have a scenario here with two Debian 4.0 servers, kernel
> 2.6.18-4-amd64, and ocfs2-tools 1.2.1-1.3.
> These two servers have 16 CPUs (4 x Dual Core x HT) and 8GB RAM, with
> shared storage on an IBM DS4500, accessed through qla2340.
>
> Everything was working fine until yesterday morning, when for some
> unknown reason the load average of both servers became very high, almost
> 200. CPU utilization on both was 16-18%, memory usage was 7GB, and uptime
> was 22 days. Disk I/O was at least 3 MB/s. Pings over the crossover
> interface (heartbeat) were normal, with no packet loss.
>
> I use these servers as mail servers, and nobody could connect to
> them because of (I think) the high load average.
>
> Well, I rebooted both servers, and after boot, same thing: within
> minutes the load average was 150. But one interesting thing:
> when I shut down server A, server B worked fine! If I turned on
> server A and shut down server B, there was high load average on A. So,
> since shutting down server A made things go fine, I kept server A down
> for 8 hours. In the afternoon I turned it on again and, surprise, high
> load on both servers when OCFS2 started. I had to shut down both servers
> and turn on just server B to get things stable again. At night, I turned
> on server A to try to discover what was going on. I left both servers
> on all night (server A with no services and server B working normally),
> and when I arrived this morning, another surprise: the load average
> of server B was at 1200(!) and server A at 0 (no services running).
>
> When I started services on server A and shut down server B, the load on
> server A reached 200 in a matter of seconds.
>
> I shut down server A again, and after that, turned on server B. Now
> everything is working fine, with a load average of 3 on server B.
>
> I didn't update the kernel, Debian, storage or anything else. There's no
> message in syslog, dmesg or on screen. There's no process using more than
> 2% of CPU or memory. I really don't know what to do and I have no clues.
>
> Please, could someone help me?
>
> Thanks a lot
>
> Jeronimo
>
>
>
> _______________________________________________
> Ocfs2-users mailing list
> Ocfs2-users at oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>