[Ocfs2-users] High Load Average - New information
Sunil Mushran
sunil.mushran at oracle.com
Wed Dec 17 18:08:14 PST 2008
Not really. It is another issue that could be related to the fact
that the fs is very old.
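
Editor's note: the `status = -17` values in the quoted log below look like negated Linux errno codes, and errno 17 is EEXIST ("File exists"). The following is a minimal sketch for decoding such lines, assuming the status is always a standard errno. The regular expression and the `decode_status` helper are illustrative names, not part of ocfs2 or its tools.

```python
import errno
import os
import re

# Matches lines such as "(4205,0):ocfs2_delete_inode:860 ERROR: status = -17".
# The pattern is an assumption based only on the log lines quoted in this thread.
STATUS_RE = re.compile(r"\((\d+),(\d+)\):(\w+):(\d+) ERROR: status = (-?\d+)")


def decode_status(line):
    """Return (function, errno_name, message) for a status line, or None."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    function = match.group(3)
    code = abs(int(match.group(5)))  # kernel code reports errno values negated
    name = errno.errorcode.get(code, "UNKNOWN")
    return function, name, os.strerror(code)


if __name__ == "__main__":
    sample = "> (4205,0):ocfs2_delete_inode:860 ERROR: status = -17"
    print(decode_status(sample))
```

Lines that do not match the pattern, such as the "not orphaned!" messages, return None and would need their own handling.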
Jerônimo Bezerra wrote:
> Hello Sunil and all,
>
> I haven't upgraded my kernel yet, but I got an error on server B that
> could help us:
>
> (4205,0):ocfs2_delete_inode:860 ERROR: status = -17
> (4205,0):ocfs2_query_inode_wipe:751 ERROR: status = -17
> (4205,0):ocfs2_delete_inode:860 ERROR: status = -17
> (4205,0):ocfs2_query_inode_wipe:751 ERROR: status = -17
> (4205,0):ocfs2_delete_inode:860 ERROR: status = -17
> (4240,0):ocfs2_query_inode_wipe:744 ERROR: Inode 150165660 (on-disk
> 150165660) not orphaned! Disk flags 0x0, inode flags 0x80
> (4240,0):ocfs2_delete_inode:860 ERROR: status = -17
> (4868,0):ocfs2_query_inode_wipe:744 ERROR: Inode 15173219 (on-disk
> 15173219) not orphaned! Disk flags 0x0, inode flags 0x80
> (4868,0):ocfs2_delete_inode:860 ERROR: status = -17
> (4905,0):ocfs2_query_inode_wipe:744 ERROR: Inode 267696909 (on-disk
> 267696909) not orphaned! Disk flags 0x0, inode flags 0x80
> (4905,0):ocfs2_delete_inode:860 ERROR: status = -17
> ----------- [cut here ] --------- [please bite here ] ---------
> Kernel BUG at fs/ocfs2/journal.h:441
> invalid opcode: 0000 [1] SMP
> CPU 0
> Modules linked in: ocfs2 ocfs2_dlmfs ocfs2_dlm ocfs2_nodemanager
> configfs qla2xxx reiserfs dm_snapshot dm_mirror dm_mod loop joydev
> serio_raw tsdev psmouse evdev pcspkr shpchp floppy pci_hotplug sg ext3
> jbd mbcache ide_cd cdrom usbhid piix sd_mod generic ehci_hcd ide_core
> uhci_hcd firmware_class scsi_transport_fc megaraid_mbox scsi_mod
> megaraid_mm tg3 thermal processor fan
> Pid: 5448, comm: imapd Not tainted 2.6.18-4-amd64 #1
> RIP: 0010:[<ffffffff88279360>] [<ffffffff88279360>]
> :ocfs2:ocfs2_commit_truncate+0x550/0x1537
> RSP: 0018:ffff8101396e5c58 EFLAGS: 00010297
> RAX: 0000000000000000 RBX: ffff8102254020c0 RCX: 0000000000000002
> RDX: 0000000000f30000 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: 0000000000000000 R08: 00000000ffffffff R09: 00000000007cd6d3
> R10: ffff81022567a800 R11: ffffffff8828f423 R12: 0000000000000000
> R13: ffff81017fb2c000 R14: 0000000007cd6d30 R15: ffff8100ca4c04c8
> FS: 00002ac4e7237250(0000) GS:ffffffff80521000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 000000000065c378 CR3: 0000000082e9e000 CR4: 00000000000006e0
> Process imapd (pid: 5448, threadinfo ffff8101396e4000, task
> ffff8100122e20c0)
> Stack: ffff81017a8aedc0 ffff81003030a7f0 ffff81022567a800 ffff810227fd3b88
> ffff8100ca4c0408 0000000010360648 ffff810000000000 ffff810142ff65b0
> 0000000000000000 ffff81017fb2c000 ffff81017fb2c0c0 ffff8100a2ae9f00
> Call Trace:
> [<ffffffff8828d1d6>] :ocfs2:ocfs2_wipe_inode+0x466/0xb23
> [<ffffffff882a91bc>] :ocfs2:ocfs2_delete_response_cb+0x0/0x17f
> [<ffffffff88290122>] :ocfs2:ocfs2_delete_inode+0x623/0x7b1
> [<ffffffff8828faff>] :ocfs2:ocfs2_delete_inode+0x0/0x7b1
> [<ffffffff8022d395>] generic_delete_inode+0xc6/0x143
> [<ffffffff8828f53a>] :ocfs2:ocfs2_drop_inode+0x117/0x16e
> [<ffffffff8023a1b0>] do_unlinkat+0xd5/0x148
> [<ffffffff802584d6>] system_call+0x7e/0x83
>
>
> Code: 0f 0b 68 f6 d6 2a 88 c2 b9 01 66 85 d2 0f 95 c2 66 ff ce 0f
> RIP [<ffffffff88279360>] :ocfs2:ocfs2_commit_truncate+0x550/0x1537
> RSP <ffff8101396e5c58>
>
>
> And another piece of information that may be useful:
>
> a database server running MS Windows has become too slow writing to
> disk, just like my server A (it started on the same day). They are both
> on the same IBM storage subsystem and the same Brocade switch.
>
> Could this help? I know that my kernel is old, but...
>
> Thanks
>
> Jeronimo
>
>
> Sunil Mushran wrote:
>
>> There is no ocfs2 1.4 for non-enterprise kernels. For all non-ent
>> distros, ocfs2 is part of the kernel. Read the 1.4 user's guide.
>> It explains the development process.
>>
>> You will have to upgrade both nodes. Make sure they are both
>> running the same kernel/ocfs2.
>>
>> Jeronimo Bezerra wrote:
>>
>>> It seems that my only option is to upgrade my kernel package.
>>>
>>> I can only find 2.6.24 in this package:
>>> linux-image-2.6-amd64-etchnhalf. I will look into it further.
>>>
>>> Well, if I intend to upgrade, what's your suggestion: upgrade the
>>> good server (B) or the problematic server (A)? Is there any chance
>>> of a file system crash?
>>>
>>> My ocfs2-tools: 1.2.1-1.3
>>>
>>> I didn't find 1.4 in Debian apt.
>>>
>>> Thanks,
>>>
>>> Jeronimo
>>>
>>> Quoting Sunil Mushran <sunil.mushran at oracle.com>:
>>>
>>>
>>>
>>>> Debian etch is 2.6.24 based.
>>>>
>>>> Jeronimo Bezerra wrote:
>>>>
>>>>
>>>>> Hi Sunil, thanks for your answer.
>>>>>
>>>>> I use packages from Debian apt, and there is no newer version of
>>>>> the kernel package :(. Right now I only want to solve this
>>>>> problem so I can turn my server on again. What could I do? Is
>>>>> there anything I can do at this moment?
>>>>>
>>>>> Another question: can I upgrade my kernel by just overwriting the
>>>>> current image? Is there any chance of crashing my ocfs2 file
>>>>> system? Can I have two servers with different kernel versions?
>>>>>
>>>>> Thanks for your attention,
>>>>>
>>>>> Jeronimo
>>>>>
>>>>> Sunil Mushran wrote:
>>>>>
>>>>>
>>>>>
>>>>>> 2.6.18 is a very old release. I would recommend upgrading to kernel
>>>>>> 2.6.21 or later.
>>>>>>
>>>>>> Jerônimo Bezerra wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Hello all,
>>>>>>>
>>>>>>> I have a scenario here with two Debian 4.0 servers, kernel
>>>>>>> 2.6.18-4-amd64, and ocfs2-tools 1.2.1-1.3.
>>>>>>> These two servers have 16 CPUs (4 x dual core x HT) and 8 GB
>>>>>>> RAM, with shared storage on an IBM DS4500 via qla2340.
>>>>>>>
>>>>>>> Everything was working fine until yesterday morning, when
>>>>>>> for some unknown reason the load average of both servers
>>>>>>> became too high, almost 200. CPU utilization on both was
>>>>>>> 16-18%, memory usage 7 GB, uptime 22 days. Disk I/O was at
>>>>>>> least 3 MB/s. Pings over the crossover interface (heartbeat)
>>>>>>> were normal, with no packet loss.
>>>>>>>
>>>>>>> I use these servers as mail servers, and nobody could connect
>>>>>>> to them because of (I think) the high load average.
>>>>>>>
>>>>>>> Well, I rebooted both servers, and after boot, the same thing:
>>>>>>> within minutes the load average was 150. But one interesting
>>>>>>> thing:
>>>>>>> when I shut down server A, server B worked fine! If I turn on
>>>>>>> server A and shut down server B, the load average on A is high.
>>>>>>> So, since things went fine after I shut down server A, I kept
>>>>>>> server A down for 8 hours. In the afternoon I turned it on
>>>>>>> again and, surprise, high load on both servers when OCFS2
>>>>>>> started. I had to shut down both servers and turn on just
>>>>>>> server B to get things stable again. At night, I turned on
>>>>>>> server A to try to discover what was going on. I left both
>>>>>>> servers on all night (server A with no services and server B
>>>>>>> working normally), and when I arrived this morning, another
>>>>>>> surprise: the load average of server B was 1200(!) and
>>>>>>> server A 0 (no services running).
>>>>>>>
>>>>>>> When I started services on server A and shut down server B, the
>>>>>>> load on server A reached 200 within seconds.
>>>>>>>
>>>>>>> I shut down server A again and after that turned on server B.
>>>>>>> Now everything is working fine, with a load average of 3 on
>>>>>>> server B.
>>>>>>>
>>>>>>> I didn't update the kernel, Debian, storage or anything else.
>>>>>>> There's no message in syslog, dmesg or on screen. There's no
>>>>>>> process using more than 2% of CPU or memory. I really don't
>>>>>>> know what to do and I have no clues.
>>>>>>>
>>>>>>> Please, could someone help me?
>>>>>>>
>>>>>>> Thanks a lot
>>>>>>>
>>>>>>> Jeronimo
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Ocfs2-users mailing list
>>>>>>> Ocfs2-users at oss.oracle.com
>>>>>>> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>>>
>>>
>>>
>
>
>
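
Editor's note: the original report describes a load average near 200 with only 16-18% CPU utilization. On Linux the load average counts tasks in uninterruptible sleep (state D, typically waiting on I/O) as well as runnable tasks, so that pattern often points to blocked processes rather than CPU pressure. Whether that was the cause here is not established in the thread. The sketch below, which assumes a Linux `/proc` filesystem, counts processes by state; `parse_state` and `count_states` are hypothetical helper names.

```python
import glob


def parse_state(stat_line):
    """Extract the one-letter state from the contents of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or ')', so split on the last ')' before reading the state.
    """
    return stat_line.rsplit(")", 1)[1].split()[0]


def count_states():
    """Count processes per state letter (R = running, D = uninterruptible)."""
    counts = {}
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as handle:
                state = parse_state(handle.read())
        except (OSError, IndexError):
            continue  # the process exited while it was being read
        counts[state] = counts.get(state, 0) + 1
    return counts


if __name__ == "__main__":
    with open("/proc/loadavg") as handle:
        print("loadavg:", handle.read().split()[:3])
    states = count_states()
    print("running (R):", states.get("R", 0))
    print("uninterruptible (D):", states.get("D", 0))
```

This only inspects each process's main thread. A multi-threaded daemon would need `/proc/<pid>/task/*/stat` as well.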