[Ocfs2-users] Re: remove locks? or copy the whole file?

Aleks Clark aleks.clark at gmail.com
Wed Jul 4 09:00:42 PDT 2012


I found the infinite loop. The chain number gets down to 69 (lol) and then fsck does this forever:

repair_group_desc:363 | checking desc at 2225664; blkno 2225664 size 4032 bits 32256 free_bits 1535 chain 69 generation 1910514588
repair_group_desc:363 | checking desc at 10063872; blkno 10063872 size 4032 bits 32256 free_bits 1535 chain 69 generation 1910514588
repair_group_desc:363 | checking desc at 88445952; blkno 88445952 size 4032 bits 32256 free_bits 394 chain 69 generation 1910514588
repair_group_desc:363 | checking desc at 80607744; blkno 80607744 size 4032 bits 32256 free_bits 1535 chain 69 generation 1910514588
repair_group_desc:363 | checking desc at 72769536; blkno 72769536 size 4032 bits 32256 free_bits 1535 chain 69 generation 1910514588
repair_group_desc:363 | checking desc at 64931328; blkno 64931328 size 4032 bits 32256 free_bits 1535 chain 69 generation 1910514588
repair_group_desc:363 | checking desc at 57093120; blkno 57093120 size 4032 bits 32256 free_bits 1535 chain 69 generation 1910514588
repair_group_desc:363 | checking desc at 49254912; blkno 49254912 size 4032 bits 32256 free_bits 1535 chain 69 generation 1910514588
repair_group_desc:363 | checking desc at 41416704; blkno 41416704 size 4032 bits 32256 free_bits 1535 chain 69 generation 1910514588
repair_group_desc:363 | checking desc at 33578496; blkno 33578496 size 4032 bits 32256 free_bits 1535 chain 69 generation 1910514588
repair_group_desc:363 | checking desc at 25740288; blkno 25740288 size 4032 bits 32256 free_bits 1535 chain 69 generation 1910514588
repair_group_desc:363 | checking desc at 17902080; blkno 17902080 size 4032 bits 32256 free_bits 1535 chain 69 generation 1910514588
[... the same twelve descriptors repeat in this order indefinitely ...]
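
If I'm reading this right, the next-group pointers in chain 69 loop back
on themselves, so a plain walk never reaches a terminating descriptor. A
toy model of the failure (hypothetical layout, nothing like the real
fsck.ocfs2 code; the blkno spacing is just lifted from the trace above)
shows how a two-pointer cycle check would catch it instead of spinning:

/* cycle_demo.c -- a chained allocator whose last descriptor points
 * back at the first, so a naive walk never terminates. */
#include <stdint.h>
#include <stdio.h>

#define NDESC 12

struct group_desc {
    uint64_t blkno;
    int      next;  /* index of the next descriptor; -1 ends the chain */
};

/* Floyd's check: "slow" advances one link per step, "fast" two.
 * On a corrupt, looping chain the two walkers must eventually meet. */
static int chain_has_cycle(const struct group_desc *d, int head)
{
    int slow = head, fast = head;

    while (fast != -1 && d[fast].next != -1) {
        slow = d[slow].next;
        fast = d[d[fast].next].next;
        if (slow == fast) {
            printf("cycle detected at blkno %llu\n",
                   (unsigned long long)d[slow].blkno);
            return 1;
        }
    }
    return 0;  /* the walk reached a terminating descriptor */
}

int main(void)
{
    struct group_desc chain[NDESC];
    int i;

    for (i = 0; i < NDESC; i++) {
        chain[i].blkno = 2225664 + (uint64_t)i * 7838208;
        chain[i].next  = i + 1;
    }
    chain[NDESC - 1].next = 0;  /* the corrupt link: tail -> head */

    if (!chain_has_cycle(chain, 0))
        printf("chain terminates normally\n");
    return 0;
}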


On Wed, Jul 4, 2012 at 4:49 AM, Aleks Clark <aleks.clark at gmail.com> wrote:
> looks like I got hit by this:
>
> https://oss.oracle.com/pipermail/ocfs2-users/2011-April/005106.html
>
> guess I'll cancel that fsck and upgrade after all :P
>
> On Wed, Jul 4, 2012 at 4:38 AM, Aleks Clark <aleks.clark at gmail.com> wrote:
>> I'll try that kernel upgrade while I've got the cluster down. Has
>> anyone given any thought to multi-threading fsck.ocfs2? From my top
>> stats it's clearly CPU-bound (it's also been going for 5 hours now
>> and I still haven't seen the end of the first pass).
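>>
>> Each allocator chain is independent, so in principle the descriptor
>> checks could be fanned out across a worker pool. A rough sketch of
>> what I mean (pure speculation on my part, not anything fsck.ocfs2
>> does today; check_chain() is a hypothetical stand-in for the
>> per-chain walk):
>>
>> #include <pthread.h>
>> #include <stdio.h>
>>
>> #define NCHAINS  96  /* however many chains the allocator has */
>> #define NTHREADS  4
>>
>> /* Hypothetical stand-in for verifying one chain's descriptors. */
>> static void check_chain(int chain)
>> {
>>     (void)chain;  /* ... read and validate each group desc ... */
>> }
>>
>> static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
>> static int next_chain;
>>
>> /* Workers pull the next unchecked chain off a shared counter. */
>> static void *worker(void *arg)
>> {
>>     (void)arg;
>>     for (;;) {
>>         pthread_mutex_lock(&lock);
>>         int chain = next_chain++;
>>         pthread_mutex_unlock(&lock);
>>         if (chain >= NCHAINS)
>>             return NULL;
>>         check_chain(chain);
>>     }
>> }
>>
>> int main(void)
>> {
>>     pthread_t tids[NTHREADS];
>>     int i;
>>
>>     for (i = 0; i < NTHREADS; i++)
>>         pthread_create(&tids[i], NULL, worker, NULL);
>>     for (i = 0; i < NTHREADS; i++)
>>         pthread_join(tids[i], NULL);
>>     printf("checked %d chains\n", NCHAINS);
>>     return 0;
>> }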
>>
>> On Tue, Jul 3, 2012 at 11:50 PM, Guozhonghua <guozhonghua at h3c.com> wrote:
>>>   Hi,
>>>
>>>   I used OCFS2 with Linux kernel 2.6.39 and ran into some problems that may be the same as yours.
>>>
>>>   I downloaded the Linux 3.2.x kernel sources, compared them with 2.6.39, and found that a great deal of the OCFS2 code had changed.
>>>   After updating the kernel, the problems disappeared.
>>>
>>>   I recommend updating to a recent kernel. With a recent kernel the ocfs2 module has been very stable here: it has run for several weeks without a reboot or a panic.
>>>
>>>   One more note: set the I/O scheduler to deadline, which is a good fit for OCFS2:
>>>
>>>   elevator=deadline
>>>
>>>   Please refer to ocfs2_faq.txt for details:
>>>
>>>   Q07   I encounter "Kernel panic - not syncing: ocfs2 is very sorry to be
>>>         fencing this system by panicing" whenever I run a heavy io load.
>>>   A07   We have encountered a bug with the default "cfq" io scheduler
>>>         which causes a process doing heavy io to temporarily starve out
>>>         other processes. While this is not fatal for most environments,
>>>         it is for OCFS2, as we expect the hb thread to read/write the hb
>>>         area at least once every 12 secs (default).
>>>         A bug with the fix has been filed with Red Hat and Novell. For
>>>         more, refer to the tracker bug filed on bugzilla:
>>>         http://oss.oracle.com/bugzilla/show_bug.cgi?id=671
>>>         Till this issue is resolved, one is advised to use the
>>>         "deadline" io scheduler. To use deadline, add "elevator=deadline"
>>>         to the kernel command line as follows:
>>>         1. For SLES9, edit the command line in /boot/grub/menu.lst:
>>>         title Linux 2.6.5-7.244-bigsmp
>>>             kernel (hd0,4)/boot/vmlinuz-2.6.5-7.244-bigsmp root=/dev/sda5
>>>                 vga=0x314 selinux=0 splash=silent resume=/dev/sda3
>>>                 elevator=deadline showopts console=tty0
>>>                 console=ttyS0,115200 noexec=off
>>>             initrd (hd0,4)/boot/initrd-2.6.5-7.244-bigsmp
>>>         2. For RHEL4, edit the command line in /boot/grub/grub.conf:
>>>         title Red Hat Enterprise Linux AS (2.6.9-22.EL)
>>>             root (hd0,0)
>>>             kernel /vmlinuz-2.6.9-22.EL ro root=LABEL=/ console=ttyS0,115200
>>>                 console=tty0 elevator=deadline noexec=off
>>>             initrd /initrd-2.6.9-22.EL.img
>>>         To see the current kernel command line, do:
>>>         # cat /proc/cmdline
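>>>
>>>   On a running system the scheduler can also be switched per device
>>>   through sysfs, without touching the boot loader. A minimal sketch
>>>   (assuming the device is /dev/sda; the bracketed name in the
>>>   scheduler file is the one currently active):
>>>
>>>   /* set_elevator.c -- show, then switch, one device's io scheduler. */
>>>   #include <stdio.h>
>>>
>>>   int main(void)
>>>   {
>>>       const char *path = "/sys/block/sda/queue/scheduler";
>>>       char line[256];
>>>       FILE *f;
>>>
>>>       /* Show the current choice, e.g. "noop deadline [cfq]". */
>>>       f = fopen(path, "r");
>>>       if (!f) {
>>>           perror(path);
>>>           return 1;
>>>       }
>>>       if (fgets(line, sizeof(line), f))
>>>           printf("before: %s", line);
>>>       fclose(f);
>>>
>>>       /* Writing one of the listed names switches the scheduler. */
>>>       f = fopen(path, "w");
>>>       if (!f) {
>>>           perror(path);
>>>           return 1;
>>>       }
>>>       fputs("deadline\n", f);
>>>       fclose(f);
>>>       return 0;
>>>   }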
>>
>>
>>
>> --
>> Aleks Clark
>
>
>
> --
> Aleks Clark



-- 
Aleks Clark


