[Ocfs2-tools-devel] Patch for journal truncate of ocfs2-tools.

tao.ma tao.ma at oracle.com
Mon May 14 17:44:56 PDT 2007


Sunil Mushran wrote:
> chain_cpg is not the size of the device. It just indicates
> the number of clusters per group. If a vol has only one
> cluster group, then mkfs sets it to the number of actual
> clusters in that group and not the max possible.
>
> Only during resize did we realize that it is easier to set
> it to the max value. But setting it less than max should
> not be considered a bug. Yes, we could "fix" mkfs to always
> do that but we still would have to take care of this edge
> condition.
>
> Bottom line, chain_cpg fix should not be thought of as a
> bug nor a corruption.
OK, so I will use "-fy" for a new volume in this script once Marcos 
confirms that he hit the same problem as I did.
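
For reference, the group-size arithmetic from my earlier mail (quoted below) can be sketched in Python. This is only an illustration under stated assumptions: the 64-byte header comes from the "(4096-64)*8" term in this thread, the function names are made up for the example, and none of this is lifted from mkfs.ocfs2 or fsck.ocfs2.

```python
# Sketch of the cluster-group arithmetic discussed in this thread.
# Assumption: a group descriptor occupies one block, and everything after
# a 64-byte header is bitmap, one bit per cluster ("(4096-64)*8").
GROUP_DESC_HEADER_BYTES = 64


def clusters_per_group(block_size):
    """Maximum clusters one group can track with the given block size."""
    return (block_size - GROUP_DESC_HEADER_BYTES) * 8


def group_sizes(total_clusters, block_size):
    """Split a volume into full groups plus one trailing partial group."""
    cpg = clusters_per_group(block_size)
    full, rest = divmod(total_clusters, cpg)
    return [cpg] * full + ([rest] if rest else [])


def has_single_group(volume_bytes, block_size, cluster_size):
    """True when the whole volume fits in one cluster group."""
    total_clusters = volume_bytes // cluster_size
    return len(group_sizes(total_clusters, block_size)) <= 1


if __name__ == "__main__":
    GIB = 1 << 30
    MIB = 1 << 20
    print(clusters_per_group(4096))               # 32256
    print(group_sizes(38154, 4096))               # [32256, 5898]
    print(has_single_group(20 * GIB, 4096, MIB))  # True
    print(has_single_group(40 * GIB, 4096, MIB))  # False
```

With -b 4K -C 1M, the 38154-cluster volume from the debugfs.ocfs2 output splits into groups of 32256 and 5898 clusters, while a 12G or 20G volume stays in a single group. A check like has_single_group could be what the script runs before starting the test.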
>
> tao.ma wrote:
>> Sunil Mushran wrote:
>>> Yes,  that's correct.
>>>
>>> One solution is to expect this in your test script.
>>> As in, run with -fy to clean this up after a new mkfs.
>> Run with "-fy"? Then fsck.ocfs2 will change chain_cpg to the max, 
>> but that isn't the volume's real size. Is that really OK? If so, maybe 
>> we should modify mkfs rather than fsck.ocfs2, since it is surprising to 
>> find an error on a newly formatted volume.
>>
>> Concerning this script, maybe I should check the volume size before I 
>> begin the whole test and ask the user to use a volume large enough?
>> Anyway, this can be done after Marcos confirms that the error he hit is 
>> the same as mine.
>>
>> Marcos, 20G isn't enough. See it? ;)
>>>
>>> tao.ma wrote:
>>>> Sunil Mushran wrote:
>>>>> The first fsck is not called with -fy. See function normal_test.
>>>>> But, if your volume was 12G, we should not encounter the
>>>>> chain_cpg issue at all.
>>>> 12G isn't enough. Let me calculate it.
>>>> If you use 4K as the block size and 1M as the cluster size, then 
>>>> every group holds 32256 ((4096-64)*8) bits, one per cluster, and 
>>>> that means about 31.5G per group.
>>>> See the debugfs.ocfs2 output of my volume.
>>>>
>>>> debugfs:        Inode: 519   Mode: 0644   Generation: 3351252320 (0xc7c00d60)
>>>>        FS Generation: 3351252320 (0xc7c00d60)
>>>>        Type: Regular   Attr: 0x0   Flags: Valid System Allocbitmap Chain
>>>>        User: 0 (root)   Group: 0 (root)   Size: 40007368704
>>>>        Links: 1   Clusters: 38154
>>>>        ctime: 0x46484d33 -- Mon May 14 07:51:15 2007
>>>>        atime: 0x46484d33 -- Mon May 14 07:51:15 2007
>>>>        mtime: 0x46484d33 -- Mon May 14 07:51:15 2007
>>>>        dtime: 0x0 -- Wed Dec 31 19:00:00 1969
>>>>        ctime_nsec: 0x00000000 -- 0
>>>>        atime_nsec: 0x00000000 -- 0
>>>>        mtime_nsec: 0x00000000 -- 0
>>>>        Last Extblk: 0
>>>>        Sub Alloc Slot: Global   Sub Alloc Bit: 7
>>>>        Bitmap Total: 38154   Used: 976   Free: 37178
>>>>        Clusters per Group: 32256   Bits per Cluster: 1
>>>>        Count: 243   Next Free Rec: 2
>>>>        ##   Total        Used         Free         Block#
>>>>        0    32256        975          31281        256
>>>>        1    5898         1            5897         8257536
>>>>
>>>>        Group Chain: 0   Parent Inode: 519  Generation: 3351252320
>>>>        ##   Block#            Total    Used     Free     Contig   Size
>>>>        0    256               32256    975      31281    15871    4032
>>>>
>>>>        Group Chain: 1   Parent Inode: 519  Generation: 3351252320
>>>>        ##   Block#            Total    Used     Free     Contig   Size
>>>>        0    8257536           5898     1        5897     5897     4032
>>>>
>>>>>
>>>>> Marcos, Use "set -x" to narrow down the problem area
>>>>> in the test script.
>>>>>
>>>>> Marcos E. Matsunaga wrote:
>>>>>> Sunil,
>>>>>>
>>>>>> Volume was 12Gb. Fsck is only called with -fy options.
>>>>>>
>>>>>> I'm running a test with 20Gb Volume. Hopefully it is big enough.
>>>>>>
>>>>>> Sunil Mushran wrote:
>>>>>>> chain_cpg is not a "corruption" per se even though fsck
>>>>>>> treats and fixes it as one. That mkfs sets cpg to < max
>>>>>>> possible when the device fits in one cluster group is not
>>>>>>> ideal but not incorrect either.
>>>>>>>
>>>>>>> Is running fsck -fy the first time round not possible? It
>>>>>>> will take care of this problem.
>>>> Using "-y" will set chain_cpg to the max, but that isn't the 
>>>> volume's real size.
>>>>>>>
>>>>>>> Marcos, what was the size of the volume and the parameters
>>>>>>> passed to mkfs. I want to be sure that the problem you
>>>>>>> encountered is the same that Tao is referring to.
>>>>>>>
>>>>>>> tao.ma wrote:
>>>>>>>> Marcos E. Matsunaga wrote:
>>>>>>>>> Tao,
>>>>>>>>>
>>>>>>>>> Sorry about the delay. I ran into some problems and finally got 
>>>>>>>>> to run some tests today with tunefs-test.sh and the truncate 
>>>>>>>>> program. The tunefs seems to be working fine. I didn't find any 
>>>>>>>>> unexpected problems with it.
>>>>>>>>>
>>>>>>>>> With the truncate, I started the script. At first it failed 
>>>>>>>>> because the test_truncate binary wasn't set in the 
>>>>>>>>> script. I set that, started it again and let it run. After a 
>>>>>>>>> few hours it had shown nothing at all, as if it were 
>>>>>>>>> frozen. Hitting enter where it was running showed an fsck 
>>>>>>>>> error. I tried again and it does the same soon after I hit 
>>>>>>>>> enter. I don't know if that's the expected behavior, but I 
>>>>>>>>> found it a little weird.
>>>>>>>> I found the same problem as you when I ran it the first time. ;)
>>>>>>>> It may be because your test volume is too small. With 
>>>>>>>> blocksize=4K and clustersize=1M (maybe smaller, depending on 
>>>>>>>> the real size of your volume) you may have only one group 
>>>>>>>> descriptor in your "//global_bitmap". So right after the volume 
>>>>>>>> is formatted, fsck.ocfs2 will report a "CHAIN_CPG" error (you 
>>>>>>>> can check this by formatting your volume with -b 4K -C 1M and 
>>>>>>>> running fsck.ocfs2 immediately afterwards). I haven't added the 
>>>>>>>> "-y" option to fsck.ocfs2 since a newly formatted volume 
>>>>>>>> shouldn't have any errors, so the program waits for input and 
>>>>>>>> looks frozen. So this problem has nothing to do with 
>>>>>>>> ocfs2_truncate; you can use a larger volume (40G is enough) 
>>>>>>>> and try again.
>>>>>>>>
>>>>>>>> Sunil, I saw your comments in the function 
>>>>>>>> "maybe_fix_clusters_per_group" about this problem, and I 
>>>>>>>> remember that you added it for the offline resize. So we 
>>>>>>>> have to answer "n" to this prompt in fsck.ocfs2, right?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Ocfs2-tools-devel mailing list
>>>>>>>> Ocfs2-tools-devel at oss.oracle.com
>>>>>>>> http://oss.oracle.com/mailman/listinfo/ocfs2-tools-devel
>>>>>>>
>>>>>>
>>>>>> -- 
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Marcos Eduardo Matsunaga
>>>>>>
>>>>>> Oracle USA
>>>>>> Linux Engineering
>>>>
>>>>
>>>
>>
>


-- 
Tao Ma
Member of Technical Staff

Oracle Asia Research & Development Center
Open Source Technologies Development
Tel:        +86 10 8278 6026
Mobile:   +86 13701237602         
URL:       OARDC Intranet <http://cdc.oraclecorp.com/>, Oracle.com/cdc 
<http://www.oracle.com/cdc/>


