[Ocfs2-tools-devel] [PATCH 2/2] looped chain - Break loop in fsck

tristan tristan.ye at oracle.com
Thu Mar 4 17:28:47 PST 2010


Goldwyn Rodrigues wrote:
> Hi Tristan,
>
> On Wed, Mar 3, 2010 at 8:45 PM, tristan <tristan.ye at oracle.com> wrote:
>   
>> Goldwyn Rodrigues wrote:
>>     
>>> Hi Tristan,
>>>
>>>
>>> On Tue, Mar 2, 2010 at 9:13 PM, tristan <tristan.ye at oracle.com> wrote:
>>>
>>>       
>>>> Hi Goldwyn,
>>>>
>>>> Comments are inlined:
>>>>
>>>> Firstly, please check the patch by /kernel-src/script/checkpatch.pl
>>>> your-patch to detect the coding issues at the very beginning, since I
>>>> found
>>>> some coding style problems when apply your patch:
>>>>
>>>> Applying looped chain - Break loop in fsck
>>>> .dotest/patch:57: trailing whitespace.
>>>> ret = ocfs2_write_group_desc(ost->ost_fs,
>>>> .dotest/patch:90: trailing whitespace.
>>>> ". Break the loop?",cs->cs_chain_no,
>>>> .dotest/patch:94: trailing whitespace.
>>>> }
>>>> .dotest/patch:115: trailing whitespace.
>>>> verbosef("cl cpg %u bpc %u count %u next %u gds %u max_ch_len %u\n",
>>>> warning: 4 lines add whitespace errors.
>>>>
>>>>
>>>> As for the coding style of ocfs2 src, please refer to:
>>>>
>>>> http://www.kernel.org/doc/Documentation/CodingStyle
>>>>
>>>>         
>>> I understand. will take care in the future.
>>>
>>>
>>>       
>>>> Goldwyn Rodrigues wrote:
>>>>
>>>>         
>>>>> Detects a loop by checking hops against the theoretical limit of
>>>>> number of chains in a chain_rec. If a loop is found, it breaks it by
>>>>> storing the block numbers, and comparing with existing block numbers.
>>>>>
>>>>> Signed-off-by: Goldwyn Rodrigues <rgoldwyn at suse.de>
>>>>> Acked-by: Mark Fasheh <mfasheh at suse.com>
>>>>>
>>>>> --- diff --git a/fsck.ocfs2/fsck.ocfs2.checks.8.in
>>>>> b/fsck.ocfs2/fsck.ocfs2.checks.8.in
>>>>> index 05561ae..798a1e4 100644
>>>>> --- a/fsck.ocfs2/fsck.ocfs2.checks.8.in
>>>>> +++ b/fsck.ocfs2/fsck.ocfs2.checks.8.in
>>>>> @@ -202,6 +202,15 @@ valid in its bitmap.
>>>>>  Answering yes decreases the number of recorded free bits so that it
>>>>> equals
>>>>>  the total number of bits in the group descriptor's bitmap.
>>>>>  +.SS "CHAIN_LOOP"
>>>>> +A chain may loop if the next field of the group descriptor points to
>>>>> one
>>>>> of
>>>>> +the previous group descriptors in the chain. This causes the ocfs2
>>>>> code,
>>>>> both
>>>>> +user space and kernel module to loop forever.
>>>>> +
>>>>> +Answering yes breaks the loop at an optimum location so that all the
>>>>> existing
>>>>> +group descriptors are in the chain. However, it cannot re-connect stray
>>>>> group
>>>>> +descriptors and must rely on the rest of the fsck code to fix it.
>>>>> +
>>>>>  .SS "CHAIN_COUNT"
>>>>>  The chain list embedded in an inode is limited by the block size and
>>>>> the
>>>>>  number of bytes consumed by the rest of the inode.  A chain list header
>>>>> was
>>>>> diff --git a/fsck.ocfs2/pass0.c b/fsck.ocfs2/pass0.c
>>>>> index a32bd18..1b59cc1 100644
>>>>> --- a/fsck.ocfs2/pass0.c
>>>>> +++ b/fsck.ocfs2/pass0.c
>>>>> @@ -455,6 +455,46 @@ out:
>>>>>       return ret;
>>>>>  }
>>>>>  +static errcode_t break_loop(o2fsck_state *ost, struct ocfs2_chain_rec
>>>>> *chain,
>>>>> +               unsigned int max_depth)
>>>>>
>>>>>
>>>>>           
>>>> I guess it will make more sense if your rename the func name as
>>>> break_chain_loop() :)
>>>>
>>>>         
>>>>> +{
>>>>> +       uint64_t *list;
>>>>> +       int i;
>>>>> +       unsigned int depth = 0;
>>>>> +       uint64_t blkno = chain->c_blkno;
>>>>>
>>>>>
>>>>>           
>>>> blkno = le64_to_cpu(chain->c_blkno);
>>>>
>>>>
>>>>         
>>>>> +       char *buf;
>>>>> +       struct ocfs2_group_desc *gd;
>>>>> +       errcode_t ret = ocfs2_malloc0(sizeof(uint64_t) * max_depth,
>>>>> &list);
>>>>>
>>>>>
>>>>>           
>>>> I recognized all what you do here is to locate the index where loop
>>>> happens,
>>>> and break the loop by terminating the 'next' to NULL. I loved the way
>>>> your
>>>> treating the loop chain, but I don't like the method you locate a the
>>>> position where loops happening.
>>>>
>>>>
>>>> You see, you used an extra array to record 'depth' of all previous nodes,
>>>> what's more, you also have to come back for checking each previous nodes
>>>> each step, that means your space complexity is O(N), and time complexity
>>>> here is O(N2), that's not so pleasant to me:)
>>>>
>>>>
>>>>
>>>>         
>>>>> +       if (ret)
>>>>> +               goto out;
>>>>> +       ret =  ocfs2_malloc_block(ost->ost_fs->fs_io, &buf);
>>>>> +       if (ret)
>>>>> +               goto out;
>>>>> +       gd = (struct ocfs2_group_desc *)buf;
>>>>> +
>>>>> +       while (blkno && (depth<=max_depth)) {
>>>>> +               list[depth++] = blkno;
>>>>> +               ret = ocfs2_read_group_desc(ost->ost_fs, blkno, buf);
>>>>> +               if (ret)
>>>>> +                       goto out;
>>>>> +               blkno = gd->bg_next_group;
>>>>> +               for (i=0; i<depth; i++)
>>>>> +                       if (list[i]==blkno) {
>>>>> +                               gd->bg_next_group = 0;
>>>>> +                               verbosef("Breaking gd loop %"PRIu64"\n",
>>>>> blkno);
>>>>> +                               ret =
>>>>> ocfs2_write_group_desc(ost->ost_fs,
>>>>> +                                             blkno, buf);
>>>>> +                               goto out;
>>>>> +                       }
>>>>> +       }
>>>>>
>>>>>
>>>>>           
>>>> To detect and locate a loop in a list, you may have a better solution,
>>>> use 2
>>>> pointers, one is 'slow', and the other is 'fast', 'slow' pointer proceed
>>>> the
>>>> list by one hop, and 'fast' one will go two hops each step, so the
>>>> essence
>>>> of this alogrithm is, the fast pointer will meet slow pointer somehow if
>>>> there is a loop exists, otherwise, fast and slow never meet, and end up
>>>> with
>>>> NULLs respectively. the code skeleton will be as follows:
>>>>
>>>> static unit64_t get_group_desc_next(o2fsck_state *ost, uint64_t blkno,
>>>> char
>>>> *buf)
>>>> {
>>>> uint64_t next = 0;
>>>> struct ocfs2_group_desc *gd;
>>>>
>>>> ret = ocfs2_read_group_desc(ost->ost_fs, blkno, buf);
>>>> if (ret)
>>>> return next;
>>>>
>>>> gd = (struct ocfs2_group_desc *)buf;
>>>> next = gd->gb_next_group;
>>>>
>>>> return next;
>>>> }
>>>>
>>>>
>>>> in function break_chain_loop():
>>>>
>>>> uint64_t loop_blkno, blkno, slow, fast;
>>>>
>>>> blkno = slow = fast = le64_to_cpu(chain->c_blkno);
>>>>
>>>> while (fast && get_group_desc_next(ost, fast, buf)) {
>>>>
>>>>       /*
>>>>        * fast ptr meet the slow means a loop happening there
>>>>        */
>>>>       if ((fast == slow) || (get_group_desc_next(ost, fast, buf) ==
>>>> slow))
>>>> {
>>>>               /*
>>>>                * Loop happened at the node before fast and slow
>>>>                */
>>>>               if (fast == slow)
>>>>                       loop_blkno = blkno;
>>>>
>>>>               /*
>>>>                * Loop happend at the node of fast
>>>>                */
>>>>               if (get_group_desc_next(ost, fast, buf) == slow)
>>>>                       loop_blkno = fast;
>>>>
>>>>               ret = ocfs2_read_group_desc(ost->ost_fs, loop_blkno, buf);
>>>>               if (ret)
>>>>                       goto bail;
>>>>
>>>>               gd = (struct ocfs2_group_desc *)buf;
>>>>
>>>>               gd->bg_next_group = 0;
>>>>
>>>>               verbose("Breaking gd loop %"PRIu64"\n", loop_blkno);
>>>>
>>>>               ret = ocfs2_write_group_desc(ost->ost_fs, loop_blkno, buf);
>>>>               if (ret)
>>>>                       goto bail;
>>>>
>>>>               break;
>>>>       } else {
>>>>               blkno = slow;
>>>>
>>>>               /*
>>>>                * Slow ptr goes one by one hop each time.
>>>>                */
>>>>               slow = get_group_desc_next(ost, slow, buf);
>>>>
>>>>               /*
>>>>                * Fast ptr proceeds by 2 hops each step.
>>>>                */
>>>>               fast = get_group_desc_next(ost, fast, buf);
>>>>               fast = get_group_desc_next(ost, fast, buf);
>>>>       }
>>>>
>>>>
>>>> }
>>>>
>>>> ...
>>>>
>>>>
>>>> According to above algorithm, we easily found that the space complexity
>>>> is
>>>> 0, and time complexity was declined to O(N).
>>>>
>>>>
>>>>
>>>>
>>>>         
>>> While this algorithm is efficient in diagnosing the loop, I think it
>>> fails in  detecting the optimum place to break the loop.
>>>
>>> Could you run this algorithm with the following scenario:
>>>
>>>
>>> cr->c_blkno = 1
>>>
>>> gd(1)->next=2
>>> gd(2)->next=3
>>> gd(3)->next=4
>>> gd(4)->next=2
>>>
>>> Where would the breakage be in this case?
>>>
>>>       
>> Oh, you're right, the current algorithm is not capable of detecting the
>> first position where a pointing-back happens, it needs to be extented:
>>
>>
>> int looped = 0;
>> uint64_t blkno, slow, fast;
>>
>> slow = fast = chain->c_blkno;
>>
>> while (fast && get_group_desc_next(ost, fast, buf)) {
>>        /*
>>         * fast ptr meet the slow means a loop happening there
>>         */
>>        if (fast == slow) {
>>                looped = 1;
>>                break;
>>        } else {
>>
>>                /*
>>                 * Slow ptr goes one by one hop each time.
>>                 */
>>                slow = get_group_desc_next(ost, slow, buf);
>>
>>                /*
>>                 * Fast ptr proceeds by 2 hops each step.
>>                 */
>>                fast = get_group_desc_next(ost, fast, buf);
>>                fast = get_group_desc_next(ost, fast, buf);
>>        }
>> }
>>
>> if (looped) {
>>   blkno = chain->c_blkno;
>>   /*
>>    * Find out the optimum to break the loop
>>    */
>>   while (blkno != slow) {
>>      blkno = get_group_desc_next(ost, blkno, buf);
>>      slow = get_group_desc_next(ost, blkno, buf);
>>   }
>>
>>     
>
> You are creating one infinite loop to break another ;)
> If blkno!=slow before the loop, this loop will never finish, except
> the case of a 1-node loop.
> Did you run it through the case I presented yesterday?
>   

Oh, I'm afraid it would work, that's a well-known(for its space and time 
efficiency) algorithm to solve the loop in a singly linked list, it does 
terminate the 'while' when 'blkno == slow', and that's also the optimal 
node where the breakage was expected. for its mathematics principle, you 
can Google for help;)


Tristan.

>
>   
>>   ret = ocfs2_read_group_desc(ost->ost_fs, blkno, buf);
>>   if (ret)
>>      goto bail;
>>
>>   gd = (struct ocfs2_group_desc *)buf;
>>   gd->bg_next_group = 0;
>>   verbose("Breaking gd loop %"PRIu64"\n", blkno);
>>   ret = ocfs2_write_group_desc(ost->ost_fs, blkno, buf);
>>   if (ret)
>>      goto bail;
>> }
>>
>>
>> Though the above algorithm looks a little bit complicated and obscure, I
>> still love it, especially if you're operating with a HUGE volume(16TB+), and
>> even worse, the loop happens at the end of group chain, think about this
>> case. you do benefit from the space and time saved from above method.
>>
>>     
>
>
>
> Regards,
>
>   




More information about the Ocfs2-tools-devel mailing list