[Ocfs2-tools-devel] [PATCH 2/2] looped chain - Break loop in fsck

Fri Mar 5 05:57:36 PST 2010

Hi Tristan,

On Thu, Mar 4, 2010 at 11:19 PM, tristan <tristan.ye at oracle.com> wrote:
> Goldwyn Rodrigues wrote:
>>
>> Hi Tristan
>>
>> On Thu, Mar 4, 2010 at 7:28 PM, tristan <tristan.ye at oracle.com> wrote:
>>
>>>
>>> Goldwyn Rodrigues wrote:
>>>
>>>>
>>>> Hi Tristan,
>>>>
>>>> On Wed, Mar 3, 2010 at 8:45 PM, tristan <tristan.ye at oracle.com> wrote:
>>>>
>>>>
>>>>>
>>>>> Goldwyn Rodrigues wrote:
>>>>>
>>>>>
>>>>>>
>>>>>> Hi Tristan,
>>>>>>
>>>>>>
>>>>>> On Tue, Mar 2, 2010 at 9:13 PM, tristan <tristan.ye at oracle.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Hi Goldwyn,
>>>>>>>
>>>>>>> Comments are inlined:
>>>>>>>
>>>>>>> Firstly, please check the patch by /kernel-src/script/checkpatch.pl
>>>>>>> your-patch to detect the coding issues at the very beginning, since I
>>>>>>> found
>>>>>>> some coding style problems when apply your patch:
>>>>>>>
>>>>>>> Applying looped chain - Break loop in fsck
>>>>>>> .dotest/patch:57: trailing whitespace.
>>>>>>> ret = ocfs2_write_group_desc(ost->ost_fs,
>>>>>>> .dotest/patch:90: trailing whitespace.
>>>>>>> ". Break the loop?",cs->cs_chain_no,
>>>>>>> .dotest/patch:94: trailing whitespace.
>>>>>>> }
>>>>>>> .dotest/patch:115: trailing whitespace.
>>>>>>> verbosef("cl cpg %u bpc %u count %u next %u gds %u max_ch_len %u\n",
>>>>>>> warning: 4 lines add whitespace errors.
>>>>>>>
>>>>>>>
>>>>>>> As for the coding style of ocfs2 src, please refer to:
>>>>>>>
>>>>>>> http://www.kernel.org/doc/Documentation/CodingStyle
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> I understand. will take care in the future.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Goldwyn Rodrigues wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Detects a loop by checking hops against the theoretical limit of
>>>>>>>> number of chains in a chain_rec. If a loop is found, it breaks it by
>>>>>>>> storing the block numbers, and comparing with existing block
>>>>>>>> numbers.
>>>>>>>>
>>>>>>>> Signed-off-by: Goldwyn Rodrigues <rgoldwyn at suse.de>
>>>>>>>> Acked-by: Mark Fasheh <mfasheh at suse.com>
>>>>>>>>
>>>>>>>> --- diff --git a/fsck.ocfs2/fsck.ocfs2.checks.8.in
>>>>>>>> b/fsck.ocfs2/fsck.ocfs2.checks.8.in
>>>>>>>> index 05561ae..798a1e4 100644
>>>>>>>> --- a/fsck.ocfs2/fsck.ocfs2.checks.8.in
>>>>>>>> +++ b/fsck.ocfs2/fsck.ocfs2.checks.8.in
>>>>>>>> @@ -202,6 +202,15 @@ valid in its bitmap.
>>>>>>>>  Answering yes decreases the number of recorded free bits so that it
>>>>>>>> equals
>>>>>>>>  the total number of bits in the group descriptor's bitmap.
>>>>>>>>  +.SS "CHAIN_LOOP"
>>>>>>>> +A chain may loop if the next field of the group descriptor points
>>>>>>>> to
>>>>>>>> one
>>>>>>>> of
>>>>>>>> +the previous group descriptors in the chain. This causes the ocfs2
>>>>>>>> code,
>>>>>>>> both
>>>>>>>> +user space and kernel module to loop forever.
>>>>>>>> +
>>>>>>>> +Answering yes breaks the loop at an optimum location so that all
>>>>>>>> the
>>>>>>>> existing
>>>>>>>> +group descriptors are in the chain. However, it cannot re-connect
>>>>>>>> stray
>>>>>>>> group
>>>>>>>> +descriptors and must rely on the rest of the fsck code to fix it.
>>>>>>>> +
>>>>>>>>  .SS "CHAIN_COUNT"
>>>>>>>>  The chain list embedded in an inode is limited by the block size
>>>>>>>> and
>>>>>>>> the
>>>>>>>>  number of bytes consumed by the rest of the inode.  A chain list
>>>>>>>> header
>>>>>>>> was
>>>>>>>> diff --git a/fsck.ocfs2/pass0.c b/fsck.ocfs2/pass0.c
>>>>>>>> index a32bd18..1b59cc1 100644
>>>>>>>> --- a/fsck.ocfs2/pass0.c
>>>>>>>> +++ b/fsck.ocfs2/pass0.c
>>>>>>>> @@ -455,6 +455,46 @@ out:
>>>>>>>>     return ret;
>>>>>>>>  }
>>>>>>>>  +static errcode_t break_loop(o2fsck_state *ost, struct
>>>>>>>> ocfs2_chain_rec
>>>>>>>> *chain,
>>>>>>>> +               unsigned int max_depth)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> I guess it will make more sense if your rename the func name as
>>>>>>> break_chain_loop() :)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> +{
>>>>>>>> +       uint64_t *list;
>>>>>>>> +       int i;
>>>>>>>> +       unsigned int depth = 0;
>>>>>>>> +       uint64_t blkno = chain->c_blkno;
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> blkno = le64_to_cpu(chain->c_blkno);
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> +       char *buf;
>>>>>>>> +       struct ocfs2_group_desc *gd;
>>>>>>>> +       errcode_t ret = ocfs2_malloc0(sizeof(uint64_t) * max_depth,
>>>>>>>> &list);
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> I recognized all what you do here is to locate the index where loop
>>>>>>> happens,
>>>>>>> and break the loop by terminating the 'next' to NULL. I loved the way
>>>>>>> your
>>>>>>> treating the loop chain, but I don't like the method you locate a the
>>>>>>> position where loops happening.
>>>>>>>
>>>>>>>
>>>>>>> You see, you used an extra array to record 'depth' of all previous
>>>>>>> nodes,
>>>>>>> what's more, you also have to come back for checking each previous
>>>>>>> nodes
>>>>>>> each step, that means your space complexity is O(N), and time
>>>>>>> complexity
>>>>>>> here is O(N2), that's not so pleasant to me:)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> +       if (ret)
>>>>>>>> +               goto out;
>>>>>>>> +       ret =  ocfs2_malloc_block(ost->ost_fs->fs_io, &buf);
>>>>>>>> +       if (ret)
>>>>>>>> +               goto out;
>>>>>>>> +       gd = (struct ocfs2_group_desc *)buf;
>>>>>>>> +
>>>>>>>> +       while (blkno && (depth<=max_depth)) {
>>>>>>>> +               list[depth++] = blkno;
>>>>>>>> +               ret = ocfs2_read_group_desc(ost->ost_fs, blkno,
>>>>>>>> buf);
>>>>>>>> +               if (ret)
>>>>>>>> +                       goto out;
>>>>>>>> +               blkno = gd->bg_next_group;
>>>>>>>> +               for (i=0; i<depth; i++)
>>>>>>>> +                       if (list[i]==blkno) {
>>>>>>>> +                               gd->bg_next_group = 0;
>>>>>>>> +                               verbosef("Breaking gd loop
>>>>>>>> %"PRIu64"\n",
>>>>>>>> blkno);
>>>>>>>> +                               ret =
>>>>>>>> ocfs2_write_group_desc(ost->ost_fs,
>>>>>>>> +                                             blkno, buf);
>>>>>>>> +                               goto out;
>>>>>>>> +                       }
>>>>>>>> +       }
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> To detect and locate a loop in a list, you may have a better
>>>>>>> solution,
>>>>>>> use 2
>>>>>>> pointers, one is 'slow', and the other is 'fast', 'slow' pointer
>>>>>>> proceed
>>>>>>> the
>>>>>>> list by one hop, and 'fast' one will go two hops each step, so the
>>>>>>> essence
>>>>>>> of this alogrithm is, the fast pointer will meet slow pointer somehow
>>>>>>> if
>>>>>>> there is a loop exists, otherwise, fast and slow never meet, and end
>>>>>>> up
>>>>>>> with
>>>>>>> NULLs respectively. the code skeleton will be as follows:
>>>>>>>
>>>>>>> static unit64_t get_group_desc_next(o2fsck_state *ost, uint64_t
>>>>>>> blkno,
>>>>>>> char
>>>>>>> *buf)
>>>>>>> {
>>>>>>> uint64_t next = 0;
>>>>>>> struct ocfs2_group_desc *gd;
>>>>>>>
>>>>>>> ret = ocfs2_read_group_desc(ost->ost_fs, blkno, buf);
>>>>>>> if (ret)
>>>>>>> return next;
>>>>>>>
>>>>>>> gd = (struct ocfs2_group_desc *)buf;
>>>>>>> next = gd->gb_next_group;
>>>>>>>
>>>>>>> return next;
>>>>>>> }
>>>>>>>
>>>>>>>
>>>>>>> in function break_chain_loop():
>>>>>>>
>>>>>>> uint64_t loop_blkno, blkno, slow, fast;
>>>>>>>
>>>>>>> blkno = slow = fast = le64_to_cpu(chain->c_blkno);
>>>>>>>
>>>>>>> while (fast && get_group_desc_next(ost, fast, buf)) {
>>>>>>>
>>>>>>>     /*
>>>>>>>      * fast ptr meet the slow means a loop happening there
>>>>>>>      */
>>>>>>>     if ((fast == slow) || (get_group_desc_next(ost, fast, buf) ==
>>>>>>> slow))
>>>>>>> {
>>>>>>>             /*
>>>>>>>              * Loop happened at the node before fast and slow
>>>>>>>              */
>>>>>>>             if (fast == slow)
>>>>>>>                     loop_blkno = blkno;
>>>>>>>
>>>>>>>             /*
>>>>>>>              * Loop happend at the node of fast
>>>>>>>              */
>>>>>>>             if (get_group_desc_next(ost, fast, buf) == slow)
>>>>>>>                     loop_blkno = fast;
>>>>>>>
>>>>>>>             ret = ocfs2_read_group_desc(ost->ost_fs, loop_blkno,
>>>>>>> buf);
>>>>>>>             if (ret)
>>>>>>>                     goto bail;
>>>>>>>
>>>>>>>             gd = (struct ocfs2_group_desc *)buf;
>>>>>>>
>>>>>>>             gd->bg_next_group = 0;
>>>>>>>
>>>>>>>             verbose("Breaking gd loop %"PRIu64"\n", loop_blkno);
>>>>>>>
>>>>>>>             ret = ocfs2_write_group_desc(ost->ost_fs, loop_blkno,
>>>>>>> buf);
>>>>>>>             if (ret)
>>>>>>>                     goto bail;
>>>>>>>
>>>>>>>             break;
>>>>>>>     } else {
>>>>>>>             blkno = slow;
>>>>>>>
>>>>>>>             /*
>>>>>>>              * Slow ptr goes one by one hop each time.
>>>>>>>              */
>>>>>>>             slow = get_group_desc_next(ost, slow, buf);
>>>>>>>
>>>>>>>             /*
>>>>>>>              * Fast ptr proceeds by 2 hops each step.
>>>>>>>              */
>>>>>>>             fast = get_group_desc_next(ost, fast, buf);
>>>>>>>             fast = get_group_desc_next(ost, fast, buf);
>>>>>>>     }
>>>>>>>
>>>>>>>
>>>>>>> }
>>>>>>>
>>>>>>> ...
>>>>>>>
>>>>>>>
>>>>>>> According to above algorithm, we easily found that the space
>>>>>>> complexity
>>>>>>> is
>>>>>>> 0, and time complexity was declined to O(N).
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> While this algorithm is efficient in diagnosing the loop, I think it
>>>>>> fails in  detecting the optimum place to break the loop.
>>>>>>
>>>>>> Could you run this algorithm with the following scenario:
>>>>>>
>>>>>>
>>>>>> cr->c_blkno = 1
>>>>>>
>>>>>> gd(1)->next=2
>>>>>> gd(2)->next=3
>>>>>> gd(3)->next=4
>>>>>> gd(4)->next=2
>>>>>>
>>>>>> Where would the breakage be in this case?
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> Oh, you're right, the current algorithm is not capable of detecting the
>>>>> first position where a pointing-back happens, it needs to be extented:
>>>>>
>>>>>
>>>>> int looped = 0;
>>>>> uint64_t blkno, slow, fast;
>>>>>
>>>>> slow = fast = chain->c_blkno;
>>>>>
>>>>> while (fast && get_group_desc_next(ost, fast, buf)) {
>>>>>      /*
>>>>>       * fast ptr meet the slow means a loop happening there
>>>>>       */
>>>>>      if (fast == slow) {
>>>>>              looped = 1;
>>>>>              break;
>>>>>      } else {
>>>>>
>>>>>              /*
>>>>>               * Slow ptr goes one by one hop each time.
>>>>>               */
>>>>>              slow = get_group_desc_next(ost, slow, buf);
>>>>>
>>>>>              /*
>>>>>               * Fast ptr proceeds by 2 hops each step.
>>>>>               */
>>>>>              fast = get_group_desc_next(ost, fast, buf);
>>>>>              fast = get_group_desc_next(ost, fast, buf);
>>>>>      }
>>>>> }
>>>>>
>>>>> if (looped) {
>>>>>  blkno = chain->c_blkno;
>>>>>  /*
>>>>>  * Find out the optimum to break the loop
>>>>>  */
>>>>>  while (blkno != slow) {
>>>>>    blkno = get_group_desc_next(ost, blkno, buf);
>>>>>    slow = get_group_desc_next(ost, blkno, buf);
>>>>>  }
>>>>>
>>>>>
>>>>>
>>>>
>>>> You are creating one infinite loop to break another ;)
>>>> If blkno!=slow before the loop, this loop will never finish, except
>>>> the case of a 1-node loop.
>>>> Did you run it through the case I presented yesterday?
>>>>
>>>>
>>>
>>> Oh, I'm afraid it would work, that's a well-known(for its space and time
>>> efficiency) algorithm to solve the loop in a singly linked list, it does
>>> terminate the 'while' when 'blkno == slow', and that's also the optimal
>>> node
>>> where the breakage was expected. for its mathematics principle, you can
>>> Google for help;)
>>>
>>>
>>
>>
>> I am not doubting the space or time efficiency, but the correctness of
>> the algorithm.
>> The slow pointer will always walk *exactly one step* ahead of blkno,
>> and hence in a loop > 2 nodes, it will never be equal to blkno, and
>> hence will never end.
>> This is because you are calculating slow with respect to blkno.
>>
>> Let me know if you want me to run you through an example.
>>
>
> Hi Goldwyn,
>
> I've run your case mentioned yesterday, even with more corner cases, It
> works well by now:)
>
> I assumed that you have no doubt about the method to detect if a loop exists
> in a list by using fast and slow iterators(they will eventually meet
> sometime if loop exists). as for the algorithm to find out a optimal node
> for breakage, we may need some mathematical proof, take the following case
> for example:
>
> 0->1->2->3->4->5->6->7->8->(3), now we want to find out the first entry
> point of the loop, it's 3 in that case! here we call the desired position as
> A, assume the distance between start to A is x,
>
> Assume that the length of loop is y.
>
> According to fast and slow solution, we're sure that 2 pointers will finally
> meet in the loop somewhere, at that right moment, we call the meeting
> position as B, assume the distance between A(first entry of loop) to
> B(meeting place) is z.
>
> Obviously, slow pointer has walked (x+z) steps as they firstly meet, while
> fast pointer has walked 2*(x+z),  and it's also easy to get 2*(x+z) - (x+z)
> = K * y, here K is an integer, means fast pointer walked k*loop_length more
> steps than slow pointer.
>
> Given the fact that fast and slow have already meet there, Now we continue
> to let slow pointer proceed by one step each time, hopefully, it will get A
> after continuously walking x steps, that's really tricky:), since (x+z) is
> multiple of y. In the meantime, if we reset the fast pointer to the
> beginning of list, and let it walk also one step each time, it will also get
> A after walking x steps.
>
> As a result, above mathematical analysis indicated that, the fast and slow
> will meet again if we reset the fast pointer to the start of list and tune
> its pace as one step as slow pointer, and fortunately, the second meeting
> point is the right node we're looking for as a optimal node to break.
>
> It will greatly help if you draw all these in the paper for better
> understanding:)

Now you are trying to cover up your mistake ;)

I fully understand the algorithm and the algorithm posted on the
various internet sites are correct. For that matter, if you know the
loop is large enough, using relatively prime number of hops will help
you achieve the "meeting point" faster.

It is *your implementation* of the algorithm I am not willing to agree with.

>
>
> The code skeleton works for me now as follows:
>
> int detect_loop(struct link_node *head, int breakage)
> {
>       struct link_node *fast, *slow, *prev;
>       int looped = 0;
>
>       prev = fast = slow = head;
>
>       while (1) {
>               if (fast && fast->next)
>                       fast = fast->next->next;
>               else
>                       return looped;
>
>               prev = slow;
>               slow = slow->next;
>
>               if (fast == slow) {
>                       looped = 1;
>                       break;
>               }
>       }
>
>       if (looped && breakage) {
>
>               fast = head;
>               while (fast != slow) {
>                       prev = slow;
>                       slow = slow->next;
>                       fast = fast->next;
>               }

You have changed (corrected) this! In analogy  to your previous
posting, fast=slow->next, which is causing the infinite loop.

For your convenince, I will paste it again:

>>>>>  while (blkno != slow) {
>>>>>    blkno = get_group_desc_next(ost, blkno, buf);
>>>>>    slow = get_group_desc_next(ost, blkno, buf);
>>>>>  }

I would appreciate if you would post the correct implementation for
the good audit.
I understand what you are trying to do and I agree the algorithm is
much better with respect to time and space efficiency. However, if it
does not run correctly, it is not acceptable.

Please post the final patch with respect to ocfs2-tools.

-- 
Goldwyn