[Ocfs2-tools-devel] [PATCH 2/2] looped chain - Break loop in fsck
tristan ye
tristan.ye at oracle.com
Fri Mar 5 06:44:26 PST 2010
Goldwyn Rodrigues wrote:
> Hi Tristan,
>
>
> On Thu, Mar 4, 2010 at 11:19 PM, tristan <tristan.ye at oracle.com> wrote:
>
>> Goldwyn Rodrigues wrote:
>>
>>> Hi Tristan
>>>
>>> On Thu, Mar 4, 2010 at 7:28 PM, tristan <tristan.ye at oracle.com> wrote:
>>>
>>>
>>>> Goldwyn Rodrigues wrote:
>>>>
>>>>
>>>>> Hi Tristan,
>>>>>
>>>>> On Wed, Mar 3, 2010 at 8:45 PM, tristan <tristan.ye at oracle.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>>> Goldwyn Rodrigues wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Hi Tristan,
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Mar 2, 2010 at 9:13 PM, tristan <tristan.ye at oracle.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Hi Goldwyn,
>>>>>>>>
>>>>>>>> Comments are inlined:
>>>>>>>>
>>>>>>>> Firstly, please check the patch by /kernel-src/script/checkpatch.pl
>>>>>>>> your-patch to detect the coding issues at the very beginning, since I
>>>>>>>> found
>>>>>>>> some coding style problems when apply your patch:
>>>>>>>>
>>>>>>>> Applying looped chain - Break loop in fsck
>>>>>>>> .dotest/patch:57: trailing whitespace.
>>>>>>>> ret = ocfs2_write_group_desc(ost->ost_fs,
>>>>>>>> .dotest/patch:90: trailing whitespace.
>>>>>>>> ". Break the loop?",cs->cs_chain_no,
>>>>>>>> .dotest/patch:94: trailing whitespace.
>>>>>>>> }
>>>>>>>> .dotest/patch:115: trailing whitespace.
>>>>>>>> verbosef("cl cpg %u bpc %u count %u next %u gds %u max_ch_len %u\n",
>>>>>>>> warning: 4 lines add whitespace errors.
>>>>>>>>
>>>>>>>>
>>>>>>>> As for the coding style of ocfs2 src, please refer to:
>>>>>>>>
>>>>>>>> http://www.kernel.org/doc/Documentation/CodingStyle
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>> I understand. will take care in the future.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Goldwyn Rodrigues wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Detects a loop by checking hops against the theoretical limit of
>>>>>>>>> number of chains in a chain_rec. If a loop is found, it breaks it by
>>>>>>>>> storing the block numbers, and comparing with existing block
>>>>>>>>> numbers.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Goldwyn Rodrigues <rgoldwyn at suse.de>
>>>>>>>>> Acked-by: Mark Fasheh <mfasheh at suse.com>
>>>>>>>>>
>>>>>>>>> --- diff --git a/fsck.ocfs2/fsck.ocfs2.checks.8.in
>>>>>>>>> b/fsck.ocfs2/fsck.ocfs2.checks.8.in
>>>>>>>>> index 05561ae..798a1e4 100644
>>>>>>>>> --- a/fsck.ocfs2/fsck.ocfs2.checks.8.in
>>>>>>>>> +++ b/fsck.ocfs2/fsck.ocfs2.checks.8.in
>>>>>>>>> @@ -202,6 +202,15 @@ valid in its bitmap.
>>>>>>>>> Answering yes decreases the number of recorded free bits so that it
>>>>>>>>> equals
>>>>>>>>> the total number of bits in the group descriptor's bitmap.
>>>>>>>>> +.SS "CHAIN_LOOP"
>>>>>>>>> +A chain may loop if the next field of the group descriptor points
>>>>>>>>> to
>>>>>>>>> one
>>>>>>>>> of
>>>>>>>>> +the previous group descriptors in the chain. This causes the ocfs2
>>>>>>>>> code,
>>>>>>>>> both
>>>>>>>>> +user space and kernel module to loop forever.
>>>>>>>>> +
>>>>>>>>> +Answering yes breaks the loop at an optimum location so that all
>>>>>>>>> the
>>>>>>>>> existing
>>>>>>>>> +group descriptors are in the chain. However, it cannot re-connect
>>>>>>>>> stray
>>>>>>>>> group
>>>>>>>>> +descriptors and must rely on the rest of the fsck code to fix it.
>>>>>>>>> +
>>>>>>>>> .SS "CHAIN_COUNT"
>>>>>>>>> The chain list embedded in an inode is limited by the block size
>>>>>>>>> and
>>>>>>>>> the
>>>>>>>>> number of bytes consumed by the rest of the inode. A chain list
>>>>>>>>> header
>>>>>>>>> was
>>>>>>>>> diff --git a/fsck.ocfs2/pass0.c b/fsck.ocfs2/pass0.c
>>>>>>>>> index a32bd18..1b59cc1 100644
>>>>>>>>> --- a/fsck.ocfs2/pass0.c
>>>>>>>>> +++ b/fsck.ocfs2/pass0.c
>>>>>>>>> @@ -455,6 +455,46 @@ out:
>>>>>>>>> return ret;
>>>>>>>>> }
>>>>>>>>> +static errcode_t break_loop(o2fsck_state *ost, struct
>>>>>>>>> ocfs2_chain_rec
>>>>>>>>> *chain,
>>>>>>>>> + unsigned int max_depth)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>> I guess it will make more sense if your rename the func name as
>>>>>>>> break_chain_loop() :)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> +{
>>>>>>>>> + uint64_t *list;
>>>>>>>>> + int i;
>>>>>>>>> + unsigned int depth = 0;
>>>>>>>>> + uint64_t blkno = chain->c_blkno;
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>> blkno = le64_to_cpu(chain->c_blkno);
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> + char *buf;
>>>>>>>>> + struct ocfs2_group_desc *gd;
>>>>>>>>> + errcode_t ret = ocfs2_malloc0(sizeof(uint64_t) * max_depth,
>>>>>>>>> &list);
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>> I recognized all what you do here is to locate the index where loop
>>>>>>>> happens,
>>>>>>>> and break the loop by terminating the 'next' to NULL. I loved the way
>>>>>>>> your
>>>>>>>> treating the loop chain, but I don't like the method you locate a the
>>>>>>>> position where loops happening.
>>>>>>>>
>>>>>>>>
>>>>>>>> You see, you used an extra array to record 'depth' of all previous
>>>>>>>> nodes,
>>>>>>>> what's more, you also have to come back for checking each previous
>>>>>>>> nodes
>>>>>>>> each step, that means your space complexity is O(N), and time
>>>>>>>> complexity
>>>>>>>> here is O(N2), that's not so pleasant to me:)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> + if (ret)
>>>>>>>>> + goto out;
>>>>>>>>> + ret = ocfs2_malloc_block(ost->ost_fs->fs_io, &buf);
>>>>>>>>> + if (ret)
>>>>>>>>> + goto out;
>>>>>>>>> + gd = (struct ocfs2_group_desc *)buf;
>>>>>>>>> +
>>>>>>>>> + while (blkno && (depth<=max_depth)) {
>>>>>>>>> + list[depth++] = blkno;
>>>>>>>>> + ret = ocfs2_read_group_desc(ost->ost_fs, blkno,
>>>>>>>>> buf);
>>>>>>>>> + if (ret)
>>>>>>>>> + goto out;
>>>>>>>>> + blkno = gd->bg_next_group;
>>>>>>>>> + for (i=0; i<depth; i++)
>>>>>>>>> + if (list[i]==blkno) {
>>>>>>>>> + gd->bg_next_group = 0;
>>>>>>>>> + verbosef("Breaking gd loop
>>>>>>>>> %"PRIu64"\n",
>>>>>>>>> blkno);
>>>>>>>>> + ret =
>>>>>>>>> ocfs2_write_group_desc(ost->ost_fs,
>>>>>>>>> + blkno, buf);
>>>>>>>>> + goto out;
>>>>>>>>> + }
>>>>>>>>> + }
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>> To detect and locate a loop in a list, you may have a better
>>>>>>>> solution,
>>>>>>>> use 2
>>>>>>>> pointers, one is 'slow', and the other is 'fast', 'slow' pointer
>>>>>>>> proceed
>>>>>>>> the
>>>>>>>> list by one hop, and 'fast' one will go two hops each step, so the
>>>>>>>> essence
>>>>>>>> of this alogrithm is, the fast pointer will meet slow pointer somehow
>>>>>>>> if
>>>>>>>> there is a loop exists, otherwise, fast and slow never meet, and end
>>>>>>>> up
>>>>>>>> with
>>>>>>>> NULLs respectively. the code skeleton will be as follows:
>>>>>>>>
>>>>>>>> static unit64_t get_group_desc_next(o2fsck_state *ost, uint64_t
>>>>>>>> blkno,
>>>>>>>> char
>>>>>>>> *buf)
>>>>>>>> {
>>>>>>>> uint64_t next = 0;
>>>>>>>> struct ocfs2_group_desc *gd;
>>>>>>>>
>>>>>>>> ret = ocfs2_read_group_desc(ost->ost_fs, blkno, buf);
>>>>>>>> if (ret)
>>>>>>>> return next;
>>>>>>>>
>>>>>>>> gd = (struct ocfs2_group_desc *)buf;
>>>>>>>> next = gd->gb_next_group;
>>>>>>>>
>>>>>>>> return next;
>>>>>>>> }
>>>>>>>>
>>>>>>>>
>>>>>>>> in function break_chain_loop():
>>>>>>>>
>>>>>>>> uint64_t loop_blkno, blkno, slow, fast;
>>>>>>>>
>>>>>>>> blkno = slow = fast = le64_to_cpu(chain->c_blkno);
>>>>>>>>
>>>>>>>> while (fast && get_group_desc_next(ost, fast, buf)) {
>>>>>>>>
>>>>>>>> /*
>>>>>>>> * fast ptr meet the slow means a loop happening there
>>>>>>>> */
>>>>>>>> if ((fast == slow) || (get_group_desc_next(ost, fast, buf) ==
>>>>>>>> slow))
>>>>>>>> {
>>>>>>>> /*
>>>>>>>> * Loop happened at the node before fast and slow
>>>>>>>> */
>>>>>>>> if (fast == slow)
>>>>>>>> loop_blkno = blkno;
>>>>>>>>
>>>>>>>> /*
>>>>>>>> * Loop happend at the node of fast
>>>>>>>> */
>>>>>>>> if (get_group_desc_next(ost, fast, buf) == slow)
>>>>>>>> loop_blkno = fast;
>>>>>>>>
>>>>>>>> ret = ocfs2_read_group_desc(ost->ost_fs, loop_blkno,
>>>>>>>> buf);
>>>>>>>> if (ret)
>>>>>>>> goto bail;
>>>>>>>>
>>>>>>>> gd = (struct ocfs2_group_desc *)buf;
>>>>>>>>
>>>>>>>> gd->bg_next_group = 0;
>>>>>>>>
>>>>>>>> verbose("Breaking gd loop %"PRIu64"\n", loop_blkno);
>>>>>>>>
>>>>>>>> ret = ocfs2_write_group_desc(ost->ost_fs, loop_blkno,
>>>>>>>> buf);
>>>>>>>> if (ret)
>>>>>>>> goto bail;
>>>>>>>>
>>>>>>>> break;
>>>>>>>> } else {
>>>>>>>> blkno = slow;
>>>>>>>>
>>>>>>>> /*
>>>>>>>> * Slow ptr goes one by one hop each time.
>>>>>>>> */
>>>>>>>> slow = get_group_desc_next(ost, slow, buf);
>>>>>>>>
>>>>>>>> /*
>>>>>>>> * Fast ptr proceeds by 2 hops each step.
>>>>>>>> */
>>>>>>>> fast = get_group_desc_next(ost, fast, buf);
>>>>>>>> fast = get_group_desc_next(ost, fast, buf);
>>>>>>>> }
>>>>>>>>
>>>>>>>>
>>>>>>>> }
>>>>>>>>
>>>>>>>> ...
>>>>>>>>
>>>>>>>>
>>>>>>>> According to above algorithm, we easily found that the space
>>>>>>>> complexity
>>>>>>>> is
>>>>>>>> 0, and time complexity was declined to O(N).
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>> While this algorithm is efficient in diagnosing the loop, I think it
>>>>>>> fails in detecting the optimum place to break the loop.
>>>>>>>
>>>>>>> Could you run this algorithm with the following scenario:
>>>>>>>
>>>>>>>
>>>>>>> cr->c_blkno = 1
>>>>>>>
>>>>>>> gd(1)->next=2
>>>>>>> gd(2)->next=3
>>>>>>> gd(3)->next=4
>>>>>>> gd(4)->next=2
>>>>>>>
>>>>>>> Where would the breakage be in this case?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> Oh, you're right, the current algorithm is not capable of detecting the
>>>>>> first position where a pointing-back happens, it needs to be extented:
>>>>>>
>>>>>>
>>>>>> int looped = 0;
>>>>>> uint64_t blkno, slow, fast;
>>>>>>
>>>>>> slow = fast = chain->c_blkno;
>>>>>>
>>>>>> while (fast && get_group_desc_next(ost, fast, buf)) {
>>>>>> /*
>>>>>> * fast ptr meet the slow means a loop happening there
>>>>>> */
>>>>>> if (fast == slow) {
>>>>>> looped = 1;
>>>>>> break;
>>>>>> } else {
>>>>>>
>>>>>> /*
>>>>>> * Slow ptr goes one by one hop each time.
>>>>>> */
>>>>>> slow = get_group_desc_next(ost, slow, buf);
>>>>>>
>>>>>> /*
>>>>>> * Fast ptr proceeds by 2 hops each step.
>>>>>> */
>>>>>> fast = get_group_desc_next(ost, fast, buf);
>>>>>> fast = get_group_desc_next(ost, fast, buf);
>>>>>> }
>>>>>> }
>>>>>>
>>>>>> if (looped) {
>>>>>> blkno = chain->c_blkno;
>>>>>> /*
>>>>>> * Find out the optimum to break the loop
>>>>>> */
>>>>>> while (blkno != slow) {
>>>>>> blkno = get_group_desc_next(ost, blkno, buf);
>>>>>> slow = get_group_desc_next(ost, blkno, buf);
>>>>>> }
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>> You are creating one infinite loop to break another ;)
>>>>> If blkno!=slow before the loop, this loop will never finish, except
>>>>> the case of a 1-node loop.
>>>>> Did you run it through the case I presented yesterday?
>>>>>
>>>>>
>>>>>
>>>> Oh, I'm afraid it would work, that's a well-known(for its space and time
>>>> efficiency) algorithm to solve the loop in a singly linked list, it does
>>>> terminate the 'while' when 'blkno == slow', and that's also the optimal
>>>> node
>>>> where the breakage was expected. for its mathematics principle, you can
>>>> Google for help;)
>>>>
>>>>
>>>>
>>> I am not doubting the space or time efficiency, but the correctness of
>>> the algorithm.
>>> The slow pointer will always walk *exactly one step* ahead of blkno,
>>> and hence in a loop > 2 nodes, it will never be equal to blkno, and
>>> hence will never end.
>>> This is because you are calculating slow with respect to blkno.
>>>
>>> Let me know if you want me to run you through an example.
>>>
>>>
>> Hi Goldwyn,
>>
>> I've run your case mentioned yesterday, even with more corner cases, It
>> works well by now:)
>>
>> I assumed that you have no doubt about the method to detect if a loop exists
>> in a list by using fast and slow iterators(they will eventually meet
>> sometime if loop exists). as for the algorithm to find out a optimal node
>> for breakage, we may need some mathematical proof, take the following case
>> for example:
>>
>> 0->1->2->3->4->5->6->7->8->(3), now we want to find out the first entry
>> point of the loop, it's 3 in that case! here we call the desired position as
>> A, assume the distance between start to A is x,
>>
>> Assume that the length of loop is y.
>>
>> According to fast and slow solution, we're sure that 2 pointers will finally
>> meet in the loop somewhere, at that right moment, we call the meeting
>> position as B, assume the distance between A(first entry of loop) to
>> B(meeting place) is z.
>>
>> Obviously, slow pointer has walked (x+z) steps as they firstly meet, while
>> fast pointer has walked 2*(x+z), and it's also easy to get 2*(x+z) - (x+z)
>> = K * y, here K is an integer, means fast pointer walked k*loop_length more
>> steps than slow pointer.
>>
>> Given the fact that fast and slow have already meet there, Now we continue
>> to let slow pointer proceed by one step each time, hopefully, it will get A
>> after continuously walking x steps, that's really tricky:), since (x+z) is
>> multiple of y. In the meantime, if we reset the fast pointer to the
>> beginning of list, and let it walk also one step each time, it will also get
>> A after walking x steps.
>>
>> As a result, above mathematical analysis indicated that, the fast and slow
>> will meet again if we reset the fast pointer to the start of list and tune
>> its pace as one step as slow pointer, and fortunately, the second meeting
>> point is the right node we're looking for as a optimal node to break.
>>
>> It will greatly help if you draw all these in the paper for better
>> understanding:)
>>
>
>
> Now you are trying to cover up your mistake ;)
>
Yes, the previous codes has not been tested well, it's nothing realistic
but a concept I guess. while the latest one I pasted last time is
workable under testing, is there still any mistakes?
> I fully understand the algorithm and the algorithm posted on the
> various internet sites are correct. For that matter, if you know the
> loop is large enough, using relatively prime number of hops will help
> you achieve the "meeting point" faster.
>
> It is *your implementation* of the algorithm I am not willing to agree with.
>
Hi Goldwyn,
Thanks for your patient follow-up on this issue:)
I admit that previous code skeleton pasted by has mistakes, it's just a
rough solution to illustrate my general idea against the algorithm. and
I've not tested it in a real env. However, the last code I pasted was
feasible under testing by your cases. Do you think it's still a
unrealistic one?
Let me know if you have some other suggestions on it.
>
>
>> The code skeleton works for me now as follows:
>>
>> int detect_loop(struct link_node *head, int breakage)
>> {
>> struct link_node *fast, *slow, *prev;
>> int looped = 0;
>>
>> prev = fast = slow = head;
>>
>> while (1) {
>> if (fast && fast->next)
>> fast = fast->next->next;
>> else
>> return looped;
>>
>> prev = slow;
>> slow = slow->next;
>>
>> if (fast == slow) {
>> looped = 1;
>> break;
>> }
>> }
>>
>> if (looped && breakage) {
>>
>> fast = head;
>> while (fast != slow) {
>> prev = slow;
>> slow = slow->next;
>> fast = fast->next;
>> }
>>
>
> You have changed (corrected) this! In analogy to your previous
> posting, fast=slow->next, which is causing the infinite loop.
>
> For your convenince, I will paste it again:
>
>
>>>>>> while (blkno != slow) {
>>>>>> blkno = get_group_desc_next(ost, blkno, buf);
>>>>>> slow = get_group_desc_next(ost, blkno, buf);
>>>>>> }
>>>>>>
>
>
> I would appreciate if you would post the correct implementation for
> the good audit.
> I understand what you are trying to do and I agree the algorithm is
> much better with respect to time and space efficiency. However, if it
> does not run correctly, it is not acceptable.
>
> Please post the final patch with respect to ocfs2-tools.
>
I can post the final patch regarding ocfs2-tools if you have no
objection against my above algorithm which is feasible in a experimental
env.
>
>
More information about the Ocfs2-tools-devel
mailing list