[Ocfs2-devel] orphan cleanup

Sunil Mushran sunil.mushran at oracle.com
Thu Apr 30 12:22:42 PDT 2009


Joel Becker wrote:
> Srini,
> 	Ok, you can go ahead and cook up the background orphan cleaner.
> Now, we can do this in a workqueue, a thread, or a timer.  I don't see
> why a timer doesn't work.  When the timer fires, you do this:
>
> 1. Take EX on a new orphan_scan lock.
> 2. Check the LVB for the last scan time.  If the time since the last
>    scan is less than the scan timeout, reset the timer for (timeout -
>    time since last scan), drop the EX, and exit.

We should add a random value to the timeout. Else the master will end up
"winning" the task every time.

> 3. Call ocfs2_queue_recovery_completion() for all slots with NULL, NULL,
>    NULL on the non-orphan-dir arguments.  This sets up the orphan
>    recovery.
> 4. Update the LVB with the current scan time.
> 5. Drop the EX to an NL.
> 6. Reset the timer for the scan timeout.
>
> 	Points about this scheme:
>
> - Doesn't need a process.
> - Don't need to change the locking protocol version, as older versions
>   just ignore this problem.
> - Ensures only one node runs the scan each timeout period.
> - Uses our existing orphan recovery code unchanged.
> - We don't need to keep a PR on the orphan scan lock.  It's just extra
>   network traffic and downconvert processing we don't care about.
>   Better to wake up once when our timeout fires than to wake up every
>   time another node goes to make a scan.
> - I realize that I've updated the scan time when the scan is queued,
>   not when it completes.  It doesn't really make much of a difference
>   with many-minute scan periods, and it is a lot simpler than trying to
>   add code to wait on all the orphans.
>
> Joel

Looks good.
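
For illustration, the timer path with the random delay added could be
sketched roughly as below. This is a user-space model in Python, not
kernel code. The names ScanLVB, on_timer, queue_scan and max_jitter are
made up for the sketch, the caller is assumed to already hold the EX on
the orphan_scan lock (locking itself is not modelled), and adding the
random value to both the full and the shortened delay is an assumption.
In the real thing queue_scan would be the per-slot call to
ocfs2_queue_recovery_completion().

```python
import random
from dataclasses import dataclass


@dataclass
class ScanLVB:
    """Hypothetical stand-in for the LVB of the orphan_scan lock."""
    last_scan: float = 0.0  # time at which a scan was last queued


def on_timer(now, lvb, timeout, max_jitter, num_slots, queue_scan, rng=random):
    """Model one timer firing; returns the delay until the next firing.

    Assumes the caller holds EX on the orphan_scan lock around this call
    and drops it afterwards (steps 1 and 5 are not modelled here).
    """
    # Random extra delay so that the same node does not win every period.
    jitter = rng.uniform(0, max_jitter)

    # Step 2: another node scanned recently, so just sleep the remainder.
    elapsed = now - lvb.last_scan
    if elapsed < timeout:
        return (timeout - elapsed) + jitter

    # Step 3: queue orphan recovery for every slot.
    for slot in range(num_slots):
        queue_scan(slot)

    # Step 4: record the scan time at queue time, not at completion.
    lvb.last_scan = now

    # Step 6: re-arm the timer for a full (jittered) scan timeout.
    return timeout + jitter
```

With a 60 second timeout, a firing 30 seconds after the last recorded
scan queues nothing and re-arms for roughly the remaining 30 seconds. A
firing at or past 60 seconds queues all slots and updates the LVB.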


