[Ocfs2-devel] orphan cleanup
Joel Becker
Joel.Becker at oracle.com
Thu Apr 30 11:53:46 PDT 2009
Srini,
Ok, you can go ahead and cook up the background orphan cleaner.
Now, we can do this in a workqueue, a thread, or a timer. I don't see
why a timer doesn't work. When the timer fires, you do this:
1. Take EX on a new orphan_scan lock.
2. Check the LVB for the last scan time. If the time since the last
scan is less than the scan timeout, reset the timer for (timeout -
time since last scan), drop the EX, and exit.
3. Call ocfs2_queue_recovery_completion() for all slots with NULL, NULL,
NULL on the non-orphan-dir arguments. This sets up the orphan
recovery.
4. Update the LVB with the current scan time.
5. Drop the EX to an NL.
6. Reset the timer for the scan timeout.
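The six steps above can be sketched as a small userspace model. This is
only an illustration, not kernel code: the struct names, the helper
queue_orphan_recovery(), the 600-second timeout, and the slot count are
all made up for the example, and the cluster lock is reduced to a flag
plus one LVB field. It also assumes the nodes' clocks are comparable.

```c
#include <assert.h>
#include <stdint.h>

#define SCAN_TIMEOUT 600u  /* seconds; assumed value for a many-minute period */
#define NUM_SLOTS    4u    /* assumed number of slots */

/* Hypothetical stand-in for the orphan_scan cluster lock and its LVB. */
struct orphan_scan_lock {
    int ex_held;             /* 1 while some node holds the EX */
    uint64_t lvb_last_scan;  /* last scan time stored in the LVB */
};

/* Hypothetical per-node state. */
struct node {
    uint64_t timer_expires;  /* absolute time at which the timer next fires */
    unsigned scans_queued;   /* scans this node has started */
    unsigned slots_queued;   /* slots handed to orphan recovery */
};

/*
 * Stand-in for calling ocfs2_queue_recovery_completion() on one slot
 * with NULL for the non-orphan-dir arguments. Here it only counts.
 */
static void queue_orphan_recovery(struct node *n, unsigned slot)
{
    (void)slot;
    n->slots_queued++;
}

/* What a node does when its orphan scan timer fires at time 'now'. */
static void orphan_scan_timer_fired(struct node *n,
                                    struct orphan_scan_lock *l,
                                    uint64_t now)
{
    uint64_t since;

    /* 1. Take EX; in this model nobody else may hold it. */
    assert(!l->ex_held);
    l->ex_held = 1;

    /* 2. Another node scanned recently: sleep for the remainder. */
    since = now - l->lvb_last_scan;
    if (since < SCAN_TIMEOUT) {
        n->timer_expires = now + (SCAN_TIMEOUT - since);
        l->ex_held = 0;
        return;
    }

    /* 3. Queue orphan recovery for every slot. */
    for (unsigned slot = 0; slot < NUM_SLOTS; slot++)
        queue_orphan_recovery(n, slot);
    n->scans_queued++;

    /* 4. Record the scan time in the LVB (at queue time, not completion). */
    l->lvb_last_scan = now;

    /* 5. Drop the EX (to NL in the real lock). */
    l->ex_held = 0;

    /* 6. Rearm for a full timeout. */
    n->timer_expires = now + SCAN_TIMEOUT;
}
```

In this model, if node A fires at t=1000 it scans and rearms for t=1600.
If node B then fires at t=1100 it sees a scan 100 seconds old, queues
nothing, and rearms for the remaining 500 seconds.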
Points about this scheme:
- Doesn't need a process.
- Don't need to change the locking protocol version, as older versions
just ignore this problem.
- Ensures only one node runs the scan each timeout period.
- Uses our existing orphan recovery code unchanged.
- We don't need to keep a PR on the orphan scan lock. It's just extra
network traffic and downconvert processing we don't care about.
Better to wake up once when our timeout fires than to wake up every
time another node goes to make a scan.
- I realize that I've updated the scan time when the scan is queued,
not when it completes. It doesn't really make much of a difference
with many-minute scan periods, and it is a lot simpler than trying to
add code to wait on all the orphans.
Joel
--
Life's Little Instruction Book #232
"Keep your promises."
Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127