[Ocfs2-tools-devel] [PATCH 37/39] ocfs2_controld: Open and close checkpoints

Joel Becker Joel.Becker at oracle.com
Thu May 29 12:06:54 PDT 2008


On Tue, May 27, 2008 at 05:49:52PM -0700, Mark Fasheh wrote:
> On Fri, Mar 14, 2008 at 04:53:00PM -0700, Joel Becker wrote:
> > +		if (write && (rc == -EEXIST))
> > +			log_debug("Checkpoint \"%.*s\" exists, retrying after delay",
> > +				  handle->ch_name.length,
> > +				  handle->ch_name.value);
> 
> Can '(write && (rc == -EEXIST))' happen during 'normal' operations - i.e., there isn't a
> serious error? If so, maybe we shouldn't count those in retrycount so we don't
> error prematurely?

	Only the first member can write.  Think O_CREAT|O_EXCL.  A node
knows whether it is the first member when it gets the CPG join event.
If it is the first member, it opens with write==1, and it MUST be able
to create the checkpoint.  Conversely, a non-first member opens with
write==0 and MUST find the checkpoint.
	So, if our process decides it is first, tries to create the
checkpoint, and gets EEXIST, that means another process somewhere has
already created it.  Now, if that other process is exiting, our retries
should allow us to eventually create the checkpoint.  But if that other
process never exits, we can't continue.
	The actual decision of write==1 is made in a later commit, so
perhaps that clarifies things.
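
	To make the O_CREAT|O_EXCL analogy concrete, here is a rough
sketch that models the checkpoint as a plain file opened with POSIX
open(2).  It is not the code from the patch: open_checkpoint_file(),
its parameters, and the file-based model are all made up for
illustration.  Only the write && EEXIST case is retried here; a reader
that does not find the file gets the error back immediately, which is a
simplification on my part.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper, not from the patch.  The "checkpoint" is modeled
 * as a plain file.  The first member (write != 0) must create it; any
 * other member (write == 0) must find it already present.
 * Returns a file descriptor on success, or -errno on failure.
 */
static int open_checkpoint_file(const char *path, int write,
				int max_retries, unsigned int delay_secs)
{
	/* O_CREAT|O_EXCL fails with EEXIST if someone else created it. */
	int flags = write ? (O_CREAT | O_EXCL | O_RDWR) : O_RDONLY;
	int retrycount = 0;
	int fd, rc;

	for (;;) {
		fd = open(path, flags, 0600);
		if (fd >= 0)
			return fd;
		rc = -errno;

		/*
		 * A writer seeing EEXIST may just be racing a previous
		 * creator that is still exiting, so wait and try again.
		 * Anything else is returned to the caller right away.
		 */
		if (!(write && rc == -EEXIST))
			return rc;

		/* The other creator never went away; give up. */
		if (++retrycount > max_retries)
			return rc;

		sleep(delay_secs);
	}
}
```

	A first member would call open_checkpoint_file(path, 1, ...) and
treat a final -EEXIST as fatal, while later members pass write == 0.
The retry limit bounds how long we wait for the old creator to exit.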

Joel

-- 

	Pitchers and catchers report.

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127


