[Btrfs-devel] btrfs and git-reflog

Theodore Tso tytso at MIT.EDU
Fri Jan 25 13:52:06 PST 2008

On Fri, Jan 25, 2008 at 04:37:25PM -0500, Chris Mason wrote:
> Hmmm, for seekdir and telldir, my understanding was that I was allowed to 
> return entries more than once, especially in the face of changes to the 
> directory in between calls.  

You are, but only for the files that have changed.  That is, if you
create a new directory entry, or remove a directory entry (in the
link() or unlink() sense of the word), it is undefined whether they
will be returned by readdir() zero, once, or twice, either as you
iterate through the readdir(), or if you use telldir/seekdir.

But for entries that were not changed, they must be returned once and
only once.  The problem with just using an inode number as the readdir
off_t, of course, is you can get in trouble even without deleting or
adding new files or hard links.

For example, if a particular inode, #12345, is hard linked three times,
as file names "foo", "bar", and "baz", and after readdir() has
returned "foo" and "bar" the application does a telldir()/seekdir()
pair, you're not supposed to return "foo" and "bar" again, but go
straight to "baz".  But if you use the inode number as the telldir
cookie, then you don't know whether to resume the readdir() stream
after "foo", "bar", or "baz".
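
To make the ambiguity concrete, here is a rough toy model in Python.
It is only a sketch: the directory is an in-memory list, and the names
ENTRIES, resume_points_by_inode() and read_from() are made up for
illustration, not taken from any filesystem.

```python
# Toy model: a directory is an ordered list of (name, inode) entries.
# "qux" and inode 777 are invented here just to have a non-linked file.
ENTRIES = [("foo", 12345), ("bar", 12345), ("baz", 12345), ("qux", 777)]


def resume_points_by_inode(entries, inode_cookie):
    """Every position a reader might resume from if the cookie is an inode.

    A position i means "continue with entries[i]".  One candidate is
    produced for each directory entry that points at the inode.
    """
    return [i + 1 for i, (_, ino) in enumerate(entries) if ino == inode_cookie]


def read_from(entries, pos):
    """Names that readdir() would return when resuming at position pos."""
    return [name for name, _ in entries[pos:]]


if __name__ == "__main__":
    # The application has read "foo" and "bar", then calls telldir().
    # With the inode number as the cookie, seekdir() has three candidates:
    print(resume_points_by_inode(ENTRIES, 12345))
    # With a cookie that names one directory entry (here: its position),
    # there is exactly one place to resume, right before "baz":
    print(read_from(ENTRIES, 2))
```

With the inode cookie the three candidates cannot be told apart, so
"foo" and "bar" may be returned again or "baz" may be skipped.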

> http://www.opengroup.org/onlinepubs/000095399/functions/seekdir.html
> Does the JFS cookie allow you to get from the file to the
> directory entry?  We've got that now via backrefs.

The JFS cookie is basically a stable identifier, whose size fits in an
off_t, which given a directory, will get you back to the directory
entry.  The requirement for a telldir() cookie is that it must be
stable until closedir().  The requirement for the NFSv2 readdir()
cookie is that it must be stable across reboots.  Hence, indexing the
32-bit telldir/NFS cookie in a separate btree which is persistent in
the filesystem meets both the requirements of telldir/seekdir and NFS.
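
As a rough sketch of the idea (not JFS's actual on-disk format, which
is not described here), a per-directory cookie index could look like
the following.  The class name CookieIndex, the rule that cookies are
never reused, and the use of 0 for "start of directory" are all
assumptions made for this example; a sorted Python list stands in for
the persistent btree.

```python
import bisect


class CookieIndex:
    """Hypothetical per-directory index: small integer cookie -> entry name."""

    MAX_COOKIE = 0xFFFFFFFF  # cookie must fit in 32 bits

    def __init__(self):
        self._cookies = []  # live cookies, kept sorted
        self._names = {}    # cookie -> directory entry name
        self._next = 1      # assumption: 0 means "start of directory"

    def add(self, name):
        """Create an entry and return its cookie; cookies are never reused."""
        cookie = self._next
        if cookie > self.MAX_COOKIE:
            raise OverflowError("out of 32-bit cookies")
        self._next += 1
        self._cookies.append(cookie)  # new cookie is the largest, stays sorted
        self._names[cookie] = name
        return cookie

    def remove(self, cookie):
        """Drop an entry; the cookies of all other entries do not change."""
        i = bisect.bisect_left(self._cookies, cookie)
        if i < len(self._cookies) and self._cookies[i] == cookie:
            del self._cookies[i]
            del self._names[cookie]

    def read_after(self, cookie=0):
        """Entries strictly after the given cookie, as (cookie, name) pairs."""
        i = bisect.bisect_right(self._cookies, cookie)
        return [(c, self._names[c]) for c in self._cookies[i:]]


if __name__ == "__main__":
    idx = CookieIndex()
    for name in ("foo", "bar", "baz"):
        idx.add(name)
    # Reader has consumed "foo" and "bar"; its telldir() cookie is 2.
    idx.remove(2)       # "bar" is unlinked in the meantime
    idx.add("new")      # and a new file is created
    print(idx.read_after(2))
```

Resuming from cookie 2 still works after "bar" is removed, and the
unchanged entry "baz" is returned exactly once.  Because the cookie
does not depend on any in-memory state, the same lookup would work
after a reboot if the index is stored in the filesystem.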

And just remember, "NFS: it's all Sun's fault."  :-)

						- Ted
