[Ocfs2-tools-commits] branch, master, updated. ocfs2-tools-1.4.0-339-ge940fd5

Thu Jul 23 15:15:35 PDT 2009

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "Tools to manage the ocfs2 filesystem.".

The branch, master has been updated
       via  e940fd59ae9193e918282debc6a2496d08f6c2f6 (commit)
       via  f4d81c5e80fc7b15bdbe6080fa4e0330a0241d71 (commit)
       via  35c59eccc60dfe9068cae3456d04076b6978c2ce (commit)
       via  21277dec029c5453847ebd77315a663eb317e80c (commit)
       via  bba44c86a1903dd702de9610fb4b992ed0c74078 (commit)
       via  5c361531106fc8e6dfc151ed4c3861d562f4b7e5 (commit)
       via  1fa5d9dea32caf99efb4e0811a48655f24938468 (commit)
       via  69223be5af7868605c5d681ad64ccf63838f4858 (commit)
       via  7fd354d5bd63370316088267fb9832800f4c9b53 (commit)
       via  1770929e5dfc1f85c8e3f89a1d7f0ecfbd284839 (commit)
       via  2acd6abfb2befce43890c8b2ebb6d2807ec58b77 (commit)
       via  d4704a8fd16d47cce1bf8994e17bd1330199daed (commit)
       via  c3f629da65c9371bde00d8ff797f575e28e35837 (commit)
       via  74ba73a56717e9c471e50aa10ec63b9754cd1d30 (commit)
       via  d52ba85de26e3c1588ecbc0b391036570e319834 (commit)
       via  a076fec7dddb201bbe638913f42e006c2bbe2c2e (commit)
       via  170ba4ca7ff71a680fafa2a7a4d4adcb920027cf (commit)
       via  af38aa872ddc432688eeb5eaafe918bea65c3584 (commit)
       via  f2e4c143ce8367411c16479a1107367330d39d3b (commit)
      from  4917103dceb6313ee2c9a550e3480b8e7b445aec (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
commit e940fd59ae9193e918282debc6a2496d08f6c2f6
Author: Joel Becker <joel.becker at oracle.com>
Date:   Mon Jun 8 21:52:48 2009 -0700

    fsck.ocfs2: Use ocfs2_cluster_bitmap_new() for allocated clusters.

    Use a memory bitmap that is based on fs_clusters instead of fs_blocks
    for allocated clusters.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>
    Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com>

commit f4d81c5e80fc7b15bdbe6080fa4e0330a0241d71
Author: Joel Becker <joel.becker at oracle.com>
Date:   Mon Jun 8 19:56:01 2009 -0700

    fsck.ocfs2: Fix the cluster count if we changed it in pass 0.

    At the start of pass 0, we fix the superblock's cluster count to match
    the global bitmap.  After we do that, we actually verify the global
    bitmap.  If we end up repairing the global bitmap in any way, we never
    check to see if its cluster count changed.

    This patch checks to see if the cluster count changed during repair.  If
    so, the user is prompted to trust the new values.  This triggers a
    reinit of the fsck state and a rescan of the global bitmap.  While the
    fixed bitmap should be fine, the rescan is necessary to pick up the
    clusters used.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>
    Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com>

commit 35c59eccc60dfe9068cae3456d04076b6978c2ce
Author: Joel Becker <joel.becker at oracle.com>
Date:   Mon Jun 8 18:25:10 2009 -0700

    fsck.ocfs2: Handle errors from ocfs2_bitmap_set/clear()

    o2fsck expects to be able to treat ocfs2_bitmap_set() and
    ocfs2_bitmap_clear() as void functions.  However, since o2fsck uses the
    sparse-memory bitmaps a lot, they can (in theory) have allocation
    failures.  A silent allocation failure looks like clear bits, which is
    not a safe behavior.

    So, let's wrap ocfs2_bitmap_set/clear() with o2fsck_bitmap_set/clear().
    These are true void functions.  If they get an error from
    ocfs2_bitmap_set/clear() they will print the caller, the error, and the
    bit number.  Then they abort o2fsck.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>
    Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com>
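
The wrapper pattern described in this commit can be sketched roughly as follows. This is a minimal illustration, not the o2fsck code: `toy_bitmap`, `toy_bitmap_set()`, `checked_bitmap_set()` and `CHECKED_BITMAP_SET()` are hypothetical stand-ins, and the errno-style return codes are an assumption standing in for whatever error type the real bitmap calls return.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a bitmap whose set operation can fail. */
struct toy_bitmap {
    uint64_t nbits;
    unsigned char *bits;    /* NULL models a failed backing allocation */
};

/* Returns 0 on success or an errno-style code (NOT the libocfs2 API). */
static int toy_bitmap_set(struct toy_bitmap *bm, uint64_t bitno, int *oldval)
{
    if (!bm->bits)
        return ENOMEM;
    if (bitno >= bm->nbits)
        return EINVAL;
    if (oldval)
        *oldval = (bm->bits[bitno / 8] >> (bitno % 8)) & 1;
    bm->bits[bitno / 8] |= (unsigned char)(1u << (bitno % 8));
    return 0;
}

/* True void wrapper: print the caller, the error and the bit, then abort. */
static void checked_bitmap_set(const char *caller, struct toy_bitmap *bm,
                               uint64_t bitno, int *oldval)
{
    int ret;

    assert(bm != NULL);
    ret = toy_bitmap_set(bm, bitno, oldval);
    if (ret) {
        fprintf(stderr, "%s: error %d while setting bit %llu\n",
                caller, ret, (unsigned long long)bitno);
        abort();
    }
}

/* Convenience macro so call sites record their own function name. */
#define CHECKED_BITMAP_SET(bm, bit, old) \
    checked_bitmap_set(__func__, (bm), (bit), (old))
```

A caller would write `CHECKED_BITMAP_SET(&bm, 9, NULL);` and never look at a return value; a silent failure can no longer masquerade as a clear bit.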

commit 21277dec029c5453847ebd77315a663eb317e80c
Author: Joel Becker <joel.becker at oracle.com>
Date:   Fri Jun 12 13:40:06 2009 -0700

    tunefs.ocfs2: Use one I/O cache.

    We nest ocfs2_filesys structures for each operation.  However, this can
    mean that the I/O cache on the master filesys is stale when a child
    filesys has written to its own cache.

    Instead, we'll create an I/O cache on the master filesys and share it in
    the children.  That way all operations have a consistent view of the
    disk.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>
    Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com>

commit bba44c86a1903dd702de9610fb4b992ed0c74078
Author: Joel Becker <joel.becker at oracle.com>
Date:   Fri Jun 12 13:27:48 2009 -0700

    tunefs.ocfs2: Size the cache appropriately for large operations.

    This introduces the flag TUNEFS_FLAG_LARGECACHE.  Operations that do not
    use this flag get a small I/O cache.  Operations that specify the flag
    get a cache as big as they can allocate, up to the size of the
    filesystem.  This should speed up operations that have to scan the whole
    disk in more than one pass.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>
    Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com>

commit 5c361531106fc8e6dfc151ed4c3861d562f4b7e5
Author: Joel Becker <joel.becker at oracle.com>
Date:   Fri Jun 12 00:58:43 2009 -0700

    tunefs.ocfs2: Don't use the I/O cache in unlocked or online operations.

    It's not safe to cache disk blocks when the filesystem may be live
    elsewhere.  So if we're running any of the "special" operations, ones
    that get the special error codes (SKIPCLUSTER, PERFORM_ONLINE, etc),
    we'll run without the cache.

    Because ocfs2ne runs the special operations first, then closes the
    filesystem and reopens it before trying regular operations, we know that
    regular operations will get the cache if they can safely lock the
    filesystem.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>
    Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com>

commit 1fa5d9dea32caf99efb4e0811a48655f24938468
Author: Joel Becker <joel.becker at oracle.com>
Date:   Fri May 22 16:50:22 2009 -0700

    fsck.ocfs2: Pre-cache dirblocks before we go through them.

    When we come out of pass 1, o2fsck has a sorted rbtree of dirblock
    addresses.  Pass 2 runs that list and checks each dirblock.  However,
    it currently reads them one block at a time.

    The basic operation of pass 2 is a simple loop that iterates the
    dirblocks in block number order.  It passes the dirblock to a callback
    that does the checking.  This callback reads the dirblock and the inode
    it belongs to.

    I tried three caching approaches:

    1) Walk the dirblocks, collecting adjacent ones into single I/Os.  Read
       them to pre-fill the cache.  When o2fsck_worth_caching() returns
       false, we know we've filled the cache with dirblocks.  Go ahead and
       process that many of them.  Then go back and read the next hunk of
       dirblocks.  Keep repeating this until all dirblocks are processed.

    2) The same as (1), except we pre-cache the inode associated with each
       dirblock as well.

    3) A simpler scheme where we just try to read the current dirblock and
       any adjacent ones following it.  Then we process those blocks.  So
       instead of "fill the cache, then process what's in the cache", this
       is "one read, then process what we read".

    Approach (1) was the clear winner.  Depending on the cache size, (3) was
    either identical to (1) or worse.  Approach (2) was just plain worse.
    I think this was due to the seek penalty of going off to get the inode
    while pre-caching.  Without getting the inode, all our reads are in
    ascending order.  Obviously approach (1) has to go get the inode during
    the processing phase, but that doesn't impact the pre-cache reads.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>
    Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com>
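
The "collect adjacent dirblocks into single I/Os" step of approach (1) might look something like the sketch below. It only shows the coalescing of a sorted block list into contiguous runs; the names `blk_run` and `coalesce_runs()` and the `max_run` cap are assumptions made for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One contiguous read: `count` blocks starting at block `start`. */
struct blk_run {
    uint64_t start;
    uint32_t count;
};

/*
 * Coalesce a sorted array of block numbers into contiguous runs, each
 * capped at max_run blocks.  Returns the number of runs written to `out`,
 * which must have room for nblks entries (the worst case).
 */
static size_t coalesce_runs(const uint64_t *blks, size_t nblks,
                            uint32_t max_run, struct blk_run *out)
{
    size_t nruns = 0;

    assert(max_run > 0);
    for (size_t i = 0; i < nblks; i++) {
        if (nruns > 0) {
            struct blk_run *last = &out[nruns - 1];

            /* Extend the current run if this block directly follows it. */
            if (blks[i] == last->start + last->count &&
                last->count < max_run) {
                last->count++;
                continue;
            }
        }
        /* Otherwise start a new run at this block. */
        out[nruns].start = blks[i];
        out[nruns].count = 1;
        nruns++;
    }
    return nruns;
}
```

Because the input is in ascending block order, the resulting reads are also issued in ascending order, which matches the observation above that avoiding the inode lookups keeps the pre-cache reads sequential.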

commit 69223be5af7868605c5d681ad64ccf63838f4858
Author: Joel Becker <joel.becker at oracle.com>
Date:   Fri May 22 13:18:51 2009 -0700

    fsck.ocfs2: Pre-cache inodes in reverse order.

    We want the first inodes seen by the inode scan to have a higher
    priority in the cache.  That way they aren't flushed from the cache by
    extent blocks.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>
    Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com>

commit 7fd354d5bd63370316088267fb9832800f4c9b53
Author: Joel Becker <joel.becker at oracle.com>
Date:   Thu May 21 13:55:00 2009 -0700

    fsck.ocfs2: Pre-fill the I/O cache with metadata.

    In pass0, we walk all of the suballocators to verify they look OK.  In
    the walk, we read each group descriptor.  Because each group is a linear
    hunk of disk, reading the entire group in one slurp is about the same
    amount of effort for the disk.  The big problem is the seek, not the
    data.  So with almost no impact to pass0, we now pre-fill the I/O cache
    with all of our inodes and metadata blocks.

    In pass1, this should mean almost everything is in cache if we had a big
    enough cache.  If we didn't, oh well.  The worst case is about identical
    to the uncached case.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>
    Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com>

commit 1770929e5dfc1f85c8e3f89a1d7f0ecfbd284839
Author: Joel Becker <joel.becker at oracle.com>
Date:   Thu May 21 13:15:55 2009 -0700

    fsck.ocfs2: Use the I/O cache.

    fsck.ocfs2 travels the filesystem multiple times.  The I/O cache should
    make this faster.  Since read-write fsck is only allowed when there are
    no other users or mounters of the device, the cache should be safe.

    We use two caches.  First, we allocate a cache big enough for all the
    journals.  Since we don't know their size at the start, we guess the
    default 256MB.  The hope is that we cache the journal blocks on the
    first pass when we check their contents and avoid having to re-read them
    on the second pass when we replay them.

    Once the journals are replayed, we drop this cache and try to allocate a
    cache equal to the number of blocks in the filesystem.  This should,
    hopefully, keep all of fsck in cache.

    We make sure to mlock() our cache, because it's pointless to swap out
    cache data; we'd rather just read it from the device.  Now, obviously,
    we can't allocate and lock more memory than the system has available.
    fsck will keep shrinking the cache size until it gets an allocation.

    For the main fsck operation, we don't just get the largest cache
    available.  We will need memory for the fsck accounting structures too.
    fsck will start with a cache _larger_ than needed.  If this
    succeeds, fsck knows that the needed size is safe to allocate.  fsck
    will actually use a cache smaller than the largest cache it could get,
    ensuring available memory.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>
    Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com>

commit 2acd6abfb2befce43890c8b2ebb6d2807ec58b77
Author: Joel Becker <joel.becker at oracle.com>
Date:   Tue May 26 15:29:35 2009 -0700

    mkfs.ocfs2: Keep the I/O cache across the journal format

    Now that the journal format knows not to pollute the cache, let's just
    keep the cache around.  While we're at it, make sure the cache is big
    enough to hold a suballocator and then some.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>
    Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com>

commit d4704a8fd16d47cce1bf8994e17bd1330199daed
Author: Joel Becker <joel.becker at oracle.com>
Date:   Thu May 21 13:12:16 2009 -0700

    libocfs2: Add io_mlock_cache().

    An I/O cache is pretty useless if it's actually being swapped out of
    RAM.  The io_mlock_cache() call allows a cache user to ensure their
    cache is in RAM.  We don't make it a default part of io_cache_init()
    because some users won't have the privileges to mlock.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>
    Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com>

commit c3f629da65c9371bde00d8ff797f575e28e35837
Author: Joel Becker <joel.becker at oracle.com>
Date:   Tue May 26 15:28:33 2009 -0700

    libocfs2: Don't cache I/O from journal format.

    When we're zeroing a newly formatted journal, we don't want to pollute
    the I/O cache with the zeros.  Set the io_channel to nocache for the
    operation.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>
    Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com>

commit 74ba73a56717e9c471e50aa10ec63b9754cd1d30
Author: Joel Becker <joel.becker at oracle.com>
Date:   Tue May 26 15:09:37 2009 -0700

    libocfs2: Allow a global nocache flag on io_channels.

    We've added _nocache() versions of the I/O functions so that smart
    callers can specify when certain I/Os should not pollute the I/O cache.
    However, not all code is smart.  Rather than teach
    ocfs2_file_read/write() to pass another nocache argument, let's give I/O
    channels the knowledge to skip caching.

    The io_set_nocache() function will set or clear a nocache flag on the
    channel.  While set, the channel will use the _nocache() functions for
    I/O (assuming a cache is there).  This preserves the qualities of the
    cache - it's always up to date - but will not pollute it with new
    blocks.  When finished, the caller can io_set_nocache(channel, false)
    and return to using the cache.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>
    Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com>

commit d52ba85de26e3c1588ecbc0b391036570e319834
Author: Joel Becker <joel.becker at oracle.com>
Date:   Fri May 22 11:26:38 2009 -0700

    libocfs2: Provide _nocache() versions of the I/O functions.

    Some I/O doesn't want to pollute the cache.  The _nocache() I/O
    functions will not add blocks to the cache.  If the blocks are already
    in the cache, they will make sure they are not broken.  For example, a
    write needs to update an already existing cache block so that the cache
    doesn't have stale data.  The blocks are not removed from the cache -
    they're already there, why make a reader go find them?  They get moved
    to the end of the LRU so that they get stolen first.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>
    Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com>

commit a076fec7dddb201bbe638913f42e006c2bbe2c2e
Author: Joel Becker <joel.becker at oracle.com>
Date:   Wed May 20 19:10:48 2009 -0700

    libocfs2: Large I/Os in the cache.

    Our I/O cache is dumb.  It works one block at a time.  We really want
    large I/Os to go out to the disk as large I/Os.

    We change the write case to write the I/O first, as big as it can.  Then
    it runs through each completed block and updates the cache.  If there
    was a short write, it will still update the cache for the blocks that
    were written.

    The read code has even more smarts.  First, it checks to see if the
    entire read is in cache.  If not, it does I/O from the start of the
    first uncached block; it skips cached blocks at the front of the buffer.
    Then it runs through each block and syncs the cache to the buffer.

    We do the reads in 1MB hunks.  This gives us the opportunity to check
    for cached blocks every megabyte.  Imagine a 10MB buffer with only one
    uncached block - the very first one.  Doing it all at once will trigger
    a 10MB read.  But doing it in 1MB hunks will read the first 1MB, then
    discover the remaining 9MB are all in cache.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>
    Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com>
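
The 10MB example can be checked with a small model that counts how many blocks would be read from disk for a given hunk size. This is a sketch of the described read policy only; the function name `blocks_read_from_disk()` is made up, and the 4KB block size used in the note after the code is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Count how many blocks would be read from disk when a request of nblocks
 * is served in hunks of hunk_blocks.  cached[i] says whether block i is
 * already in the cache.  Within a hunk, I/O starts at the first uncached
 * block and runs to the end of the hunk; a fully cached hunk costs nothing.
 */
static size_t blocks_read_from_disk(const bool *cached, size_t nblocks,
                                    size_t hunk_blocks)
{
    size_t total = 0;

    assert(hunk_blocks > 0);
    for (size_t start = 0; start < nblocks; start += hunk_blocks) {
        size_t end = start + hunk_blocks;
        size_t first = start;

        if (end > nblocks)
            end = nblocks;
        /* Skip blocks at the front of the hunk that are already cached. */
        while (first < end && cached[first])
            first++;
        total += end - first;
    }
    return total;
}
```

Assuming 4KB blocks, a 10MB buffer is 2560 blocks and a 1MB hunk is 256 blocks. With only block 0 uncached, a single 2560-block hunk reads all 2560 blocks, while 256-block hunks read just the first 256.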

commit 170ba4ca7ff71a680fafa2a7a4d4adcb920027cf
Author: Joel Becker <joel.becker at oracle.com>
Date:   Wed May 20 17:47:18 2009 -0700

    libocfs2: ocfs2_read_blocks() should return an errcode_t.

    It was returning -EIO instead of OCFS2_ET_IO.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>
    Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com>

commit af38aa872ddc432688eeb5eaafe918bea65c3584
Author: Joel Becker <joel.becker at oracle.com>
Date:   Wed May 20 17:45:44 2009 -0700

    libocfs2: Use ocfs2_read_blocks() in xattr.c

    Readers need to use ocfs2_read_blocks() so as to resolve image file
    reads.  xattr.c wasn't doing this.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>
    Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com>

commit f2e4c143ce8367411c16479a1107367330d39d3b
Author: Joel Becker <joel.becker at oracle.com>
Date:   Thu May 21 18:25:48 2009 -0700

    libocfs2: Catch memalign()s that will abort older glibcs.

    Older glibcs (before 2007/07, this includes the glibc in el5) would
    abort if __libc_memalign() couldn't allocate the memory.  That's
    obviously a bogus behavior, but we have to handle it.

    It's simple, though.  We try with malloc() first.  If that succeeds, we
    know the memory is there and retry with posix_memalign().

    Signed-off-by: Joel Becker <joel.becker at oracle.com>
    Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com>
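
The probe-then-align trick might be sketched like this. The helper name `probe_then_memalign()` is hypothetical; `malloc()`, `free()` and `posix_memalign()` are the standard calls. Note that the probe is only a heuristic: the memory could in principle disappear between the `free()` and the aligned allocation.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for posix_memalign() */
#endif
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Allocate `size` bytes aligned to `align`.  Returns 0 on success or an
 * errno-style code, like posix_memalign() itself.
 */
static int probe_then_memalign(void **out, size_t align, size_t size)
{
    void *probe;

    assert(out != NULL);

    /* Probe first: malloc() reports failure by returning NULL. */
    probe = malloc(size);
    if (!probe)
        return ENOMEM;
    free(probe);

    /* The memory was available a moment ago, so retry with alignment. */
    return posix_memalign(out, align, size);
}
```

A caller would use it as `ret = probe_then_memalign(&buf, 512, 4096);` and release the buffer with `free(buf)`.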

-----------------------------------------------------------------------

Summary of changes:
 fsck.ocfs2/dirblocks.c                   |   98 +++++++++-
 fsck.ocfs2/fsck.c                        |   10 +-
 fsck.ocfs2/fsck.ocfs2.checks.8.in        |    7 +
 fsck.ocfs2/icount.c                      |    4 +-
 fsck.ocfs2/include/dirblocks.h           |    5 +-
 fsck.ocfs2/include/util.h                |   27 +++
 fsck.ocfs2/journal.c                     |    2 +-
 fsck.ocfs2/pass0.c                       |  122 ++++++++++--
 fsck.ocfs2/pass1.c                       |    4 +-
 fsck.ocfs2/pass2.c                       |    3 +-
 fsck.ocfs2/util.c                        |  149 +++++++++++++-
 include/ocfs2/ocfs2.h                    |   29 +++-
 libocfs2/memory.c                        |   16 ++-
 libocfs2/mkjournal.c                     |    6 +-
 libocfs2/openfs.c                        |   28 +++-
 libocfs2/unix_io.c                       |  331 ++++++++++++++++++++++++------
 libocfs2/xattr.c                         |    4 +-
 mkfs.ocfs2/mkfs.c                        |   37 +---
 tunefs.ocfs2/feature_inline_data.c       |    3 +-
 tunefs.ocfs2/feature_metaecc.c           |    3 +-
 tunefs.ocfs2/feature_sparse_files.c      |    3 +-
 tunefs.ocfs2/feature_unwritten_extents.c |    3 +-
 tunefs.ocfs2/feature_xattr.c             |    3 +-
 tunefs.ocfs2/libocfs2ne.c                |  104 +++++++++-
 tunefs.ocfs2/libocfs2ne.h                |    2 +
 25 files changed, 863 insertions(+), 140 deletions(-)

hooks/post-receive
-- 
Tools to manage the ocfs2 filesystem.