[Ocfs2-commits] branch, master, updated. 21b8b1ccc3e9f592d6d377d4856aff49834c9a25

Thu May 27 05:15:22 PDT 2010

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "The ocfs2 filesystem version 1.8".

The branch, master has been updated
       via  21b8b1ccc3e9f592d6d377d4856aff49834c9a25 (commit)
       via  1932fca067c587c42beb28727c292be8835709ea (commit)
       via  917df45e8efaa7a5735e83407b0bf0a092b7c5d5 (commit)
       via  b4b1bc8b979ab4deae7348a934ce20155d154439 (commit)
       via  e0e80ac0f73a8eccdc18a477b1256cd79647ca52 (commit)
       via  198e1e13a16d886fe44deb358f9713fc6d1e1929 (commit)
       via  0d8cb4276ecceabc35f4c41415ddf09e2ed093d8 (commit)
       via  3584e8b1f8aa0c297c0bda3afc0f381503415512 (commit)
       via  8a0b89801584350ef30a1dcc1c46b9a8be0f4083 (commit)
       via  7a198393f7c0583c88b8f739a954ea3235dc9458 (commit)
       via  c67b59d160f6687a10b4a31c9a58f1063be7203f (commit)
       via  37c224a5c5801cb4d4ccb27c31fb3577e931d332 (commit)
       via  d6e15f7f12bbf9f255f6020e4885e8c841ad25a0 (commit)
       via  e23ea106a8a238eee89702647460d02a34584169 (commit)
       via  372ddc6a4fcf8fe84a172333e91fc281428fa3a8 (commit)
       via  55bc3c2dd13467906e5b0fd38f00bf683620bc68 (commit)
       via  4289dd361add6c5796b8594286a4e2612ca6d839 (commit)
       via  aca2998a9e6c9c8993a02141b7c5aa18d0436683 (commit)
       via  f3be433b793d692dcd194e9bef45eb990d6474df (commit)
       via  9a46f92f31117ccd53c9558e64fc62c4913918b2 (commit)
       via  6a1a53820774189cb8d785e05945104d91b2219e (commit)
       via  0b0d3265d4c03a00c2c9e334992a5fa4ec0f549f (commit)
       via  77831f50a7fd7f3322b240add78f2a7dd54d1c98 (commit)
       via  3c517f300703118dcc93e9c39c61b2eba481c39c (commit)
       via  e3c8b411f77cc49d2e8015150ee527c5d53726ab (commit)
       via  f28e479d7792ed5faef906c26c4936f5cd5df804 (commit)
       via  384c78f272382ed4f79069bd1f64d9c985a23ae9 (commit)
       via  d8a774f04f9f6e9dfc97aef4ec83c58ad5e8a5f4 (commit)
       via  413241afe563b769de5de5976d08351c1cc64f99 (commit)
       via  f61c0787954f9f52a670b3ade6f2e73d2c7c05df (commit)
       via  324afb87c8546ade595290e43e82becbaadcbe3a (commit)
       via  83ac21f02063804985f3c81f583fb899f2543455 (commit)
       via  e92996ec2c84954879db9e5f7bb5cdc09d3b2a7d (commit)
       via  8f73255baf65f6b95d8f77d63c161f6be97fb3fe (commit)
       via  bfb072f48af237e2d311b8a987293cf7b9d849fb (commit)
       via  7d63d40ab1391e09a97b1ae574f6097e86e37324 (commit)
       via  19e3773cb40eae7dd250d6b4e4b10c1ead5c8e6d (commit)
       via  11b6e02d928e14cb6780d204db4c128cca07c19d (commit)
       via  db3dc08afd40bd753ce2ac32cfea881e3ab7e35d (commit)
       via  a5d4d41e670885b2fb883affdb10119a07fff035 (commit)
       via  726dce29fddb0f3db85f55af6ff57b4d49e4d0e8 (commit)
       via  3472a36100d48b43567b82c4b0ca469b9203b8d4 (commit)
       via  4b457cc8ebed8677cb07732365873d5b7ccaf130 (commit)
       via  b2b5d6bb8e43d4b40301d7887fa8ae4b2500723c (commit)
       via  103a71b6e46528326c603be8cdd2b1f13277c62f (commit)
       via  788c0675bc48bcdf02417279b9b6b82e8336b467 (commit)
       via  e39f4e8413d69b70c82562d9f59d19734a467e5c (commit)
      from  441ae21dc39a0c3d8649753ad75cfd12c66945cb (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
commit 21b8b1ccc3e9f592d6d377d4856aff49834c9a25
Author: Joel Becker <joel.becker at oracle.com>
Date:   Tue May 18 16:47:55 2010 -0700

    ocfs2: Silence a gcc warning.

    Mainline commit 18d3a98f3c1b0e27ce026afa4d1ef042f2903726

    ocfs2_block_group_claim_bits() is never called with min_bits=0, but we
    shouldn't leave status undefined if it ever is.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>

commit 1932fca067c587c42beb28727c292be8835709ea
Author: Tao Ma <tao.ma at oracle.com>
Date:   Thu May 13 22:49:05 2010 +0800

    ocfs2: Don't retry xattr set in case value extension fails.

    Mainline commit 5f5261acb059f43c7fb9a2fac9d32c6ef4df2ed5

    In a normal xattr set, the set sequence is inode, xattr block
    and finally xattr bucket if we hit ENOSPC. But there
    is a corner case.
    Consider setting an xattr whose value will be stored in
    a cluster when there is no xattr block yet. We will
    reserve 1 xattr block and 1 cluster for setting it. Now if we
    fail in value extension (in case the volume is almost full and
    we can't allocate the cluster because of the check in
    ocfs2_test_bg_bit_allocatable), ENOSPC will be returned. So
    we will try to create a bucket (this time there is a chance that
    the reserved cluster will be used), and when we try value extension
    again, a kernel bug happens. We did hit it. See the bug below.
    http://oss.oracle.com/bugzilla/show_bug.cgi?id=1251

    This patch tries to avoid this by adding a set_abort field in
    ocfs2_xattr_set_ctxt, so in case ENOSPC happens in value extension,
    we will check whether it is caused by a real ENOSPC or just by a
    full inode or xattr block. If it is the first case, we set set_abort
    so that we don't try any further. We are safe to exit directly here
    since it is really ENOSPC.

    Signed-off-by: Tao Ma <tao.ma at oracle.com>
    Signed-off-by: Joel Becker <joel.becker at oracle.com>
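
    The retry sequence and the abort flag described above can be modelled in
    user space. This is only a sketch under stated assumptions: every name
    below (toy_set_ctxt, toy_xattr_set, the toy_* step functions) is
    hypothetical and is not the ocfs2 code. Each step stands in for one
    storage location (inode, xattr block, bucket).

```c
#include <errno.h>

/* Hypothetical stand-in for the set context; only the flag matters here. */
struct toy_set_ctxt {
	int set_abort;		/* set when ENOSPC is the real, volume-wide kind */
	int bucket_tries;	/* counts how often the bucket step was reached */
};

typedef int (*toy_set_fn)(struct toy_set_ctxt *ctxt);

/* Location is merely full: report ENOSPC and let the caller move on. */
int toy_location_full(struct toy_set_ctxt *ctxt)
{
	(void)ctxt;
	return -ENOSPC;
}

/* Value extension failed because the volume is really out of space. */
int toy_real_enospc(struct toy_set_ctxt *ctxt)
{
	ctxt->set_abort = 1;
	return -ENOSPC;
}

/* Last resort: the bucket step succeeds whenever it is reached. */
int toy_bucket(struct toy_set_ctxt *ctxt)
{
	ctxt->bucket_tries++;
	return 0;
}

/* Try each location in order; fall through to the next one only when
 * the failure is ENOSPC and set_abort has not been raised. */
int toy_xattr_set(toy_set_fn *steps, int nsteps, struct toy_set_ctxt *ctxt)
{
	int i, ret = -ENOSPC;

	for (i = 0; i < nsteps; i++) {
		ret = steps[i](ctxt);
		if (ret != -ENOSPC || ctxt->set_abort)
			break;
	}
	return ret;
}
```

    With the steps { toy_location_full, toy_real_enospc, toy_bucket } the
    bucket step is never reached and -ENOSPC is returned directly.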

commit 917df45e8efaa7a5735e83407b0bf0a092b7c5d5
Author: Wengang Wang <wen.gang.wang at oracle.com>
Date:   Mon May 17 20:20:44 2010 +0800

    ocfs2:dlm: avoid dlm->ast_lock lockres->spinlock dependency break

    Mainline commit d9ef75221a6247b758e1d7e18edb661996e4b7cf

    Currently we process a dirty lockres with the lockres->spinlock taken. During
    that processing, we may need to take dlm->ast_lock. This breaks the lock
    ordering of dlm->ast_lock (taken first) and lockres->spinlock (taken second).

    This patch fixes the problem.
    Since we can't release lockres->spinlock, we have to take dlm->ast_lock
    just before taking the lockres->spinlock and release it after lockres->spinlock
    is released. We also use __dlm_queue_bast()/__dlm_queue_ast(), the nolock versions,
    in dlm_shuffle_lists(). There are not too many locks on a lockres, so there is no
    performance harm.

    Signed-off-by: Wengang Wang <wen.gang.wang at oracle.com>
    Signed-off-by: Joel Becker <joel.becker at oracle.com>

commit b4b1bc8b979ab4deae7348a934ce20155d154439
Author: Tao Ma <tao.ma at oracle.com>
Date:   Mon May 10 18:09:47 2010 +0800

    ocfs2: Reset xattr value size after xa_cleanup_value_truncate().

    Mainline commit d5a7df0649fa6a1e7800785d760e2c7d7a3204de

    In ocfs2_prepare_xattr_entry, if we fail to grow an existing value,
    xa_cleanup_value_truncate() will leave the old entry in place.  Thus, we
    reset its value size.  However, if we were allocating a new value, we
    must not reset the value size or we will BUG().  This resolves
    oss.oracle.com bug 1247.

    Signed-off-by: Tao Ma <tao.ma at oracle.com>
    Signed-off-by: Joel Becker <joel.becker at oracle.com>

commit e0e80ac0f73a8eccdc18a477b1256cd79647ca52
Author: Julia Lawall <julia at diku.dk>
Date:   Fri May 14 21:30:48 2010 +0200

    fs/ocfs2/dlm: Use kstrdup

    Mainline commit 316ce2ba8e74a7bb9153b9f93adc883cb1ceb9fd

    Use kstrdup when the goal of an allocation is to copy a string into the
    allocated region.

    The semantic patch that makes this change is as follows:
    (http://coccinelle.lip6.fr/)

    // <smpl>
    @@
    expression from,to;
    expression flag,E1,E2;
    statement S;
    @@

    -  to = kmalloc(strlen(from) + 1,flag);
    +  to = kstrdup(from, flag);
       ... when != \(from = E1 \| to = E1 \)
       if (to==NULL || ...) S
       ... when != \(from = E2 \| to = E2 \)
    -  strcpy(to, from);
    // </smpl>

    Signed-off-by: Julia Lawall <julia at diku.dk>
    Signed-off-by: Joel Becker <joel.becker at oracle.com>
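
    For readers outside the kernel, the shape of this transformation can be
    sketched in user space. This is an illustrative analogue only: dup_manual
    and dup_helper are hypothetical names, and malloc()/strcpy() stand in for
    kmalloc()/kstrdup().

```c
#include <stdlib.h>
#include <string.h>

/* Before: size the buffer, allocate, check, then copy by hand. */
char *dup_manual(const char *from)
{
	char *to = malloc(strlen(from) + 1);

	if (to == NULL)
		return NULL;
	strcpy(to, from);
	return to;
}

/* After: a single helper owns sizing, allocation and copy, so call
 * sites shrink to one line (in the kernel: to = kstrdup(from, flag)). */
char *dup_helper(const char *from)
{
	size_t len = strlen(from) + 1;
	char *to = malloc(len);

	if (to != NULL)
		memcpy(to, from, len);
	return to;
}
```

    Both functions return a freshly allocated copy or NULL; the second form
    removes the chance of a mismatched size or a forgotten copy at the call
    site.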

commit 198e1e13a16d886fe44deb358f9713fc6d1e1929
Author: Julia Lawall <julia at diku.dk>
Date:   Tue May 11 20:28:14 2010 +0200

    fs/ocfs2/dlm: Drop memory allocation cast

    Mainline commit 3914ed0cec6532ab4feb202424fc95ad05024497

    Drop cast on the result of kmalloc and similar functions.

    The semantic patch that makes this change is as follows:
    (http://coccinelle.lip6.fr/)

    // <smpl>
    @@
    type T;
    @@

    - (T *)
      (\(kmalloc\|kzalloc\|kcalloc\|kmem_cache_alloc\|kmem_cache_zalloc\|
       kmem_cache_alloc_node\|kmalloc_node\|kzalloc_node\)(...))
    // </smpl>

    Signed-off-by: Julia Lawall <julia at diku.dk>
    Signed-off-by: Joel Becker <joel.becker at oracle.com>

commit 0d8cb4276ecceabc35f4c41415ddf09e2ed093d8
Author: Tristan Ye <tristan.ye at oracle.com>
Date:   Tue May 11 17:54:45 2010 +0800

    Ocfs2: Optimize punching-hole code.

    Mainline commit c1631d4a484fbb498e35d661f1aebd64c86b66bf

    This patch simplifies the logic of handling existing holes and
    skipping extent blocks and removes some confusing comments.

    The patch survived the fill_verify_holes testcase in ocfs2-test.
    It also passed my manual sanity check and stress tests with enormous
    extent records.

    Currently punching a hole on a file with 3+ extent tree depth was
    really a performance disaster.  It can even take several hours,
    though we may not hit this in real life with such a huge extent
    number.

    One simple way to improve the performance is quite straightforward.
    From the logic of truncate, we can punch the hole from hole_end to
    hole_start, which reduces the overhead of btree operations in a
    significant way, such as tree rotation and moving.

    Following is the test result when punching a hole from 0 to the end of a
    1G file. The file consists of 256k extent records, each covering 4k of
    data (just one cluster; the cluster size is 4k):

    ===========================================================================
     * Original punching-hole mechanism:
    ===========================================================================

       I waited 1 hour for it to complete; unfortunately it was still running.

    ===========================================================================
     * Patched punching-hole mechanism:
    ===========================================================================

       real 0m2.518s
       user 0m0.000s
       sys  0m2.445s

    That means we've gained up to a 1000-fold performance improvement in this
    case, whee! It's fairly cool, and it looks like the performance gain will
    grow as the number of extent records grows.

    The patch is based on my previous 2 patches, which optimized the truncate
    code and fixed up CoW handling when punching holes.

    Signed-off-by: Tristan Ye <tristan.ye at oracle.com>
    Acked-by: Mark Fasheh <mfasheh at suse.com>
    Signed-off-by: Joel Becker <joel.becker at oracle.com>
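
    Why removing from hole_end back to hole_start is cheaper can be seen in a
    toy model. The sketch below is an analogy, not the ocfs2 B-tree code: it
    uses a flat array of records, and all function names are hypothetical.
    Removing a record shifts every later record left, much as removing a
    leftmost extent forces the rest of the tree to be rearranged.

```c
#include <stddef.h>

/* Remove the record at index idx from an array of n records and return
 * how many records had to be shifted left to close the gap. */
size_t remove_record(unsigned int *recs, size_t n, size_t idx)
{
	size_t i, moved = 0;

	for (i = idx; i + 1 < n; i++) {
		recs[i] = recs[i + 1];
		moved++;
	}
	return moved;
}

/* Remove all n records, always taking the first one; returns total shifts. */
size_t punch_from_start(unsigned int *recs, size_t n)
{
	size_t total = 0;

	while (n > 0) {
		total += remove_record(recs, n, 0);
		n--;
	}
	return total;
}

/* Remove all n records, always taking the last one; returns total shifts. */
size_t punch_from_end(unsigned int *recs, size_t n)
{
	size_t total = 0;

	while (n > 0) {
		total += remove_record(recs, n, n - 1);
		n--;
	}
	return total;
}
```

    For 8 records the front-to-back order costs 7+6+...+0 = 28 shifts, while
    the back-to-front order costs none; the gap widens quadratically as the
    record count grows.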

commit 3584e8b1f8aa0c297c0bda3afc0f381503415512
Author: Tristan Ye <tristan.ye at oracle.com>
Date:   Tue May 11 17:54:44 2010 +0800

    Ocfs2: Make ocfs2_find_cpos_for_left_leaf() public.

    Mainline commit ee149a7c6cbaee0e3a1a7d9e9f92711228ef5236

    The original reason for pulling ocfs2_find_cpos_for_left_leaf() out of
    alloc.c was to benefit the punching-holes optimization patch. However, it
    can also be used by other functions in the future that want to do the
    same job.

    Signed-off-by: Tristan Ye <tristan.ye at oracle.com>
    Acked-by: Mark Fasheh <mfasheh at suse.com>
    Signed-off-by: Joel Becker <joel.becker at oracle.com>

commit 8a0b89801584350ef30a1dcc1c46b9a8be0f4083
Author: Tristan Ye <tristan.ye at oracle.com>
Date:   Tue May 11 17:54:43 2010 +0800

    Ocfs2: Fix hole punching to correctly do CoW during cluster zeroing.

    Mainline commit e8aec068ecb1957630816cfa2c150c6b3ddd1790

    Based on the previous patch that optimized truncate, the bugfix for
    refcount trees when punching holes is fairly easy and straightforward,
    since most of the work we need to do for refcounting has already been
    done in ocfs2_remove_btree_range().

    This patch performs CoW for refcounted extents when the hole being punched
    has a start or end offset in the middle of a cluster, which means
    partial zeroing of the cluster will be performed soon.

    The patch has been tested and fixes the following bug:

    http://oss.oracle.com/bugzilla/show_bug.cgi?id=1216

    Signed-off-by: Tristan Ye <tristan.ye at oracle.com>
    Acked-by: Mark Fasheh <mfasheh at suse.com>
    Signed-off-by: Joel Becker <joel.becker at oracle.com>

commit 7a198393f7c0583c88b8f739a954ea3235dc9458
Author: Tristan Ye <tristan.ye at oracle.com>
Date:   Tue May 11 17:54:42 2010 +0800

    Ocfs2: Optimize ocfs2 truncate to use ocfs2_remove_btree_range() instead.

    Mainline commit 78f94673d7faf01677f374f4ebbf324ff1a0aa6e

    Truncate is just a special case of punching holes (from the new i_size to
    the end), so we can take advantage of the existing
    ocfs2_remove_btree_range() to reduce the complexity and redundancy in
    alloc.c.  The goal here is to make truncate more generic and
    straightforward.

    Several functions only used by ocfs2_commit_truncate() will simply be
    removed.

    ocfs2_remove_btree_range() was originally used by the hole punching
    code, which didn't take refcount trees into account (definitely a bug).
    We therefore need to change that func a bit to handle refcount trees.
    It must take the refcount lock, calculate and reserve blocks for
    refcount tree changes, and decrease refcounts at the end.  We replace
    ocfs2_lock_allocators() here by adding a new func
    ocfs2_reserve_blocks_for_rec_trunc() which accepts some extra blocks to
    reserve.  This will not hurt any other code using
    ocfs2_remove_btree_range() (such as dir truncate and hole punching).

    I merged the following steps into one patch since they may be
    logically doing one thing, though I know it looks a little bit fat
    to review.

    1). Remove redundant code used by ocfs2_commit_truncate(), since we're
        moving to ocfs2_remove_btree_range anyway.

    2). Add a new func ocfs2_reserve_blocks_for_rec_trunc() for purpose of
        accepting some extra blocks to reserve.

    3). Change ocfs2_prepare_refcount_change_for_del() a bit to fit our
        needs.  It's safe to do this since it's only being called by
        truncate.

    4). Change ocfs2_remove_btree_range() a bit to take refcount case into
        account.

    5). Finally, we change ocfs2_commit_truncate() to call
        ocfs2_remove_btree_range() in a proper way.

    The patch has passed normal sanity checks; stress tests with heavier
    workloads are still expected.

    Based on this patch, fixing the punching holes bug will be fairly easy.

    Signed-off-by: Tristan Ye <tristan.ye at oracle.com>
    Acked-by: Mark Fasheh <mfasheh at suse.com>
    Signed-off-by: Joel Becker <joel.becker at oracle.com>

commit c67b59d160f6687a10b4a31c9a58f1063be7203f
Author: Joel Becker <joel.becker at oracle.com>
Date:   Thu May 27 19:12:33 2010 +0800

    ocfs2: Block signals for mkdir/link/symlink/O_CREAT.

    Mainline commit 547ba7c8efe43c2cabb38782e23572a6179dd1c1

    Once file or link creation gets going, it can't be interrupted by a
    signal.  They're not idempotent.

    This blocks signals in ocfs2_mknod(), ocfs2_link(), and ocfs2_symlink()
    once we start actually changing things.  ocfs2_mknod() covers mknod(),
    creat(), mkdir(), and open(O_CREAT).

    Signed-off-by: Joel Becker <joel.becker at oracle.com>

commit 37c224a5c5801cb4d4ccb27c31fb3577e931d332
Author: Joel Becker <joel.becker at oracle.com>
Date:   Wed Sep 2 17:17:36 2009 -0700

    ocfs2: Wrap signal blocking in void functions.

    Mainline commit e4b963f10e9026c83419b5c25b93a0350413cf16

    ocfs2 sometimes needs to block signals around dlm operations, but it
    currently does it with sigprocmask().  Even worse, it's checking the
    error code of sigprocmask().  The in-kernel sigprocmask() can only error
    if you get the SIG_* argument wrong.  We don't.

    Wrap the sigprocmask() calls with ocfs2_[un]block_signals().  These
    functions are void, but they will BUG() if somehow sigprocmask() returns
    an error.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>
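
    The wrapper pattern described above has a direct user-space counterpart.
    The following is a minimal sketch using the POSIX sigprocmask(), with
    abort() standing in for BUG(); block_signals and unblock_signals are
    hypothetical names, not the ocfs2 functions.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>

/* Block all signals and save the previous mask in *oldset.
 * The function is void: with a valid SIG_* argument an error cannot
 * legitimately happen, so treat one as a programming bug. */
void block_signals(sigset_t *oldset)
{
	sigset_t blocked;

	sigfillset(&blocked);
	if (sigprocmask(SIG_BLOCK, &blocked, oldset) != 0)
		abort();
}

/* Restore the mask saved by block_signals(). */
void unblock_signals(const sigset_t *oldset)
{
	if (sigprocmask(SIG_SETMASK, oldset, NULL) != 0)
		abort();
}
```

    Callers bracket the non-interruptible region with the two calls and no
    longer carry an error path for something that cannot fail.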

commit d6e15f7f12bbf9f255f6020e4885e8c841ad25a0
Author: Tao Ma <tao.ma at oracle.com>
Date:   Thu Mar 18 15:54:22 2010 +0800

    ocfs2: enable discontig block group support.

    Mainline commit 1a934c3e57594588c373aea858e4593cdfcba4f4

    Signed-off-by: Tao Ma <tao.ma at oracle.com>

commit e23ea106a8a238eee89702647460d02a34584169
Author: Tao Ma <tao.ma at oracle.com>
Date:   Tue Apr 27 08:30:36 2010 +0800

    ocfs2: Set ac_last_group properly with discontig group.

    Mainline commit abf1b3cb5b20fbad27ca9c7497235eeb4dd3f4fd

    ac_last_group is used to record the last block group we
    used during allocation. But the initialization process
    only calls ocfs2_which_suballoc_group and fails to
    use suballoc_loc properly. So let us do it.
    Another function, ocfs2_test_suballoc_bit, also needs the fix.

    I have checked all the callers of ocfs2_which_suballoc_group,
    and all of them now take suballoc_loc into account.

    Signed-off-by: Tao Ma <tao.ma at oracle.com>

commit 372ddc6a4fcf8fe84a172333e91fc281428fa3a8
Author: Tao Ma <tao.ma at oracle.com>
Date:   Mon Mar 22 14:20:18 2010 +0800

    ocfs2: Free block to the right block group.

    Mainline commit 74380c479ad83addeff8a172ab95f59557b5b0c3

    In case the block we are going to free was allocated from
    a discontiguous block group, we have to use suballoc_loc
    to locate the right group.

    Signed-off-by: Tao Ma <tao.ma at oracle.com>

commit 55bc3c2dd13467906e5b0fd38f00bf683620bc68
Author: Tao Ma <tao.ma at oracle.com>
Date:   Mon May 17 15:14:17 2010 +0800

    ocfs2: Add ocfs2_gd_is_discontig.

    Mainline commit af2bf0d86019e0b0306965321096f8380b7ca830

    Add ocfs2_gd_is_discontig so that we can test whether
    a group descriptor is discontiguous or not.

    Signed-off-by: Tao Ma <tao.ma at oracle.com>

commit 4289dd361add6c5796b8594286a4e2612ca6d839
Author: Tao Ma <tao.ma at oracle.com>
Date:   Tue Apr 13 14:38:06 2010 +0800

    ocfs2: ocfs2_group_bitmap_size has to handle old volume.

    Mainline commit 8571882c21e5073b2f96147ec4ff9b7042339e1b

    ocfs2_group_bitmap_size has to handle the case when the
    volume doesn't have discontiguous block group support. So
    pass feature_incompat in and check it.

    Signed-off-by: Tao Ma <tao.ma at oracle.com>

commit aca2998a9e6c9c8993a02141b7c5aa18d0436683
Author: Tao Ma <tao.ma at oracle.com>
Date:   Thu Apr 22 14:09:15 2010 +0800

    ocfs2: Some tiny bug fixes for discontiguous block allocation.

    Mainline commit 4711954eaa8d30f653fda238cecf919f1ae40d6f

    The fixes include:
    1. some endian problems.
    2. we should use bit/bpc in ocfs2_block_group_grow_discontig to
       allocate clusters.
    3. set num_clusters properly in __ocfs2_claim_clusters.
    4. change name from ocfs2_supports_discontig_bh to
       ocfs2_supports_discontig_bg.

    Signed-off-by: Tao Ma <tao.ma at oracle.com>

commit f3be433b793d692dcd194e9bef45eb990d6474df
Author: Joel Becker <joel.becker at oracle.com>
Date:   Fri Mar 26 10:10:08 2010 +0800

    ocfs2: Don't relink cluster groups when allocating discontig block groups

    Mainline commit 95ec0adf0b56d6a3f0ca1ec87173311898486b2e

    We don't have enough credits, and the filesystem is in a full state
    anyway.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>

commit 9a46f92f31117ccd53c9558e64fc62c4913918b2
Author: Joel Becker <joel.becker at oracle.com>
Date:   Fri Mar 26 10:09:29 2010 +0800

    ocfs2: Grow discontig block groups in one transaction.

    Mainline commit 8b06bc592ebc5a31e8d0b9c2ab17c6e78dde1f86

    Rather than extending the transaction every time we add an extent to a
    discontiguous block group, we grab enough credits to fill the extent
    list up front.  This means we can free the bits in the same transaction
    if we end up not getting enough space.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>

commit 6a1a53820774189cb8d785e05945104d91b2219e
Author: Joel Becker <joel.becker at oracle.com>
Date:   Fri Mar 26 10:09:15 2010 +0800

    ocfs2: Set suballoc_loc on allocated metadata.

    Mainline commit 2b6cb576aa80611f1f6a3c88708d1e68a8d97985

    Get the suballoc_loc from ocfs2_claim_new_inode() or
    ocfs2_claim_metadata().  Store it on the appropriate field of the block
    we just allocated.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>

commit 0b0d3265d4c03a00c2c9e334992a5fa4ec0f549f
Author: Joel Becker <joel.becker at oracle.com>
Date:   Fri Mar 26 10:08:59 2010 +0800

    ocfs2: Return allocated metadata blknos on the ocfs2_suballoc_result.

    Mainline commit ba2066351b630f0205ebf725f5c81a2a07a77cd7

    Rather than calculating the resulting block number, return it on the
    ocfs2_suballoc_result structure.  This way we can calculate block
    numbers for discontiguous block groups.

    Cluster groups keep doing it the old way.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>

commit 77831f50a7fd7f3322b240add78f2a7dd54d1c98
Author: Joel Becker <joel.becker at oracle.com>
Date:   Thu May 6 13:59:06 2010 +0800

    ocfs2: ocfs2_claim_*() don't need an ocfs2_super argument.

    Mainline commit 1ed9b777f77929ae961d6f9cdf828a07200ba71c

    They all take an ocfs2_alloc_context, which has the allocation inode.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>
    Signed-off-by: Tao Ma <tao.ma at oracle.com>

commit 3c517f300703118dcc93e9c39c61b2eba481c39c
Author: Joel Becker <joel.becker at oracle.com>
Date:   Fri Mar 26 10:08:27 2010 +0800

    ocfs2: Trim suballocations if they cross discontiguous regions

    Mainline commit 13e434cf0cacd2f03a7f4cd077e3e995ef5ef710

    A discontiguous block group can find a range of free bits that straddle
    more than one region of its space.  Callers can't handle that, so we
    trim the returned bits until they fit within one region.

    Only cluster allocations ask for min_bits>1.  Discontiguous block groups
    are only for block allocations.  So min_bits doesn't matter here.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>
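
    The trimming rule can be sketched as follows. This is an assumption-laden
    illustration, not the ocfs2 implementation: trim_to_region is a
    hypothetical name, and the group is modelled as regions laid out back to
    back in one bitmap, with region_bits[i] giving the length of region i in
    bits.

```c
#include <stddef.h>

/* Trim the free range [start, start + count) so that it stays inside
 * the region that contains `start`.  Returns the trimmed count, or 0
 * if `start` lies beyond the last region. */
unsigned int trim_to_region(const unsigned int *region_bits, size_t nregions,
			    unsigned int start, unsigned int count)
{
	unsigned int region_start = 0;
	size_t i;

	for (i = 0; i < nregions; i++) {
		unsigned int region_end = region_start + region_bits[i];

		if (start < region_end) {
			/* Range straddles the boundary: cut it at region_end. */
			if (start + count > region_end)
				count = region_end - start;
			return count;
		}
		region_start = region_end;
	}
	return 0;
}
```

    With regions of 8, 8 and 16 bits, a range of 4 bits starting at bit 6 is
    trimmed to 2 bits, while the same range starting at bit 8 is returned
    unchanged.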

commit e3c8b411f77cc49d2e8015150ee527c5d53726ab
Author: Joel Becker <joel.becker at oracle.com>
Date:   Fri Mar 26 10:08:07 2010 +0800

    ocfs2: ocfs2_claim_suballoc_bits() doesn't need an osb argument.

    Mainline commit aa8f8e93c898a0319bcd6c79a9a42fe52abac7d7

    It's contained on ac->ac_inode->i_sb anyway.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>

commit f28e479d7792ed5faef906c26c4936f5cd5df804
Author: Joel Becker <joel.becker at oracle.com>
Date:   Fri Mar 26 10:07:42 2010 +0800

    ocfs2: Add suballoc_loc to metadata blocks.

    Mainline commit 9cbc01231e82f9390edaea2b766abcb7165dc4b2

    We need a suballoc_loc field on any suballocated block.  Define them.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>

commit 384c78f272382ed4f79069bd1f64d9c985a23ae9
Author: Joel Becker <joel.becker at oracle.com>
Date:   Tue Apr 13 14:30:19 2010 +0800

    ocfs2: Pass suballocation results back via a structure.

    Mainline commit 7d1fe093bf04124dcc50c5dde1765bd098464bfa

    We're going to be adding more info to a suballocator allocation.  Rather
    than growing every function in the chain, let's pass a result structure
    around.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>
    Signed-off-by: Tao Ma <tao.ma at oracle.com>

commit d8a774f04f9f6e9dfc97aef4ec83c58ad5e8a5f4
Author: Joel Becker <joel.becker at oracle.com>
Date:   Tue Apr 13 14:26:32 2010 +0800

    ocfs2: Allocate discontiguous block groups.

    Mainline commit 798db35f4649eac2778381c390ed7d12de9ec767

    If we cannot get a contiguous region for a block group, allocate a
    discontiguous one when the filesystem supports it.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>
    Signed-off-by: Tao Ma <tao.ma at oracle.com>

commit 413241afe563b769de5de5976d08351c1cc64f99
Author: Joel Becker <joel.becker at oracle.com>
Date:   Tue Apr 13 14:26:12 2010 +0800

    ocfs2: Define data structures for discontiguous block groups.

    Mainline commit 4cbe4249d6586d5d88ef271e07302407a14c8443

    Defines the OCFS2_FEATURE_INCOMPAT_DISCONTIG_BG feature bit and modifies
    struct ocfs2_group_desc for the feature.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>
    Signed-off-by: Tao Ma <tao.ma at oracle.com>

commit f61c0787954f9f52a670b3ade6f2e73d2c7c05df
Author: Sunil Mushran <sunil.mushran at oracle.com>
Date:   Wed May 5 16:25:08 2010 -0700

    ocfs2/dlm: Increase o2dlm lockres hash size

    Mainline commit 0467ae954d1843de65e7cf8f706f88fe65cd8418

    Lockres hash size of 16KB is far too small for large filesystems (where we
    have hundreds of thousands of lock resources stored in the table).
    This patch increases it to 128KB.

    Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com>
    Signed-off-by: Joel Becker <joel.becker at oracle.com>
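
    A rough feel for the effect comes from simple arithmetic. The sketch
    below assumes each hash bucket is a single pointer-sized list head and
    that lock resources spread evenly over the buckets; both are assumptions
    for illustration, and the function names are hypothetical.

```c
#include <stddef.h>

/* Number of buckets that fit in a table of table_bytes bytes,
 * assuming one pointer-sized list head per bucket. */
size_t hash_buckets(size_t table_bytes)
{
	return table_bytes / sizeof(void *);
}

/* Average chain length (rounded up) for nr_lockres entries spread
 * evenly over the table. */
size_t avg_chain_len(size_t nr_lockres, size_t table_bytes)
{
	size_t buckets = hash_buckets(table_bytes);

	return (nr_lockres + buckets - 1) / buckets;
}
```

    With 8-byte pointers, 16KB gives 2048 buckets and 128KB gives 16384. For
    an assumed 300000 lock resources, the average chain shrinks from about
    147 entries to about 19.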

commit 324afb87c8546ade595290e43e82becbaadcbe3a
Author: Tao Ma <tao.ma at oracle.com>
Date:   Mon Apr 26 14:34:57 2010 +0800

    ocfs2: Make ocfs2_extend_trans() really extend.

    Mainline commit c901fb00731e307c2c6e8c7d5eee005df5835f9d

    In ocfs2, we use ocfs2_extend_trans() to extend a journal handle's
    blocks. But if jbd2_journal_extend() fails, it will only restart
    with the new number of blocks.  This tends to be awkward since
    in most cases we want additional reserved blocks. It makes our code
    harder to maintain since the caller can't be sure all the original
    blocks will not be accessed and dirtied again.  There are 15 callers
    of ocfs2_extend_trans() in fs/ocfs2, and 12 of them have to add
    h_buffer_credits before they call ocfs2_extend_trans().  This makes
    ocfs2_extend_trans() really extend atop the original block count.

    Signed-off-by: Tao Ma <tao.ma at oracle.com>
    Signed-off-by: Joel Becker <joel.becker at oracle.com>
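
    The difference between the old and new semantics can be shown with a toy
    model. This sketch is not the jbd2 or ocfs2 code: toy_handle, extend_old
    and extend_new are hypothetical, and the extend_ok argument simulates
    whether the in-place extend succeeded.

```c
#include <stdbool.h>

struct toy_handle {
	int credits;	/* stand-in for the handle's buffer credits */
};

/* Old semantics: when the in-place extend fails, the handle is
 * restarted with only nblocks, so the original credits are lost. */
void extend_old(struct toy_handle *h, int nblocks, bool extend_ok)
{
	if (extend_ok)
		h->credits += nblocks;
	else
		h->credits = nblocks;
}

/* New semantics: either path leaves the caller with old + nblocks. */
void extend_new(struct toy_handle *h, int nblocks, bool extend_ok)
{
	int old = h->credits;

	if (extend_ok)
		h->credits += nblocks;
	else
		h->credits = old + nblocks;
}
```

    Starting from 5 credits and asking for 3 more, a failed extend leaves 3
    credits under the old behaviour and 8 under the new one, which is why
    callers no longer need to add the existing credits themselves.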

commit 83ac21f02063804985f3c81f583fb899f2543455
Author: Tao Ma <tao.ma at oracle.com>
Date:   Tue Apr 6 16:46:46 2010 +0800

    ocfs2/trivial: Code cleanup for allocation reservation.

    Mainline commit 3e4218df3176657be72ad2fa199779be6c11fe4f

    Two tiny cleanups for allocation reservation.
    1. Remove some extra code in ocfs2_local_alloc_find_clear_bits.
    2. Remove a useless variable in ocfs2_find_resv_lhs.

    Signed-off-by: Tao Ma <tao.ma at oracle.com>
    Acked-by: Mark Fasheh <mfasheh at suse.com>
    Signed-off-by: Joel Becker <joel.becker at oracle.com>

commit e92996ec2c84954879db9e5f7bb5cdc09d3b2a7d
Author: Tao Ma <tao.ma at oracle.com>
Date:   Thu Apr 8 16:33:02 2010 +0800

    ocfs2: make ocfs2_adjust_resv_from_alloc simple.

    Mainline commit b065556a7d1a9205403db77a318a5c5aa530e701

    When we allocate some bits from the reservation, we always
    allocate from the r_start(see ocfs2_resmap_resv_bits).
    So there should be no reason to check between r_start
    and start. And I don't think we will change this behaviour
    later by allocating from some bits after r_start.  Why not make
    ocfs2_adjust_resv_from_alloc simple for now?

    The only chance we have to adjust the reservation is when we haven't
    reached the end. With this patch, the function is more readable.

    Note:
    btw, this patch also fixes an existing bug in the function
    which I hadn't noticed before.
    	if (end < ocfs2_resv_end(resv))
    		rhs = end - ocfs2_resv_end(resv);
    This code is of course buggy. ;)

    Signed-off-by: Tao Ma <tao.ma at oracle.com>
    Acked-by: Mark Fasheh <mfasheh at suse.com>
    Signed-off-by: Joel Becker <joel.becker at oracle.com>
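
    Why the quoted snippet is buggy can be shown in isolation. The sketch
    assumes the values are unsigned int; rhs_buggy and rhs_swapped are
    hypothetical names, and the swapped operand order is only one plausible
    reading of what was meant (the patch itself removes the code instead).

```c
/* Form quoted in the commit message: the branch is taken only when
 * end < resv_end, so end - resv_end always wraps around. */
unsigned int rhs_buggy(unsigned int end, unsigned int resv_end)
{
	unsigned int rhs = 0;

	if (end < resv_end)
		rhs = end - resv_end;
	return rhs;
}

/* Operands swapped: the distance from end to resv_end. */
unsigned int rhs_swapped(unsigned int end, unsigned int resv_end)
{
	unsigned int rhs = 0;

	if (end < resv_end)
		rhs = resv_end - end;
	return rhs;
}
```

    For end=10 and resv_end=15 the swapped form yields 5, while the quoted
    form yields a huge value equal to (unsigned int)-5.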

commit 8f73255baf65f6b95d8f77d63c161f6be97fb3fe
Author: Sunil Mushran <sunil.mushran at oracle.com>
Date:   Tue Apr 13 18:00:31 2010 -0700

    ocfs2: Make nointr a default mount option

    Mainline commit 4b37fcb7d41ce3b9264b9562d6ffd62db9294bd1

    OCFS2 has never really supported intr. This patch acknowledges this reality
    and makes nointr the default mount option. In a later patch, we intend to
    support intr.

    Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com>
    Signed-off-by: Joel Becker <joel.becker at oracle.com>

commit bfb072f48af237e2d311b8a987293cf7b9d849fb
Author: Sunil Mushran <sunil.mushran at oracle.com>
Date:   Tue Apr 13 18:00:30 2010 -0700

    ocfs2/dlm: Make o2dlm domain join/leave messages KERN_NOTICE

    Mainline commit 5c80d4c9e5489d5930412add87501702fe5f93fb

    o2dlm join and leave messages are more than informational as they are
    required for debugging locking issues. This patch changes them from
    KERN_INFO to KERN_NOTICE.

    Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com>
    Signed-off-by: Joel Becker <joel.becker at oracle.com>

commit 7d63d40ab1391e09a97b1ae574f6097e86e37324
Author: Srinivas Eeda <srinivas.eeda at oracle.com>
Date:   Wed Mar 31 14:32:29 2010 -0700

    o2net: log socket state changes

    Mainline commit 23fd9abdc8f63c72fe3324e83d454ccecedaec37

    This patch logs socket state changes that lead to socket shutdown.

    Signed-off-by: Srinivas Eeda <srinivas.eeda at oracle.com>
    Signed-off-by: Joel Becker <joel.becker at oracle.com>

commit 19e3773cb40eae7dd250d6b4e4b10c1ead5c8e6d
Author: Wengang Wang <wen.gang.wang at oracle.com>
Date:   Tue Mar 30 12:09:22 2010 +0800

    ocfs2: print node # when tcp fails

    Mainline commit a5196ec5ef80309fd390191c548ee1f2e8a327ee

    Print the node number of a peer node if sending it a message failed.

    Signed-off-by: Wengang Wang <wen.gang.wang at oracle.com>
    Signed-off-by: Joel Becker <joel.becker at oracle.com>

commit 11b6e02d928e14cb6780d204db4c128cca07c19d
Author: Mark Fasheh <mfasheh at suse.com>
Date:   Mon Apr 5 18:17:16 2010 -0700

    ocfs2: Add dir_resv_level mount option

    Mainline commit 83f92318fa33cc084e14e64dc903e605f75884c1

    The default behavior for directory reservations stays the same, but we add a
    mount option so people can tweak the size of directory reservations
    according to their workloads.

    Signed-off-by: Mark Fasheh <mfasheh at suse.com>
    Signed-off-by: Joel Becker <joel.becker at oracle.com>

commit db3dc08afd40bd753ce2ac32cfea881e3ab7e35d
Author: Mark Fasheh <mfasheh at suse.com>
Date:   Mon Apr 5 18:17:15 2010 -0700

    ocfs2: change default reservation window sizes

    Mainline commit b07f8f24dfe54da0f074b78949044842e8df881f

    The default reservation size of 4 (32-bit windows) is a bit too ambitious.
    Scale it back to 16 bits (resv_level=2). I have been testing various sizes
    on a 4-node cluster which runs a mixed workload that is heavily threaded.
    With a 256MB local alloc, I get *roughly* the following levels of average file
    fragmentation:

    resv_level=0	70%
    resv_level=1	21%
    resv_level=2	23%
    resv_level=3	24%
    resv_level=4	60%
    resv_level=5	did not test
    resv_level=6	60%

    resv_level=2 seemed like a good compromise between not letting windows be
    too small, but not so big that heavier workloads will immediately suffer
    without tuning.

    This patch also changes the behavior of directory reservations - they now
    track file reservations.  The previous compromise of giving directory
    windows only 8 bits wound up fragmenting more at some window sizes because
    file allocations had smaller unused windows to poach from.

    Signed-off-by: Mark Fasheh <mfasheh at suse.com>
    Signed-off-by: Joel Becker <joel.becker at oracle.com>

commit a5d4d41e670885b2fb883affdb10119a07fff035
Author: Mark Fasheh <mfasheh at suse.com>
Date:   Mon Apr 5 18:17:14 2010 -0700

    ocfs2: increase the default size of local alloc windows

    Mainline commit 6b82021b9e91cd689fdffadbcdb9a42597bbe764

    I have observed that the current size of 8M gives us pretty poor
    fragmentation on multi-threaded workloads which do lots of writes.

    Generally, I can increase the size of local alloc windows and observe a
    marked decrease in fragmentation, even up to and beyond window sizes of 512
    megabytes. This makes sense for a couple of reasons - larger local alloc means
    more room for reservation windows. On multi-node workloads the larger local
    alloc helps as well because we don't have to do window slides as often.

    Also, I removed the OCFS2_DEFAULT_LOCAL_ALLOC_SIZE constant as it is no
    longer used and the comment above it was out of date.

    To test fragmentation, I used a workload which launched 4 threads that did
    4k writes into a series of about 140 alternating files.

    With resv_level=2, and a 4k/4k file system I observed the following average
    fragmentation for various localalloc= parameters:

    localalloc= (MB)    avg. fragmentation
    8                   48
    32                  16
    64                  10
    120                 7

    On larger cluster sizes, the difference is more dramatic.

    The new default size tops out at 256M, which we'll only get for cluster
    sizes of 32K and above.

    Signed-off-by: Mark Fasheh <mfasheh at suse.com>
    Signed-off-by: Joel Becker <joel.becker at oracle.com>

commit 726dce29fddb0f3db85f55af6ff57b4d49e4d0e8
Author: Mark Fasheh <mfasheh at suse.com>
Date:   Mon Apr 5 18:17:13 2010 -0700

    ocfs2: clean up localalloc mount option size parsing

    Mainline commit 73c8a80003d13be54e2309865030404441075182

    This patch pulls the local alloc sizing code into localalloc.c and provides
    a callout to it from ocfs2_fill_super(). Behavior is essentially unchanged
    except that I correctly calculate the maximum local alloc size. The old code
    in ocfs2_parse_options() calculated the max size as:

    ocfs2_local_alloc_size(sb) * 8

    which is correct, in bits. Unfortunately, though, the option passed in is in
    megabytes. Ultimately, this bug made no real difference - the shrink code
    would catch a too-large size and bring it down to something reasonable.
    Still, it's less than efficient as-is.

    Signed-off-by: Mark Fasheh <mfasheh at suse.com>
    Signed-off-by: Joel Becker <joel.becker at oracle.com>
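
The unit mismatch described in the commit above (a maximum computed in bits, compared against a mount option given in megabytes) can be illustrated with a small userspace sketch. This is not the code from localalloc.c. The helper names `la_bits_to_megabytes`, `la_max_megabytes` and `la_clamp_request_mb` are hypothetical, and the sketch assumes that each bit of the local alloc bitmap tracks one cluster and that the cluster size is at most 1MB.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a number of local alloc bitmap bits into
 * megabytes. Assumes one bit per cluster, with the cluster size given as
 * a power of two (clustersize_bits = 12 means 4K clusters).
 */
static uint64_t la_bits_to_megabytes(uint64_t bits, unsigned int clustersize_bits)
{
	/* This sketch only handles cluster sizes up to 1MB (2^20 bytes). */
	assert(clustersize_bits <= 20);
	/* bits * 2^clustersize_bits bytes, divided by 2^20 bytes per MB. */
	return bits >> (20 - clustersize_bits);
}

/*
 * Hypothetical helper: the largest local alloc size, in megabytes, that a
 * bitmap of bitmap_bytes bytes can describe. bitmap_bytes * 8 is the
 * capacity in bits; it still has to be converted before it can be
 * compared with a value given in megabytes.
 */
static uint64_t la_max_megabytes(uint64_t bitmap_bytes, unsigned int clustersize_bits)
{
	uint64_t max_bits = bitmap_bytes * 8;

	return la_bits_to_megabytes(max_bits, clustersize_bits);
}

/* Hypothetical helper: clamp a requested size (MB) to the real maximum (MB). */
static uint64_t la_clamp_request_mb(uint64_t requested_mb, uint64_t bitmap_bytes,
				    unsigned int clustersize_bits)
{
	uint64_t max_mb = la_max_megabytes(bitmap_bytes, clustersize_bits);

	return requested_mb > max_mb ? max_mb : requested_mb;
}
```

With a 1024-byte bitmap and 4K clusters, the bitmap holds 8192 bits, which corresponds to 32MB. Comparing a requested 256 against the raw bit count 8192 would let the request through, while the converted limit clamps it to 32.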

commit 3472a36100d48b43567b82c4b0ca469b9203b8d4
Author: Mark Fasheh <mfasheh at suse.com>
Date:   Tue Mar 16 21:01:00 2010 -0700

    ocfs2: remove ocfs2_local_alloc_in_range()

    Mainline commit a57c8fd2ad238258cc983049008aea5f985804b2

    Inodes are always allocated from the global bitmap now so we don't need this
    any more. Also, the existing implementation bounces reservations around
    needlessly.

    Signed-off-by: Mark Fasheh <mfasheh at suse.com>

commit 4b457cc8ebed8677cb07732365873d5b7ccaf130
Author: Mark Fasheh <mfasheh at suse.com>
Date:   Wed Feb 24 13:34:09 2010 -0800

    ocfs2: allocate btree internal block groups from the global bitmap

    Mainline commit 33d5d380d667ad264675cfdb297dfc3c5b6542cc

    Otherwise, the need for a very large contiguous allocation tends to
    wreak havoc on many inode allocation reservations on the local alloc, thus
    ruining any chances for contiguousness.

    Signed-off-by: Mark Fasheh <mfasheh at suse.com>

commit b2b5d6bb8e43d4b40301d7887fa8ae4b2500723c
Author: Mark Fasheh <mfasheh at suse.com>
Date:   Mon Dec 7 13:16:07 2009 -0800

    ocfs2: use allocation reservations for directory data

    Mainline commit e3b4a97dbe9741a3227c3ed857a0632532fcd386

    Use the reservations system for unindexed dir tree allocations. We don't
    bother with the indexed tree as reads from it are mostly random anyway.
    Directory reservations are marked separately, to allow the reservations code
    a chance to optimize their window sizes. This patch allocates only 8 bits
    for directory windows as they generally are not expected to grow as quickly
    as file data. Future improvements to dir window sizing can trivially be
    made.

    Signed-off-by: Mark Fasheh <mfasheh at suse.com>

commit 103a71b6e46528326c603be8cdd2b1f13277c62f
Author: Mark Fasheh <mfasheh at suse.com>
Date:   Thu May 27 18:19:23 2010 +0800

    ocfs2: use allocation reservations during file write

    Mainline commit 4fe370afaae49c57619bb0bedb75de7e7c168308

    Add a per-inode reservations structure and pass it through to the
    reservations code.

    Signed-off-by: Mark Fasheh <mfasheh at suse.com>

commit 788c0675bc48bcdf02417279b9b6b82e8336b467
Author: Mark Fasheh <mfasheh at suse.com>
Date:   Thu May 27 17:04:39 2010 +0800

    ocfs2: allocation reservations

    Mainline commit d02f00cc057809d96c044cc72d5b9809d59f7d49

    This patch improves Ocfs2 allocation policy by allowing an inode to
    reserve a portion of the local alloc bitmap for itself. The reserved
    portion (allocation window) is advisory in that other allocation
    windows might steal it if the local alloc bitmap becomes
    full. Otherwise, the reservations are honored and guaranteed to be
    free. When the local alloc window is moved to a different portion of
    the bitmap, existing reservations are discarded.

    Reservation windows are represented internally by a red-black
    tree. Within that tree, each node represents the reservation window of
    one inode. An LRU of active reservations is also maintained. When new
    data is written, we allocate it from the inode's window. When all bits
    in a window are exhausted, we allocate a new one as close to the
    previous one as possible. Should we not find free space, an existing
    reservation is pulled off the LRU and cannibalized.

    Signed-off-by: Mark Fasheh <mfasheh at suse.com>
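
The allocation policy described in the commit above can be sketched as a toy userspace model. It is only an illustration under simplifying assumptions, not the code in reservations.c. It uses a plain array with linear scans in place of the red-black tree, per-inode use stamps in place of a real LRU list, and fixed-size windows aligned to slots. All names (`toy_map`, `toy_alloc_bit`, and so on) are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define MAP_BITS   32			/* size of the toy local alloc bitmap */
#define WIN_BITS   8			/* fixed window size for this sketch */
#define NSLOTS     (MAP_BITS / WIN_BITS)
#define MAX_INODES 8

struct toy_resv {
	int slot;			/* window slot owned by the inode, -1 if none */
	unsigned long last_use;		/* stamp standing in for the LRU list */
};

struct toy_map {
	unsigned char used[MAP_BITS];	/* 1 = bit already allocated */
	struct toy_resv resv[MAX_INODES];
	unsigned long clock;
};

static void toy_init(struct toy_map *m)
{
	memset(m, 0, sizeof(*m));
	for (int i = 0; i < MAX_INODES; i++)
		m->resv[i].slot = -1;
}

/* Which inode has reserved this slot? Returns -1 if it is unreserved. */
static int slot_owner(const struct toy_map *m, int slot)
{
	for (int i = 0; i < MAX_INODES; i++)
		if (m->resv[i].slot == slot)
			return i;
	return -1;
}

/* First free bit inside a slot, or -1 if the slot is exhausted. */
static int slot_free_bit(const struct toy_map *m, int slot)
{
	for (int b = slot * WIN_BITS; b < (slot + 1) * WIN_BITS; b++)
		if (!m->used[b])
			return b;
	return -1;
}

/* Pick a new window for ino: the nearest unreserved slot, else steal the LRU one. */
static int toy_new_window(struct toy_map *m, int ino)
{
	int prev = m->resv[ino].slot < 0 ? 0 : m->resv[ino].slot;
	int victim = -1;

	/* Search outwards from the previous window for unreserved free space. */
	for (int dist = 0; dist < NSLOTS; dist++) {
		int cand[2] = { prev + dist, prev - dist };

		for (int k = 0; k < 2; k++) {
			int s = cand[k];

			if (s < 0 || s >= NSLOTS || slot_free_bit(m, s) < 0)
				continue;
			if (slot_owner(m, s) < 0)
				return s;
		}
	}

	/* Nothing unreserved: cannibalize the least recently used window. */
	for (int i = 0; i < MAX_INODES; i++) {
		int s = m->resv[i].slot;

		if (i == ino || s < 0 || slot_free_bit(m, s) < 0)
			continue;
		if (victim < 0 || m->resv[i].last_use < m->resv[victim].last_use)
			victim = i;
	}
	if (victim < 0)
		return -1;

	int stolen = m->resv[victim].slot;
	m->resv[victim].slot = -1;	/* the victim loses its reservation */
	return stolen;
}

/* Allocate one bit for inode ino; returns the bit number or -1 if the map is full. */
static int toy_alloc_bit(struct toy_map *m, int ino)
{
	assert(ino >= 0 && ino < MAX_INODES);

	int slot = m->resv[ino].slot;
	int bit = slot < 0 ? -1 : slot_free_bit(m, slot);

	if (bit < 0) {			/* no window yet, or window exhausted */
		slot = toy_new_window(m, ino);
		m->resv[ino].slot = slot;
		if (slot < 0)
			return -1;
		bit = slot_free_bit(m, slot);
	}
	m->used[bit] = 1;
	m->resv[ino].last_use = ++m->clock;
	return bit;
}
```

In this model, interleaved writes from two inodes stay contiguous per inode because each one allocates from its own window. Once every slot is reserved, a new inode takes over the window of the inode that allocated least recently.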

commit e39f4e8413d69b70c82562d9f59d19734a467e5c
Author: Joel Becker <joel.becker at oracle.com>
Date:   Thu May 27 17:27:22 2010 +0800

    ocfs2: Make ocfs2_journal_dirty() void.

    Mainline commit ec20cec7a351584ca6c70ead012e73d61f9a8e04

    jbd[2]_journal_dirty_metadata() only returns 0.  It's been returning 0
    since before the kernel moved to git.  There is no point in checking
    this error.

    ocfs2_journal_dirty() has been faithfully returning the status since the
    beginning.  All over ocfs2, we have blocks of code checking this
    can't-fail status.  In the past few years, we've tried to avoid adding these
    checks, because they are pointless.  But anyone who looks at our code
    assumes they are needed.

    Finally, ocfs2_journal_dirty() is made a void function.  All error
    checking is removed from other files.  We'll BUG_ON() the status of
    jbd2_journal_dirty_metadata() just in case they change it someday.  They
    won't.

    Signed-off-by: Joel Becker <joel.becker at oracle.com>
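
The pattern described in the commit above (a void wrapper that asserts on a status which is never expected to be nonzero) might look roughly like this in userspace. It is a minimal sketch, not the ocfs2 code. `fake_journal_dirty_metadata` is a hypothetical stand-in for jbd2_journal_dirty_metadata(), `sketch_journal_dirty` is a hypothetical stand-in for ocfs2_journal_dirty(), and `assert()` plays the role of the kernel's BUG_ON().

```c
#include <assert.h>

/*
 * Hypothetical stand-in for jbd2_journal_dirty_metadata(). Per the commit
 * message, the real function only ever returns 0.
 */
static int fake_journal_dirty_metadata(int *dirty_flag)
{
	*dirty_flag = 1;	/* mark the (toy) buffer dirty */
	return 0;
}

/*
 * Void wrapper: callers get no status to check. The status is still
 * verified in one place in case the callee's contract ever changes.
 */
static void sketch_journal_dirty(int *dirty_flag)
{
	int status = fake_journal_dirty_metadata(dirty_flag);

	/* Userspace stand-in for BUG_ON(status). */
	assert(status == 0);
}
```

Callers then shrink from a call plus an error-handling branch to a single statement such as `sketch_journal_dirty(&flag);`.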

-----------------------------------------------------------------------

Summary of changes:
 fs/ocfs2/Makefile          |    4 +-
 fs/ocfs2/alloc.c           |  908 +++++++++++---------------------------------
 fs/ocfs2/alloc.h           |   12 +-
 fs/ocfs2/aops.c            |    3 +
 fs/ocfs2/cluster/masklog.c |    1 +
 fs/ocfs2/cluster/masklog.h |    1 +
 fs/ocfs2/cluster/tcp.c     |    3 +
 fs/ocfs2/dir.c             |   75 ++---
 fs/ocfs2/dlm/dlmast.c      |    8 +-
 fs/ocfs2/dlm/dlmcommon.h   |    4 +-
 fs/ocfs2/dlm/dlmconvert.c  |    4 +-
 fs/ocfs2/dlm/dlmdomain.c   |   28 +-
 fs/ocfs2/dlm/dlmlock.c     |    6 +-
 fs/ocfs2/dlm/dlmmaster.c   |   30 +-
 fs/ocfs2/dlm/dlmrecovery.c |   27 +-
 fs/ocfs2/dlm/dlmthread.c   |   16 +-
 fs/ocfs2/dlm/dlmunlock.c   |    3 +-
 fs/ocfs2/file.c            |  215 +++++++++---
 fs/ocfs2/inode.c           |   44 +--
 fs/ocfs2/inode.h           |    2 +
 fs/ocfs2/journal.c         |   26 +-
 fs/ocfs2/journal.h         |   15 +-
 fs/ocfs2/localalloc.c      |  275 ++++++++++----
 fs/ocfs2/localalloc.h      |    3 +
 fs/ocfs2/mmap.c            |   48 +--
 fs/ocfs2/namei.c           |   91 ++---
 fs/ocfs2/ocfs2.h           |   22 +
 fs/ocfs2/ocfs2_fs.h        |  144 ++++++--
 fs/ocfs2/quota_global.c    |    4 +-
 fs/ocfs2/quota_local.c     |   50 +--
 fs/ocfs2/refcounttree.c    |   74 ++---
 fs/ocfs2/refcounttree.h    |    4 +-
 fs/ocfs2/reservations.c    |  847 +++++++++++++++++++++++++++++++++++++++++
 fs/ocfs2/reservations.h    |  159 ++++++++
 fs/ocfs2/resize.c          |   19 +-
 fs/ocfs2/suballoc.c        |  688 ++++++++++++++++++++++-----------
 fs/ocfs2/suballoc.h        |   21 +-
 fs/ocfs2/super.c           |   88 ++++-
 fs/ocfs2/super.h           |    7 +
 fs/ocfs2/xattr.c           |  103 ++---
 40 files changed, 2590 insertions(+), 1492 deletions(-)
 create mode 100644 fs/ocfs2/reservations.c
 create mode 100644 fs/ocfs2/reservations.h

hooks/post-receive
-- 
The ocfs2 filesystem version 1.8