[Ocfs2-devel] [PATCH 13/15] Enable xattr set in index btree. v3

Mark Fasheh mfasheh at suse.com
Tue Aug 12 13:52:25 PDT 2008


On Tue, Aug 12, 2008 at 09:09:49AM +0800, Tao Ma wrote:
> 
> 
> Mark Fasheh wrote:
> >On Thu, Aug 07, 2008 at 06:31:33AM +0800, Tao Ma wrote:
> >>+/*
> >>+ * Add a new cluster for xattr storage.
> >>+ *
> >>+ * If the new cluster is contiguous with the previous one, it will be
> >>+ * appended to the same extent record, and num_clusters will be updated.
> >>+ * If not, we will insert a new extent for it and move some xattrs in
> >>+ * the last cluster into the new allocated one.
> >>+ * We also need to limit the maximum size of a btree leaf, otherwise we'll
> >>+ * lose the benefits of hashing because we'll have to search large leaves.
> >>+ * So now the maximum size is OCFS2_MAX_XATTR_TREE_LEAF_SIZE(or clustersize,
> >>+ * if it's bigger).
> >>+ *
> >>+ * first_bh is the first block of the previous extent rec and header_bh
> >>+ * indicates the bucket we will insert the new xattrs. They will be updated
> >>+ * when the header_bh is moved into the new cluster.
> >>+ */
> >>+static int ocfs2_add_new_xattr_cluster(struct inode *inode,
> >>+				       struct buffer_head *root_bh,
> >>+				       struct buffer_head **first_bh,
> >>+				       struct buffer_head **header_bh,
> >>+				       u32 *num_clusters,
> >>+				       u32 prev_cpos,
> >>+				       u64 prev_blkno,
> >>+				       int *extend)
> >>+{
> >>+	int ret, credits;
> >>+	u16 bpc = ocfs2_clusters_to_blocks(inode->i_sb, 1);
> >>+	u32 prev_clusters = *num_clusters;
> >>+	u32 clusters_to_add = 1, bit_off, num_bits, v_start = 0;
> >>+	u64 block;
> >>+	handle_t *handle = NULL;
> >>+	struct ocfs2_alloc_context *data_ac = NULL;
> >>+	struct ocfs2_alloc_context *meta_ac = NULL;
> >>+	struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
> >>+	struct ocfs2_xattr_block *xb =
> >>+			(struct ocfs2_xattr_block *)root_bh->b_data;
> >>+	struct ocfs2_xattr_tree_root *xb_root = &xb->xb_attrs.xb_root;
> >>+	struct ocfs2_extent_list *root_el = &xb_root->xt_list;
> >>+	enum ocfs2_extent_tree_type type = OCFS2_XATTR_TREE_EXTENT;
> >>+
> >>+	mlog(0, "Add new xattr cluster for %llu, previous xattr hash = %u, "
> >>+	     "previous xattr blkno = %llu\n",
> >>+	     (unsigned long long)OCFS2_I(inode)->ip_blkno,
> >>+	     prev_cpos, (unsigned long long)prev_blkno);
> >>+
> >>+	ret = ocfs2_lock_allocators(inode, root_bh, root_el,
> >>+				    clusters_to_add, 0, &data_ac,
> >>+				    &meta_ac, type, NULL);
> >>+	if (ret) {
> >>+		mlog_errno(ret);
> >>+		goto leave;
> >>+	}
> >>+
> >>+	credits = ocfs2_calc_extend_credits(osb->sb, root_el, clusters_to_add);
> >>+	handle = ocfs2_start_trans(osb, credits);
> >>+	if (IS_ERR(handle)) {
> >>+		ret = PTR_ERR(handle);
> >>+		handle = NULL;
> >>+		mlog_errno(ret);
> >>+		goto leave;
> >>+	}
> >>+
> >>+	ret = ocfs2_journal_access(handle, inode, root_bh,
> >>+				   OCFS2_JOURNAL_ACCESS_WRITE);
> >>+	if (ret < 0) {
> >>+		mlog_errno(ret);
> >>+		goto leave;
> >>+	}
> >>+
> >>+	ret = __ocfs2_claim_clusters(osb, handle, data_ac, 1,
> >>+				     clusters_to_add, &bit_off, &num_bits);
> >>+	if (ret < 0) {
> >>+		if (ret != -ENOSPC)
> >>+			mlog_errno(ret);
> >>+		goto leave;
> >>+	}
> >>+
> >>+	BUG_ON(num_bits > clusters_to_add);
> >>+
> >>+	block = ocfs2_clusters_to_blocks(osb->sb, bit_off);
> >>+	mlog(0, "Allocating %u clusters at block %u for xattr in inode %llu\n",
> >>+	     num_bits, bit_off, (unsigned long long)OCFS2_I(inode)->ip_blkno);
> >>+
> >>+	if (prev_blkno + prev_clusters * bpc == block &&
> >>+	    (prev_clusters + num_bits) << osb->s_clustersize_bits <=
> >>+	     OCFS2_MAX_XATTR_TREE_LEAF_SIZE) {
> >>+		/*
> >>+		 * If this cluster is contiguous with the old one and
> >>+		 * adding this new cluster, we don't surpass the limit of
> >>+		 * OCFS2_MAX_XATTR_TREE_LEAF_SIZE, cool. We will let it be
> >>+		 * initialized and used like other buckets in the previous
> >>+		 * cluster.
> >>+		 * So add it as a contiguous one. The caller will handle
> >>+		 * its init process.
> >>+		 */
> >>+		v_start = prev_cpos + prev_clusters;
> >>+		*num_clusters = prev_clusters + num_bits;
> >>+		mlog(0, "Add contiguous %u clusters to previous extent rec.\n",
> >>+		     num_bits);
> >>+	} else {
> >>+		ret = ocfs2_adjust_xattr_cross_cluster(inode,
> >>+						       handle,
> >>+						       first_bh,
> >>+						       header_bh,
> >>+						       block,
> >>+						       prev_blkno,
> >>+						       prev_clusters,
> >>+						       &v_start,
> >>+						       extend);
> >>+		if (ret) {
> >>+			mlog_errno(ret);
> >>+			goto leave;
> >>+		}
> >>+	}
> >>+
> >>+	mlog(0, "Insert %u clusters at block %llu for xattr at %u\n",
> >>+	     num_bits, (unsigned long long)block, v_start);
> >>+	ret = ocfs2_xattr_tree_insert_extent(osb, handle, inode, root_bh,
> >>+					     v_start, block, num_bits,
> >>+					     0, meta_ac);
> >
> >I think we need to do something down in the extent insertion code to prevent
> >merging in the OCFS2_MAX_XATTR_TREE_LEAF_SIZE cases - what's preventing it
> >from appending that allocation to the end of an extent (any extent) in the
> >btree?
> We don't need to do that. You see, in ocfs2_adjust_xattr_cross_cluster I 
> have adjusted v_start (v_start now indicates the first name hash of the 
> xattrs in the new cluster), so they will not be contiguous. ;)
> 
> We only allow xattr hashes to be the same within one bucket (4K), so 
> with 64K as our leaf size, even if the cluster is contiguous, we will 
> not have contiguous name hashes. So it is safe to say that v_start can 
> never be old_v_start+1.

Hmm, ok, and there can never be anything in between v_start and old_v_start
to begin with, since v_start is determined from the values within the old
bucket. Clever  :)

One question though - what's your *guarantee* that v_start can never be
old_v_start + 1? What if we *only* have hashes with the values
old_v_start and old_v_start+1 in the bucket?
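
To make the worry concrete, here is a minimal sketch of the kind of check a
merge path might apply, assuming two records are treated as mergeable when
they abut both logically (cpos, which for the xattr tree is a name hash) and
physically. The struct and function names are hypothetical and are not taken
from the patch or from the ocfs2 extent code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified extent record; not the on-disk ocfs2 layout. */
struct xattr_extent {
	uint32_t cpos;     /* logical start; here, the first name hash */
	uint32_t clusters; /* length in clusters */
	uint64_t blkno;    /* physical start block */
};

/*
 * Return true if 'next' abuts the right edge of 'prev' both logically
 * and physically, i.e. a merge path using this rule would join them.
 */
static bool extents_would_merge(const struct xattr_extent *prev,
				const struct xattr_extent *next,
				uint32_t blocks_per_cluster)
{
	bool logically_adjacent =
		prev->cpos + prev->clusters == next->cpos;
	bool physically_adjacent =
		prev->blkno + (uint64_t)prev->clusters * blocks_per_cluster ==
		next->blkno;

	return logically_adjacent && physically_adjacent;
}
```

Under this model, a one-cluster extent starting at hash old_v_start followed
by a physically adjacent cluster whose first hash is old_v_start + 1 would
satisfy both conditions, which is exactly the case being asked about.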

Btw, it might really just have been easier for you to pass something to the
merge code which says "leave as separate extents this time"  :)
	--Mark

--
Mark Fasheh


