[Ocfs2-tools-devel] [PATCH 09/11] ocfs2-tools: add xattr support in fsck.ocfs2
Tao Ma
tao.ma at oracle.com
Tue Mar 3 17:14:16 PST 2009
Joel Becker wrote:
> On Fri, Feb 27, 2009 at 03:54:49PM +0800, Tiger Yang wrote:
>> +static uint16_t detect_xattr_count_in_bucket(struct ocfs2_xattr_header *xh,
>> + struct xattr_info *xi)
>> +{
>> + uint16_t count = 0;
>> + struct ocfs2_xattr_entry *entry = xh->xh_entries;
>> + struct ocfs2_xattr_entry *pre_xe = entry;
>> + uint16_t max_count = (xi->max_offset - HEADER_SIZE) / ENTRY_SIZE;
>> +
>> + if (xh->xh_count == 0 && xh->xh_free_start == OCFS2_XATTR_BUCKET_SIZE)
>> + return 0;
>> +
>> + while (!IS_LAST_ENTRY(entry)) {
>> + /*
>> + * If we could not get the reasonable count,
>> + * we will remove them by setting count to zero
>> + */
>> + if (count >= max_count)
>> + return 0;
>> + /* xattr entries in bucket had sorted by name_hash */
>> + if (entry->xe_name_hash < pre_xe->xe_name_hash)
>> + return count;
>> + count++;
>> + pre_xe = entry;
>> + entry++;
>> + }
>> +
>> + return count;
>> +}
>
> Boy, I really want this to use the pre_xe->xe_name_offset to
> verify that the entry hasn't walked past the free_start. There is no
> guarantee that the start of a name value won't be greater than the
> name_hash of the last valid entry. This would give a false extra entry.
> This is why I wish we had a consistent free_start in all types
> of xattr structures, but it's too late for that. I realize you didn't
> use the xe_name_offset fields here because you haven't validated them
> yet. So on to that part.
Just go through the patch. Here tiger don't use free_start and
name_value_len because he check them after this function. So even with a
consistent free_start and name_value_len, He can't use it here.
>> +static errcode_t check_xattr_entry(o2fsck_state *ost,
>> + struct ocfs2_dinode *di,
>> + struct ocfs2_xattr_header *xh,
>> + int *changed,
>> + struct xattr_info *xi)
>> +{
>> + int i;
>> +
>> + for (i = 0 ; i < xh->xh_count; i++) {
>> + struct ocfs2_xattr_entry *xe = &xh->xh_entries[i];
>> + uint32_t hash;
>> +
>> + /* check and fix name_offset */
>> + if (xe->xe_name_offset >= xi->max_offset) {
>> + if (prompt(ost, PY, PR_XATTR_OFFSET_INVALID,
>> + "Extended attribute entry in %s #%"PRIu64
>> + " refers to an invalid name offset %u,"
>> + " clear this entry?",
>> + xattr_object[xi->location], xi->blkno,
>> + xe->xe_name_offset)) {
>> + /*
>> + * remove the entry and left the name/value,
>> + * because we don't know the correct offset
>> + * of them.
>> + */
>> + memmove((void *)xe, (void *)(xe + 1),
>> + (xh->xh_count - i - 1) * ENTRY_SIZE);
>> + memset(&xh->xh_entries[xh->xh_count - 1], 0,
>> + ENTRY_SIZE);
>> + xh->xh_count -= 1;
>> + i -= 1;
>> + *changed = 1;
>> + continue;
>> + } else
>> + return -1;
>> + }
>
> When you find invalid entries, you leave them in place. This
> means that name_value_len and free_start get out of sync, because
> name_value_len only covers the good values, but free_start covers all
> values. I think the kernel handles this ok because it never compares
> free_start and name_value_len, but what happens if someone doesn't
> realize this and changes it?
> I suppose we all just watch out for that in future patches.
I just saw that tiger adjust these 2 values in check_xattr after
check_xattr_value.
>> +static errcode_t ocfs2_check_xattr_buckets(o2fsck_state *ost,
>> + struct ocfs2_dinode *di,
>> + uint64_t blkno,
>> + uint32_t clusters)
>> +{
>> + int i;
>> + errcode_t ret = 0;
>> + char *bucket = NULL;
>> + char *bucket_buf = NULL;
>> + struct ocfs2_xattr_header *xh;
>> + int blk_per_bucket = ocfs2_blocks_per_xattr_bucket(ost->ost_fs);
>> + uint32_t bpc = ocfs2_xattr_buckets_per_cluster(ost->ost_fs);
>> + uint32_t max_buckets = clusters * bpc;
>> + uint32_t max_blocks = max_buckets * blk_per_bucket;
>> + uint32_t num_buckets = 0;
>> + uint64_t blk = 0;
>> +
>> + /* malloc space for all buckets */
>> + ret = ocfs2_malloc_blocks(ost->ost_fs->fs_io, max_blocks, &bucket);
>> + if (ret) {
>> + com_err(whoami, ret, "while allocating room to read"
>> + " extended attributes bucket");
>> + goto out;
>> + }
>> +
>> + /* read all buckets for detect (some of them may not be used) */
>> + bucket_buf = bucket;
>> + blk = blkno;
>> + for (i = 0; i < max_buckets; i++) {
>> + ret = ocfs2_read_xattr_bucket(ost->ost_fs, blk, bucket_buf);
>> + if (ret) {
>> + com_err(whoami, ret, "while reading bucket of"
>> + " extended attributes ");
>> + goto out;
>> + }
>> + blk += blk_per_bucket;
>> + bucket_buf += OCFS2_XATTR_BUCKET_SIZE;
>> + }
>
> This code reads all the possible buckets in a cluster, but the
> actual number of used buckets might be less. If that's the case,
> ocfs2_read_xattr_bucket() is going to return an error for the invalid
> bucket (at least in the metaecc case). So you can't just exit out.
> Instead, why not just change max_buckets = i and continue on? You'll
> just be validating the buckets you could read, which should, in fact,
> match the buckets you expected.
yes, you are right.
Regards,
Tao
More information about the Ocfs2-tools-devel
mailing list