[Ocfs2-tools-devel] [PATCH 1/1] Add check to compare journal size and cluster size

Srinivas Eeda srinivas.eeda at oracle.com
Wed Apr 29 14:49:51 PDT 2015


Hi Ashish,

Can you please resend this patch with a shorter description, for example:

If the user provides a journal size that is too small for the cluster size,
mkfs should fail. Currently it succeeds but produces an unusable filesystem.

1. mkfs.ocfs2 -J size=4M -C 1M -N 2 -L xvdh --cluster-name=ocfs2big
  --cluster-stack=o2cb --force --global-heartbeat /dev/xvdh
2. mount -t ocfs2 /dev/xvdh /mnt

3. mkdir aa
    mkdir: cannot create directory `aa': No space left on device

This happens because of the following:

The kernel limits the maximum transaction buffer to 1/4 of the journal size,
and this is halved again for transaction reservation support. Some operations,
such as mkdir, can require more than a cluster's worth of journal credits.
Such operations will fail if the journal size is not greater than
cluster size * 8.

This patch adds this check for user provided values and modifies the default
calculation to account for the same. It will fail mkfs if the conditions are
not met instead of creating a non-functional filesystem.
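
The arithmetic behind the "cluster size * 8" rule can be sketched as below.
This is only an illustration of the description above, not kernel code; the
function names are made up for this example, and the divide-by-4 and
divide-by-2 factors are taken from the description rather than from the
jbd2 sources.

```c
#include <stdint.h>

/* Hypothetical helper: space one transaction may use, per the description:
 * 1/4 of the journal, halved again for transaction reservation support. */
static uint64_t max_transaction_bytes(uint64_t journal_size)
{
	return journal_size / 4 / 2;
}

/* Hypothetical helper: nonzero if a transaction can hold more than one
 * cluster's worth of journal credits, i.e. journal_size > cluster_size * 8. */
static int journal_fits_cluster(uint64_t journal_size, uint64_t cluster_size)
{
	return max_transaction_bytes(journal_size) > cluster_size;
}
```

With the values from the reproducer (-J size=4M -C 1M), the transaction limit
is 4M / 8 = 512K, which is less than one 1M cluster, so mkdir cannot get the
credits it needs.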

Thanks,
--Srini

On 04/29/2015 02:37 PM, Ashish Samant wrote:
> Operations such as mkdir can fail if the journal size and cluster size are not
> provided/calculated correctly.
>
> For example:
> 1.
>   mkfs.ocfs2 -J size=4M -C 1M -N 2 -L xvdh --cluster-name=ocfs2big
>   --cluster-stack=o2cb --force --global-heartbeat /dev/xvdh
> mkfs.ocfs2 1.8.0
> Cluster stack: o2cb
> Cluster name: ocfs2big
> Stack Flags: 0x1
> NOTE: Feature extended slot map may be enabled
> Label: xvdh
> Features: sparse extended-slotmap backup-super unwritten inline-data
> strict-journal-super xattr indexed-dirs refcount discontig-bg
> Block size: 4096 (12 bits)
> Cluster size: 1048576 (20 bits)
> Volume size: 3221225472000 (3072000 clusters) (786432000 blocks)
> Cluster groups: 96 (tail covers 7680 clusters, rest cover 32256 clusters)
> Extent allocator size: 1610612736 (384 groups)
> Journal size: 4194304
> Node slots: 2
> Creating bitmaps: done
> Initializing superblock: done
> Writing system files: done
> Writing superblock: done
> Writing backup superblock: 6 block(s)
> Formatting Journals: done
> Growing extent allocator: done
> Formatting slot map: done
> Formatting quota files: done
> Writing lost+found: done
> mkfs.ocfs2 successful
>
> 2.
> mount -t ocfs2 /dev/xvdh /mnt
>
> The mount succeeds.
>
> 3.
> mkdir aa
> mkdir: cannot create directory `aa': No space left on device
>
> mkdir fails in spite of the volume size being 3T.
>
> This happens because of the following:
>
> The kernel limits the maximum transaction buffer to 1/4 of the journal size,
> and this is halved again for transaction reservation support. Some operations,
> such as mkdir, can require more than a cluster's worth of journal credits.
> Such operations will fail if the journal size is not greater than
> cluster size * 8.
>
> This patch adds this check for user provided values and modifies the default
> calculation to account for the same. It will fail mkfs if the conditions are
> not met instead of creating a non-functional filesystem.
>
> Signed-off-by: Ashish Samant <ashish.samant at oracle.com>
> Reviewed-by: Srinivas Eeda <srinivas.eeda at oracle.com>
> ---
>   include/ocfs2-kernel/ocfs2_fs.h |  2 ++
>   mkfs.ocfs2/mkfs.c               | 51 +++++++++++++++++++++++++++++++++++++----
>   2 files changed, 49 insertions(+), 4 deletions(-)
>
> diff --git a/include/ocfs2-kernel/ocfs2_fs.h b/include/ocfs2-kernel/ocfs2_fs.h
> index 79e4f2f..e08d87a 100644
> --- a/include/ocfs2-kernel/ocfs2_fs.h
> +++ b/include/ocfs2-kernel/ocfs2_fs.h
> @@ -309,6 +309,8 @@
>   
>   /* Journal limits (in bytes) */
>   #define OCFS2_MIN_JOURNAL_SIZE		(4 * 1024 * 1024)
> +/* Minimum Journal size shift with respect to cluster size */
> +#define OCFS2_MIN_CLUSTER_TO_JOURNAL_SIZE_SHIFT		3
>   
>   /*
>    * Default local alloc size (in megabytes)
> diff --git a/mkfs.ocfs2/mkfs.c b/mkfs.ocfs2/mkfs.c
> index 7abea73..fb4383a 100644
> --- a/mkfs.ocfs2/mkfs.c
> +++ b/mkfs.ocfs2/mkfs.c
> @@ -1415,10 +1415,30 @@ static unsigned int journal_size_vmstore(State *s)
>   	return 32768;
>   }
>   
> +static int journal_size_valid(unsigned int j_blocks, State *s)
> +{
> +	return (j_blocks * s->initial_slots + 1024) <=
> +		s->volume_size_in_blocks;
> +}
> +
> +/* For operations such as mkdir that can require more than a cluster worth
> + * of journal credits, journal size should be greater than cluster size * 8.
> + * The kernel allows the maximum transaction buffer to be 1/4 of the
> + * journal size and this is further divided by 2 for transaction
> + * reservation support. We calculate the minimum journal size here
> + * accordingly and take the ceiling w.r.t. the cluster size. */
> +static unsigned int journal_min_size(uint32_t cluster_size)
> +{
> +	return (cluster_size << OCFS2_MIN_CLUSTER_TO_JOURNAL_SIZE_SHIFT)
> +		+ cluster_size;
> +}
> +
>   /* stolen from e2fsprogs, modified to fit ocfs2 patterns */
>   static uint64_t figure_journal_size(uint64_t size, State *s)
>   {
>   	unsigned int j_blocks;
> +	uint64_t ret;
> +	unsigned int min_journal_size;
>   
>   	if (s->hb_dev)
>   		return 0;
> @@ -1428,19 +1448,27 @@ static uint64_t figure_journal_size(uint64_t size, State *s)
>   		exit(1);
>   	}
>   
> +	min_journal_size = journal_min_size(s->cluster_size);
>   	if (size > 0) {
>   		j_blocks = size >> s->blocksize_bits;
>   		/* mke2fs knows about free blocks at this point, but
>   		 * we don't so lets just take a wild guess as to what
>   		 * the fs overhead we're looking at will be. */
> -		if ((j_blocks * s->initial_slots + 1024) >
> -		    s->volume_size_in_blocks) {
> +		if (!journal_size_valid(j_blocks, s)) {
>   			fprintf(stderr,
>   				"Journal size too big for filesystem.\n");
>   			exit(1);
>   		}
>   
> -		return align_bytes_to_clusters_ceil(s, size);
> +		ret = align_bytes_to_clusters_ceil(s, size);
> +		/* It is better to fail mkfs than to create a non-functional
> +		 * filesystem.*/
> +		if (ret < min_journal_size) {
> +			fprintf(stderr,
> +				"Journal size too small for filesystem.\n");
> +			exit(1);
> +		}
> +		return ret;
>   	}
>   
>   	switch (s->fs_type) {
> @@ -1458,7 +1486,22 @@ static uint64_t figure_journal_size(uint64_t size, State *s)
>   		break;
>   	}
>   
> -	return align_bytes_to_clusters_ceil(s, j_blocks << s->blocksize_bits);
> +	ret = align_bytes_to_clusters_ceil(s, j_blocks << s->blocksize_bits);
> +	/* If the default journal size is less than the minimum required
> +	 * size, set the default to the minimum size. Then fail if
> +	 * the journal size is not valid*/
> +	if (ret < min_journal_size) {
> +		ret = min_journal_size;
> +		j_blocks = ret >> s->blocksize_bits;
> +		if (!journal_size_valid(j_blocks, s)) {
> +			fprintf(stderr,
> +				"Volume size too small for required "
> +				"configuration.\nIncrease volume size or "
> +				"reduce cluster size\n");
> +			exit(1);
> +		}
> +	}
> +	return ret;
>   }
>   
>   static uint32_t cluster_size_default(State *s)
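
For reference, the check the patch adds for a user-provided size can be
sketched outside of mkfs.c roughly as follows. This is a simplified
standalone version, not the patch itself: round_up_to_cluster() and
journal_size_ok() are hypothetical names (the patch uses
align_bytes_to_clusters_ceil() and exits on failure), and 64-bit arithmetic
is used here as an assumption to keep the sketch free of overflow concerns.

```c
#include <stdint.h>

/* Mirrors OCFS2_MIN_CLUSTER_TO_JOURNAL_SIZE_SHIFT from the patch. */
#define MIN_CLUSTER_TO_JOURNAL_SIZE_SHIFT 3

/* Hypothetical helper: round bytes up to a whole number of clusters. */
static uint64_t round_up_to_cluster(uint64_t bytes, uint32_t cluster_size)
{
	return (bytes + cluster_size - 1) / cluster_size * cluster_size;
}

/* Same formula as journal_min_size() in the patch:
 * cluster_size * 8 plus one more cluster. */
static uint64_t min_journal_bytes(uint32_t cluster_size)
{
	return ((uint64_t)cluster_size << MIN_CLUSTER_TO_JOURNAL_SIZE_SHIFT)
		+ cluster_size;
}

/* Hypothetical helper: nonzero if the requested journal size, once
 * cluster-aligned, meets the minimum for this cluster size. */
static int journal_size_ok(uint64_t requested, uint32_t cluster_size)
{
	return round_up_to_cluster(requested, cluster_size) >=
		min_journal_bytes(cluster_size);
}
```

Under this sketch, the reproducer's 4M journal with a 1M cluster is rejected
(the minimum is 9M), while a 4M journal with a 4K cluster is accepted.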



