[Ocfs2-tools-devel] [PATCH]: Add removing slots support for ocfs2-tools

Sunil Mushran Sunil.Mushran at oracle.com
Thu Jun 7 15:38:37 PDT 2007


Tao,

1. ocfs2_head_1.patch
looks good


2. debugfs_1.patch
looks good


3. remove_slot_1.patch

a. In remove_slot_check:
+       /* we don't allow remove_slot to coexist with other tunefs
+        * options to keep things simple.
+        */
Can we move this check to get_options()? Similar checks are already there.

I have a lot more comments for remove_slots(), but I am still going
through the code.

4. tunefs_1.patch

a. get_options has code to not allow < 2 slots. Make that 1.

b. Giving same number of nodes as configured should not be an error.
Instead ignore the requested change.

c. In main:
+       /* Set remove slots incompat flag on superblock */
I am sure we can reduce the code size by "merging" the code below with
the resize_inprog handling.

d. Fail if someone runs tunefs after a failed remove_slot without 
running fsck.ocfs2.
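
Points a, b and d could be folded into one decision, roughly as in the
sketch below. This is only an illustration under assumptions: the enum,
the decide_slot_action() name and the order of the checks are mine, and
the flag value is copied from the ocfs2_fs.h hunk in this series.

```c
#include <assert.h>
#include <stdint.h>

/* Value taken from the ocfs2_fs.h hunk in this series; the macro
 * name is shortened for the example. */
#define REMOVE_SLOT_INPROG	0x0020u

/* Hypothetical outcome of validating a requested slot count. */
enum slot_action {
	SLOT_INVALID,		/* fewer than 1 slot requested */
	SLOT_NEEDS_FSCK,	/* an earlier removal was interrupted */
	SLOT_NOOP,		/* same count as configured: ignore */
	SLOT_ADD,
	SLOT_REMOVE
};

static enum slot_action decide_slot_action(uint32_t incompat_flags,
					   uint16_t current,
					   uint16_t requested)
{
	/* A volume always has at least one slot. */
	assert(current >= 1);

	/* Refuse any change until fsck.ocfs2 has cleared the flag. */
	if (incompat_flags & REMOVE_SLOT_INPROG)
		return SLOT_NEEDS_FSCK;

	if (requested < 1)
		return SLOT_INVALID;

	if (requested == current)
		return SLOT_NOOP;

	return requested > current ? SLOT_ADD : SLOT_REMOVE;
}
```

Checking the in-progress flag first is a choice made for the example;
the point is only that an unchanged count is ignored rather than
treated as an error, and that 1 is the lower bound.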


5. group_check.patch
6. orphan_check.patch
7. journal_check.patch
These three will be reviewed after I finish with remove_slots().

++ Begin nag
Add comments atop each patch. ;)
++ End nag

Overall, great work. I especially like the fact that the patch is
"functionally" complete.

Sunil


tao.ma wrote:
> tunefs.ocfs2 has been able to increase the number of slots for a long
> time. Now support for removing slots is added as well. There are also
> some changes in debugfs.ocfs2 and fsck.ocfs2 to accommodate this feature.
>
> The patch series is managed with quilt. A general description
> follows, and any comments are welcome.
> 1. ocfs2_head_1.patch adds some macros in the header files.
> 2. debugfs_1.patch adds some output for "stats".
> 3. remove_slot_1.patch adds the slot removal mechanism to tunefs.ocfs2.
> 4. tunefs_1.patch incorporates slot removal into tunefs.ocfs2.
> 5. group_check.patch adds a group check mechanism to fsck.ocfs2 in case
> of a failed slot removal.
> 6. orphan_check.patch adds an orphan dir content check to fsck.ocfs2 in
> case of a failed slot removal.
> 7. journal_check.patch adds journal content check in fsck.ocfs2.
>
> I have also written some test scripts and programs to test the
> functionality and recovery. Everything works so far. The tests will
> also be sent out soon, but I think this series should be sent out
> sooner rather than later for review.
> ------------------------------------------------------------------------
>
> Index: new.ocfs2-tools/debugfs.ocfs2/utils.c
> ===================================================================
> --- new.ocfs2-tools.orig/debugfs.ocfs2/utils.c	2007-06-06 11:13:36.000000000 -0400
> +++ new.ocfs2-tools/debugfs.ocfs2/utils.c	2007-06-06 11:20:06.000000000 -0400
> @@ -42,10 +42,14 @@ void get_incompat_flag(uint32_t flag, GS
>  	if (flag & OCFS2_FEATURE_INCOMPAT_SPARSE_ALLOC)
>  		g_string_append(str, "Sparse ");
>  
> +	if (flag & OCFS2_FEATURE_INCOMPAT_REMOVE_SLOT_INPROG)
> +		g_string_append(str, "AbortedSlotRemove ");
> +
>  	if (flag & ~(OCFS2_FEATURE_INCOMPAT_HEARTBEAT_DEV |
>  		     OCFS2_FEATURE_INCOMPAT_RESIZE_INPROG |
>  		     OCFS2_FEATURE_INCOMPAT_LOCAL_MOUNT |
> -		     OCFS2_FEATURE_INCOMPAT_SPARSE_ALLOC))
> +		     OCFS2_FEATURE_INCOMPAT_SPARSE_ALLOC |
> +		     OCFS2_FEATURE_INCOMPAT_REMOVE_SLOT_INPROG))
>  		g_string_append(str, "Unknown ");
>  
>  	if (!str->len)
>   
> ------------------------------------------------------------------------
>
> Index: new.ocfs2-tools/fsck.ocfs2/fsck.ocfs2.checks.8.in
> ===================================================================
> --- new.ocfs2-tools.orig/fsck.ocfs2/fsck.ocfs2.checks.8.in	2007-06-06 11:13:33.000000000 -0400
> +++ new.ocfs2-tools/fsck.ocfs2/fsck.ocfs2.checks.8.in	2007-06-06 11:20:06.000000000 -0400
> @@ -150,6 +150,13 @@ inode that doesn't match the descriptor'
>  Answering yes updates the group descriptor's parent pointer to match the inode
>  it resides in.
>  
> +.SS "GROUP_DUPLICATE"
> +Group descriptors contain a pointer to the allocator inode which contains
> +the chain they belong to.  A group descriptor was found in two allocator
> +inodes so it may be duplicated.
> +
> +Answering yes removes the group descriptor from the current allocator inode.
> +
>  .SS "GROUP_BLKNO"
>  Group descriptors have a field which records their block location on disk.  A
>  group descriptor was found at a given location but is recorded as being
> @@ -657,6 +664,21 @@ Answering yes will refresh the superbloc
>  only disable the copying of the backup superblock and will not effect the
>  remaining \fIfsck.ocfs2\fR processing.
>  
> +.SS "ORPHAN_DIR_MISSING"
> +While files are being deleted they are placed in an internal directory, named
> +orphan directory. If an orphan directory doesn't exist, an OCFS2 volume can't
> +be mounted successfully. Fsck has found the orphan directory is missing and
> +would like to create it for future use.
> +
> +Answering yes creates the orphan directory in the system directory.
> +
> +.SS "JOURNAL_FILE_EMPTY"
> +OCFS2 uses JBD for journalling and some journal files exist in the
> +system directory. Fsck has found some journal file is empty and would
> +like to extend it for future use.
> +
> +Answering yes extends the journal file in the system directory.
> +
>  .SH "SEE ALSO"
>  .BR fsck.ocfs2(8)
>  
> Index: new.ocfs2-tools/fsck.ocfs2/pass0.c
> ===================================================================
> --- new.ocfs2-tools.orig/fsck.ocfs2/pass0.c	2007-06-06 11:13:33.000000000 -0400
> +++ new.ocfs2-tools/fsck.ocfs2/pass0.c	2007-06-06 14:05:32.000000000 -0400
> @@ -91,11 +91,76 @@ static void find_max_free_bits(struct oc
>  	}
>  }
>  
> +/* check whether the group really exists in the specified chain of
> + * the specified allocator file.
> + */
> +static errcode_t check_group_parent(ocfs2_filesys *fs, uint64_t group,
> +				    uint64_t ino, uint16_t chain,int *exist)
> +{
> +	errcode_t ret;
> +	uint64_t gd_blkno;
> +	char *buf = NULL, *gd_buf = NULL;
> +	struct ocfs2_dinode *di = NULL;
> +	struct ocfs2_group_desc *gd = NULL;
> +	struct ocfs2_chain_rec *cr = NULL;
> +
> +	ret = ocfs2_malloc_block(fs->fs_io, &buf);
> +	if (ret)
> +		goto out;
> +
> +	ret = ocfs2_read_inode(fs, ino, buf);
> +	if (ret) {
> +		goto out;
> +	}
> +
> +	di = (struct ocfs2_dinode *)buf;
> +
> +	if (!(di->i_flags & OCFS2_VALID_FL) ||
> +	    !(di->i_flags & OCFS2_BITMAP_FL) ||
> +	    !(di->i_flags & OCFS2_CHAIN_FL))
> +		goto out;
> +
> +	if (di->id1.bitmap1.i_total == 0)
> +		goto out;
> +
> +	if (di->id2.i_chain.cl_next_free_rec <= chain)
> +		goto out;
> +
> +	cr = &di->id2.i_chain.cl_recs[chain];
> +
> +	ret = ocfs2_malloc_block(fs->fs_io, &gd_buf);
> +	if (ret)
> +		goto out;
> +
> +	gd_blkno = cr->c_blkno;
> +	while (gd_blkno) {
> +		if (gd_blkno ==  group) {
> +			*exist = 1;
> +			break;
> +		}
> +
> +		ret = ocfs2_read_group_desc(fs, gd_blkno, gd_buf);
> +		if (ret)
> +			goto out;
> +		gd = (struct ocfs2_group_desc *)gd_buf;
> +
> +		gd_blkno = gd->bg_next_group;
> +	}
> +
> +out:
> +	if (gd_buf)
> +		ocfs2_free(&gd_buf);
> +	if (buf)
> +		ocfs2_free(&buf);
> +	return ret;
> +}
> +
>  static errcode_t repair_group_desc(o2fsck_state *ost,
>  				   struct ocfs2_dinode *di,
>  				   struct chain_state *cs,
>  				   struct ocfs2_group_desc *bg,
> -				   uint64_t blkno)
> +				   uint64_t blkno,
> +				   int *clear_ref)
>  {
>  	errcode_t ret = 0;
>  	int changed = 0;
> @@ -121,14 +186,35 @@ static errcode_t repair_group_desc(o2fsc
>  	/* XXX maybe for advanced pain we could check to see if these 
>  	 * kinds of descs have valid generations for the inodes they
>  	 * reference */
> -	if ((bg->bg_parent_dinode != di->i_blkno) &&
> -	    prompt(ost, PY, PR_GROUP_PARENT,
> +	if ((bg->bg_parent_dinode != di->i_blkno)) {
> +		int exist = 0;
> +		ret = check_group_parent(ost->ost_fs, bg->bg_blkno,
> +					 bg->bg_parent_dinode,
> +					 bg->bg_chain, &exist);
> +
> +		/* If we find that the group really exists in the specified
> +		 * chain of the specified alloc inode, then this may be a
> +		 * duplicated group and we may need to remove it from current
> +		 * inode.
> +		 */
> +		if (!ret && exist && prompt(ost, PY, PR_GROUP_DUPLICATE,
> +		   "Group descriptor at block %"PRIu64" is "
> +		   "referenced by inode %"PRIu64" but thinks its parent inode "
> +		   "is %"PRIu64" and we can also see it in that inode."
> +		    " So it may be duplicated.  Remove it from this inode?",
> +		    blkno, di->i_blkno, bg->bg_parent_dinode)) {
> +			*clear_ref = 1;
> +			goto out;
> +		}
> +
> +		if (prompt(ost, PY, PR_GROUP_PARENT,
>  		   "Group descriptor at block %"PRIu64" is "
>  		   "referenced by inode %"PRIu64" but thinks its parent inode "
>  		   "is %"PRIu64".  Fix the descriptor's parent inode?", blkno,
>  		   di->i_blkno, bg->bg_parent_dinode)) {
> -		bg->bg_parent_dinode = di->i_blkno;
> -		changed = 1;
> +			bg->bg_parent_dinode = di->i_blkno;
> +			changed = 1;
> +		}
>  	}
>  
>  	if ((bg->bg_blkno != blkno) &&
> @@ -179,7 +265,7 @@ static errcode_t repair_group_desc(o2fsc
>  
>  	cs->cs_total_bits += bg->bg_bits;
>  	cs->cs_free_bits += bg->bg_free_bits_count;
> -
> +out:
>  	return ret;
>  }
>  
> @@ -474,10 +560,18 @@ static errcode_t check_chain(o2fsck_stat
>  			break;
>  		}
>  
> -		ret = repair_group_desc(ost, di, cs, bg2, blkno);
> +		ret = repair_group_desc(ost, di, cs, bg2, blkno, &clear_ref);
>  		if (ret)
>  			goto out;
>  
> +		/* If we find a duplicate chain, we need to clear it from the
> +		 * current chain.
> +		 *
> +		 * Please note that all the groups below this group will also
> +		 * be removed from this chain.
> +		 */
> +		if (clear_ref)
> +			break;
>  
>  		/* the loop will now start by reading bg1->next_group */
>  		memcpy(buf1, buf2, ost->ost_fs->fs_blocksize);
>   
> ------------------------------------------------------------------------
>
> Index: new.ocfs2-tools/fsck.ocfs2/fsck.c
> ===================================================================
> --- new.ocfs2-tools.orig/fsck.ocfs2/fsck.c	2007-06-06 11:13:33.000000000 -0400
> +++ new.ocfs2-tools/fsck.ocfs2/fsck.c	2007-06-06 14:08:06.000000000 -0400
> @@ -174,6 +174,66 @@ static errcode_t check_superblock(o2fsck
>  	return ret;
>  }
>  
> +/* When we remove slot in tunefs.ocfs2, there may be some panic and
> + * we may empty some journal files, so we have to check whether the
> + * journal file is empty and extend it.
> + */
> +static errcode_t check_journals(o2fsck_state *ost)
> +{
> +	errcode_t ret;
> +	uint64_t blkno;
> +	uint32_t num_clusters = 0;
> +	char *buf = NULL;
> +	struct ocfs2_dinode *di = NULL;
> +	ocfs2_filesys *fs = ost->ost_fs;
> +	char fname[OCFS2_MAX_FILENAME_LEN];
> +	uint16_t i, max_slots = OCFS2_RAW_SB(fs->fs_super)->s_max_slots;
> +
> +	ret = ocfs2_malloc_block(fs->fs_io, &buf);
> +	if (ret)
> +		goto out;
> +
> +	for (i = 0; i < max_slots; i++) {
> +		ret = ocfs2_lookup_system_inode(fs, JOURNAL_SYSTEM_INODE, i,
> +						&blkno);
> +		if (ret)
> +			goto out;
> +
> +		ret = ocfs2_read_inode(fs, blkno, buf);
> +		if (ret)
> +			goto out;
> +
> +		di = (struct ocfs2_dinode *)buf;
> +
> +		if (di->i_clusters > 0) {
> +			num_clusters = di->i_clusters;
> +			continue;
> +		}
> +
> +		if (num_clusters == 0) {
> +			/* none of the journal has contents, severe errors. */
> +			ret = OCFS2_ET_JOURNAL_TOO_SMALL;
> +			goto out;
> +		}
> +
> +		sprintf(fname,
> +			ocfs2_system_inodes[JOURNAL_SYSTEM_INODE].si_name, i);
> +		if (!prompt(ost, PY, PR_JOURNAL_FILE_EMPTY,
> +			    "journal file %s is empty, extend it"
> +			    " to %u clusters?", fname, num_clusters))
> +			continue;
> +
> +		ret = ocfs2_make_journal(fs, blkno, num_clusters);
> +		if (ret)
> +			goto out;
> +	}
> +
> +out:
> +	if (buf)
> +		ocfs2_free(&buf);
> +	return ret;
> +}
> +
>  static errcode_t write_out_superblock(o2fsck_state *ost)
>  {
>  	struct ocfs2_dinode *di = ost->ost_fs->fs_super;
> @@ -182,6 +242,10 @@ static errcode_t write_out_superblock(o2
>  	if (sb->s_feature_incompat & OCFS2_FEATURE_INCOMPAT_RESIZE_INPROG)
>  		sb->s_feature_incompat &= ~OCFS2_FEATURE_INCOMPAT_RESIZE_INPROG;
>  
> +	if (sb->s_feature_incompat & OCFS2_FEATURE_INCOMPAT_REMOVE_SLOT_INPROG)
> +		sb->s_feature_incompat &=
> +				 ~OCFS2_FEATURE_INCOMPAT_REMOVE_SLOT_INPROG;
> +
>  	if (ost->ost_num_clusters)
>  		di->i_clusters = ost->ost_num_clusters;
>  
> @@ -262,6 +326,9 @@ static int fs_is_clean(o2fsck_state *ost
>  	else if ((OCFS2_RAW_SB(ost->ost_fs->fs_super)->s_feature_incompat &
>  		  OCFS2_FEATURE_INCOMPAT_RESIZE_INPROG))
>  		strcpy(reason, "incomplete volume resize detected");
> +	else if ((OCFS2_RAW_SB(ost->ost_fs->fs_super)->s_feature_incompat &
> +		  OCFS2_FEATURE_INCOMPAT_REMOVE_SLOT_INPROG))
> +		strcpy(reason, "incomplete slot remove detected");
>  	else if (sb->s_state & OCFS2_ERROR_FS)
>  		strcpy(reason, "contains a file system with errors");
>  	else if (sb->s_max_mnt_count > 0 &&
> @@ -655,6 +722,15 @@ int main(int argc, char **argv)
>  	printf("  max slots:          %u\n\n", 
>  	       OCFS2_RAW_SB(ost->ost_fs->fs_super)->s_max_slots);
>  
> +	if (open_flags & OCFS2_FLAG_RW) {
> +		ret = check_journals(ost);
> +		if (ret) {
> +			printf("fsck saw unrecoverable errors in the journal "
> +				"files and will not continue.\n");
> +			goto unlock;
> +		}
> +	}
> +
>  	ret = maybe_replay_journals(ost, filename, open_flags, blkno, blksize);
>  	if (ret) {
>  		printf("fsck encountered unrecoverable errors while "
>   
> ------------------------------------------------------------------------
>
> Index: new.ocfs2-tools/libocfs2/include/ocfs2.h
> ===================================================================
> --- new.ocfs2-tools.orig/libocfs2/include/ocfs2.h	2007-06-06 11:13:33.000000000 -0400
> +++ new.ocfs2-tools/libocfs2/include/ocfs2.h	2007-06-06 11:20:06.000000000 -0400
> @@ -76,7 +76,8 @@
>  #define OCFS2_LIB_FEATURE_INCOMPAT_SUPP		(OCFS2_FEATURE_INCOMPAT_SUPP | \
>  						 OCFS2_FEATURE_INCOMPAT_HEARTBEAT_DEV | \
>  						 OCFS2_FEATURE_INCOMPAT_RESIZE_INPROG | \
> -						 OCFS2_FEATURE_INCOMPAT_LOCAL_MOUNT)
> +						 OCFS2_FEATURE_INCOMPAT_LOCAL_MOUNT   | \
> +						 OCFS2_FEATURE_INCOMPAT_REMOVE_SLOT_INPROG)
>  
>  #define OCFS2_LIB_FEATURE_RO_COMPAT_SUPP	OCFS2_FEATURE_RO_COMPAT_SUPP
>  
> Index: new.ocfs2-tools/libocfs2/include/ocfs2_fs.h
> ===================================================================
> --- new.ocfs2-tools.orig/libocfs2/include/ocfs2_fs.h	2007-06-06 11:13:33.000000000 -0400
> +++ new.ocfs2-tools/libocfs2/include/ocfs2_fs.h	2007-06-06 11:20:06.000000000 -0400
> @@ -109,6 +109,12 @@
>  /* Support for sparse allocation in b-trees */
>  #define OCFS2_FEATURE_INCOMPAT_SPARSE_ALLOC	0x0010
>  
> +/* tunefs sets this incompat flag before starting slot remove and clears it
> + * at the end. This flag protects users from inadvertently mounting the fs
> + * after an aborted run without fsck-ing.
> + */
> +#define OCFS2_FEATURE_INCOMPAT_REMOVE_SLOT_INPROG	0x0020
> +
>  /*
>   * backup superblock flag is used to indicate that this volume
>   * has backup superblocks.
>   
> ------------------------------------------------------------------------
>
> Index: new.ocfs2-tools/fsck.ocfs2/pass4.c
> ===================================================================
> --- new.ocfs2-tools.orig/fsck.ocfs2/pass4.c	2007-06-06 11:13:33.000000000 -0400
> +++ new.ocfs2-tools/fsck.ocfs2/pass4.c	2007-06-06 11:20:06.000000000 -0400
> @@ -152,6 +152,46 @@ out:
>  	return ret_flags;
>  }
>  
> +static errcode_t create_orphan_dir(o2fsck_state *ost, char *fname)
> +{
> +	errcode_t ret;
> +	uint64_t blkno;
> +	ocfs2_filesys *fs = ost->ost_fs;
> +
> +	/* create inode for system file */
> +	ret = ocfs2_new_system_inode(fs, &blkno,
> +			ocfs2_system_inodes[ORPHAN_DIR_SYSTEM_INODE].si_mode,
> +			ocfs2_system_inodes[ORPHAN_DIR_SYSTEM_INODE].si_iflags);
> +	if (ret)
> +		goto bail;
> +
> +	ret = ocfs2_expand_dir(fs, blkno, fs->fs_sysdir_blkno);
> +	if (ret)
> +		goto bail;
> +
> +	/* Add the inode to the system dir */
> +	ret = ocfs2_link(fs, fs->fs_sysdir_blkno, fname, blkno,
> +			 OCFS2_FT_DIR);
> +	if (ret == OCFS2_ET_DIR_NO_SPACE) {
> +		ret = ocfs2_expand_dir(fs, fs->fs_sysdir_blkno,
> +				       fs->fs_sysdir_blkno);
> +		if (!ret)
> +			ret = ocfs2_link(fs, fs->fs_sysdir_blkno,
> +					 fname, blkno, OCFS2_FT_DIR);
> +	}
> +
> +	if (ret)
> +		goto bail;
> +
> +	/* we have created an orphan dir under system dir and updated the disk,
> +	 * so we have to update the refs in ost accordingly.
> +	 */
> +	o2fsck_icount_delta(ost->ost_icount_refs, fs->fs_sysdir_blkno, 1);
> +	o2fsck_icount_delta(ost->ost_icount_in_inodes, fs->fs_sysdir_blkno, 1);
> +bail:
> +	return ret;
> +}
> +
>  static errcode_t replay_orphan_dir(o2fsck_state *ost)
>  {
>  	errcode_t ret = OCFS2_ET_CORRUPT_SUPERBLOCK;
> @@ -171,8 +211,25 @@ static errcode_t replay_orphan_dir(o2fsc
>  
>  		ret = ocfs2_lookup(ost->ost_fs, ost->ost_fs->fs_sysdir_blkno,
>  				   name, bytes, NULL, &ino);
> -		if (ret)
> -			goto out;
> +		if (ret) {
> +			if (ret != OCFS2_ET_FILE_NOT_FOUND)
> +				goto out;
> +
> +			/* orphan dir is missing, it may be caused by an
> +			 * unsuccessful slot removal in tunefs.ocfs2,
> +			 * so create it.
> +			 */
> +	   		if (prompt(ost, PY, PR_ORPHAN_DIR_MISSING,
> +				   "%s is missing in system directory. "
> +				   "Create it?", name)) {
> +				ret = create_orphan_dir(ost, name);
> +				if (ret) {
> +					com_err(whoami, ret, "while creating "
> +						"orphan directory %s", name);
> +					continue;
> +				}
> +			}
> +		}
>  
>  		ret = ocfs2_dir_iterate(ost->ost_fs, ino,
>  					OCFS2_DIRENT_FLAG_EXCLUDE_DOTS, NULL,
>   
> ------------------------------------------------------------------------
>
> Index: new.ocfs2-tools/tunefs.ocfs2/Makefile
> ===================================================================
> --- new.ocfs2-tools.orig/tunefs.ocfs2/Makefile	2007-06-06 11:13:36.000000000 -0400
> +++ new.ocfs2-tools/tunefs.ocfs2/Makefile	2007-06-06 11:20:06.000000000 -0400
> @@ -30,7 +30,7 @@ DEFINES = -DOCFS2_FLAT_INCLUDES -DVERSIO
>  
>  MANS = tunefs.ocfs2.8
>  
> -CFILES = tunefs.c query.c
> +CFILES = tunefs.c query.c remove_slot.c
>  HFILES = tunefs.h
>  
>  OBJS = $(subst .c,.o,$(CFILES))
> Index: new.ocfs2-tools/tunefs.ocfs2/remove_slot.c
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ new.ocfs2-tools/tunefs.ocfs2/remove_slot.c	2007-06-06 13:45:20.000000000 -0400
> @@ -0,0 +1,716 @@
> +/*
> + * remove_slot.c
> + *
> + * The function for removing slots from ocfs2 volume.
> + *
> + * Copyright (C) 2007 Oracle.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public
> + * License as published by the Free Software Foundation; either
> + * version 2 of the License, or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public
> + * License along with this program; if not, write to the
> + * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
> + * Boston, MA 02111-1307, USA.
> + *
> + */
> +
> +#include <inttypes.h>
> +#include <bitops.h>
> +#include <ocfs2.h>
> +
> +#include <assert.h>
> +
> +#include "tunefs.h"
> +
> +extern ocfs2_tune_opts opts;
> +
> +enum relink_alloc_action {
> +	RELINK_EXTENT_ALLOC = 1,
> +	RELINK_INODE_ALLOC
> +};
> +
> +struct relink_ctxt {
> +	enum relink_alloc_action action;
> +	struct ocfs2_chain_rec *cr;
> +	uint16_t new_slot;
> +	uint64_t new_inode;
> +	char *gd_buf;
> +	char *start_gd_buf;
> +	char *src_inode;
> +	char *dst_inode;
> +	char *ex_buf;
> +};
> +
> +struct remove_slot_ctxt {
> +	ocfs2_filesys *fs;
> +	uint16_t removed_slot;
> +	errcode_t errcode;
> +};
> +
> +static errcode_t change_sub_alloc_slot(ocfs2_filesys *fs,
> +				       uint64_t blkno,
> +				       struct relink_ctxt *ctxt)
> +{
> +	errcode_t ret;
> +	struct ocfs2_dinode *di = NULL;
> +	struct ocfs2_extent_block *eb = NULL;
> +
> +	if (ctxt->action == RELINK_EXTENT_ALLOC) {
> +		/* change sub alloc slot in the extent block. */
> +		ret = ocfs2_read_extent_block(fs, blkno, ctxt->ex_buf);
> +		if (ret)
> +			goto bail;
> +
> +		eb = (struct ocfs2_extent_block *)ctxt->ex_buf;
> +		eb->h_suballoc_slot = ctxt->new_slot;
> +
> +		ret = ocfs2_write_extent_block(fs, blkno, ctxt->ex_buf);
> +		if (ret)
> +			goto bail;
> +	} else {
> +		/* change sub alloc slot in the inode. */
> +		ret = ocfs2_read_inode(fs, blkno, ctxt->ex_buf);
> +		if (ret)
> +			goto bail;
> +
> +		di = (struct ocfs2_dinode *)ctxt->ex_buf;
> +		di->i_suballoc_slot = ctxt->new_slot;
> +
> +		ret = ocfs2_write_inode(fs, blkno, ctxt->ex_buf);
> +		if (ret)
> +			goto bail;
> +	}
> +bail:
> +	return ret;
> +}
> +
> +/*
> + * Move the chain record to the new alloc file.
> + * The chain's start group is stored in ctxt->start_gd_buf and
> + * tail group is ctxt->gd_buf. They are set by move_chain_rec.
> + *
> + * Note:
> + * currently we link all the group descriptor in the chain "0" of the
> + * new alloc file.
> + */
> +static errcode_t move_groups(ocfs2_filesys *fs,
> +			     struct relink_ctxt *ctxt)
> +{
> +	errcode_t ret;
> +	struct ocfs2_group_desc *gd_start = NULL, *gd_end = NULL;
> +	struct ocfs2_dinode *di = NULL;
> +	struct ocfs2_chain_list *cl = NULL;
> +	struct ocfs2_chain_rec *cr = NULL;
> +	uint32_t clusters;
> +
> +	ret = ocfs2_read_inode(fs, ctxt->new_inode, ctxt->dst_inode);
> +	if (ret)
> +		goto bail;
> +
> +	di = (struct ocfs2_dinode *)ctxt->dst_inode;
> +	cl = &di->id2.i_chain;
> +	cr = &cl->cl_recs[0];
> +
> +	gd_start = (struct ocfs2_group_desc *)ctxt->start_gd_buf;
> +	gd_end = (struct ocfs2_group_desc *)ctxt->gd_buf;
> +
> +	assert (gd_end->bg_next_group == 0);
> +
> +	/* link the original chain to the group list's end. */
> +	gd_end->bg_next_group = cr->c_blkno;
> +	ret = ocfs2_write_group_desc(fs, gd_end->bg_blkno, ctxt->gd_buf);
> +	if (ret)
> +		goto bail;
> +
> +	/* link the group list's head in the chain.
> +	 * modify the chain record and the new files simultaneously.
> +	 */
> +	cr->c_blkno = gd_start->bg_blkno;
> +	cr->c_total += ctxt->cr->c_total;
> +	cr->c_free += ctxt->cr->c_free;
> +
> +	/* If the chain is empty, increase the free_rec. */
> +	if (cl->cl_next_free_rec == 0)
> +		cl->cl_next_free_rec = 1;
> +
> +	di->id1.bitmap1.i_total += ctxt->cr->c_total;
> +	di->id1.bitmap1.i_used += ctxt->cr->c_total;
> +	di->id1.bitmap1.i_used -= ctxt->cr->c_free;
> +	clusters = ctxt->cr->c_total / cl->cl_bpc;
> +	di->i_clusters += clusters;
> +	di->i_size += clusters * fs->fs_clustersize;
> +
> +	ret = ocfs2_write_inode(fs, ctxt->new_inode, ctxt->dst_inode);
> +	if (ret)
> +		goto bail;
> +
> +bail:
> +	return ret;
> +}
> +
> +/*
> + * This function will iterate the chain_rec and do the following modifications:
> + * 1. modify  Sub Alloc Slot in extent block/inodes accordingly.
> + * 2. change the group parent and bg_chain according to its future owner.
> + * 3. link the chain to the new slot files.
> + */
> +static errcode_t move_chain_rec(ocfs2_filesys *fs, struct relink_ctxt *ctxt)
> +{
> +	errcode_t ret = 0;
> +	int i, start, end = 1;
> +	uint64_t blkno, gd_blkno = ctxt->cr->c_blkno;
> +	struct ocfs2_group_desc *gd = NULL;
> +
> +	if (gd_blkno == 0)
> +		goto bail;
> +
> +	memset(ctxt->start_gd_buf, 0, fs->fs_blocksize);
> +
> +	while (gd_blkno) {
> +		ret = ocfs2_read_group_desc(fs, gd_blkno, ctxt->gd_buf);
> +		if (ret)
> +			return OCFS2_CHAIN_ERROR;
> +
> +		/* record the first group descriptor. */
> +		if (ctxt->start_gd_buf[0] == 0)
> +			memcpy(ctxt->start_gd_buf, ctxt->gd_buf,
> +			       fs->fs_blocksize);
> +
> +		gd = (struct ocfs2_group_desc *)ctxt->gd_buf;
> +
> +		end = 1;
> +		/* Modify the "Sub Alloc Slot" in the extent block/inodes. */
> +		while (end < gd->bg_bits) {
> +			start = ocfs2_find_next_bit_set(gd->bg_bitmap,
> +							gd->bg_bits, end);
> +			if (start >= gd->bg_bits)
> +				break;
> +
> +			end = ocfs2_find_next_bit_clear(gd->bg_bitmap,
> +							gd->bg_bits, start);
> +
> +			for (i = start; i < end; i++) {
> +				blkno = gd_blkno + i;
> +				ret = change_sub_alloc_slot(fs, blkno, ctxt);
> +				if (ret)
> +					goto bail;
> +			}
> +		}
> +
> +		gd->bg_chain = 0;
> +		gd->bg_parent_dinode = ctxt->new_inode;
> +
> +		ret = ocfs2_write_group_desc(fs, gd_blkno, ctxt->gd_buf);
> +		if (ret)
> +			goto bail;
> +
> +		gd_blkno = gd->bg_next_group;
> +	}
> +
> +	/* move the chain to the new slot file.
> +	 * now the start_gd_buf contains the first group descriptor in the
> +	 * chain and gd_buf contains the last group descriptor in the chain.
> +	 */
> +	ret = move_groups(fs, ctxt);
> +
> +bail:
> +	return ret;
> +}
> +
> +static errcode_t relink_system_alloc(ocfs2_filesys *fs,
> +				     uint16_t removed_slot,
> +				     enum relink_alloc_action action)
> +{
> +	errcode_t ret;
> +	uint16_t i;
> +	uint64_t blkno;
> +	struct ocfs2_dinode *di = NULL;
> +	struct ocfs2_chain_list *cl = NULL;
> +	struct relink_ctxt ctxt;
> +
> +	memset(&ctxt, 0, sizeof(ctxt));
> +
> +	if (action == RELINK_EXTENT_ALLOC)
> +		ret = ocfs2_lookup_system_inode(fs,
> +						EXTENT_ALLOC_SYSTEM_INODE,
> +						removed_slot, &blkno);
> +	else
> +		ret = ocfs2_lookup_system_inode(fs,
> +						INODE_ALLOC_SYSTEM_INODE,
> +						removed_slot, &blkno);
> +	if (ret)
> +		goto bail;
> +
> +	ret = ocfs2_malloc_block(fs->fs_io, &ctxt.src_inode);
> +	if (ret) {
> +		com_err(opts.progname, ret, "while allocating a block "
> +			"during relinking system alloc");
> +		goto bail;
> +	}
> +
> +	ret = ocfs2_read_inode(fs, blkno, ctxt.src_inode);
> +	if (ret) {
> +		com_err(opts.progname, ret, "while reading inode "
> +			"%"PRIu64" during relinking system alloc", blkno);
> +		goto bail;
> +	}
> +
> +	di = (struct ocfs2_dinode *)ctxt.src_inode;
> +
> +	if (!(di->i_flags & OCFS2_VALID_FL) ||
> +	    !(di->i_flags & OCFS2_BITMAP_FL) ||
> +	    !(di->i_flags & OCFS2_CHAIN_FL)) {
> +		com_err(opts.progname, 0, "system alloc %"PRIu64" is corrupt "
> +			"during relinking system alloc", blkno);
> +		goto bail;
> +	}
> +
> +	if (di->id1.bitmap1.i_total == 0)
> +		goto bail;
> +
> +	ret = ocfs2_malloc_block(fs->fs_io, &ctxt.gd_buf);
> +	if (ret) {
> +		com_err(opts.progname, ret, "while allocating a block "
> +			"during relinking system alloc");
> +		goto bail;
> +	}
> +
> +	ret = ocfs2_malloc_block(fs->fs_io, &ctxt.ex_buf);
> +	if (ret) {
> +		com_err(opts.progname, ret, "while allocating a block "
> +			"during relinking system alloc");
> +		goto bail;
> +	}
> +
> +	ret = ocfs2_malloc_block(fs->fs_io, &ctxt.dst_inode);
> +	if (ret) {
> +		com_err(opts.progname, ret, "while allocating a block "
> +			"during relinking system alloc");
> +		goto bail;
> +	}
> +
> +	ret = ocfs2_malloc_block(fs->fs_io, &ctxt.start_gd_buf);
> +	if (ret) {
> +		com_err(opts.progname, ret, "while allocating a block "
> +			"during relinking system alloc");
> +		goto bail;
> +	}
> +
> +	cl = &di->id2.i_chain;
> +	ctxt.action = action;
> +
> +	/* Iterate all the chain record and move them to the new slots.
> +	 * In order to balance the chain to the reserved slots, we divide the
> +	 * chains among all the slots.
> +	 */
> +	for (i = 0; i < cl->cl_next_free_rec; i ++) {
> +		ctxt.new_slot = i % opts.num_slots;
> +		if (ctxt.action == RELINK_EXTENT_ALLOC)
> +			ret = ocfs2_lookup_system_inode(fs,
> +						EXTENT_ALLOC_SYSTEM_INODE,
> +						ctxt.new_slot,
> +						&ctxt.new_inode);
> +		else
> +			ret = ocfs2_lookup_system_inode(fs,
> +						INODE_ALLOC_SYSTEM_INODE,
> +						ctxt.new_slot,
> +						&ctxt.new_inode);
> +		if (ret)
> +			goto bail;
> +
> +		ctxt.cr = &cl->cl_recs[i];
> +
> +		ret = move_chain_rec(fs, &ctxt);
> +		if (ret) {
> +			com_err(opts.progname, ret,
> +				"while iterating system alloc file");
> +			goto bail;
> +		}
> +	}
> +
> +	/* empty the original alloc files. */
> +	di->id1.bitmap1.i_used = 0;
> +	di->id1.bitmap1.i_total = 0;
> +	di->i_clusters = 0;
> +	di->i_size = 0;
> +
> +	cl = &di->id2.i_chain;
> +	cl->cl_next_free_rec = 0;
> +	memset(cl->cl_recs, 0, sizeof(struct ocfs2_chain_rec) * cl->cl_count);
> +
> +	ret = ocfs2_write_inode(fs, blkno, ctxt.src_inode);
> +
> +bail:
> +	if (ctxt.gd_buf)
> +		ocfs2_free(&ctxt.gd_buf);
> +	if (ctxt.ex_buf)
> +		ocfs2_free(&ctxt.ex_buf);
> +	if (ctxt.dst_inode)
> +		ocfs2_free(&ctxt.dst_inode);
> +	if (ctxt.start_gd_buf)
> +		ocfs2_free(&ctxt.start_gd_buf);
> +	if (ctxt.src_inode)
> +		ocfs2_free(&ctxt.src_inode);
> +
> +	return ret;
> +}
> +
> +static errcode_t truncate_journal_orphan_dir(ocfs2_filesys *fs,
> +					     uint16_t removed_slot)
> +{
> +	errcode_t ret;
> +	uint64_t blkno;
> +
> +	/* Truncate orphan dir. */
> +	ret = ocfs2_lookup_system_inode(fs, ORPHAN_DIR_SYSTEM_INODE,
> +					removed_slot, &blkno);
> +	if (ret)
> +		goto bail;
> +
> +	ret = ocfs2_truncate(fs, blkno, 0);
> +	if (ret)
> +		goto bail;
> +
> +	/* Truncate the journal file. */
> +	ret = ocfs2_lookup_system_inode(fs, JOURNAL_SYSTEM_INODE,
> +					removed_slot, &blkno);
> +	if (ret)
> +		goto bail;
> +
> +	ret = ocfs2_truncate(fs, blkno, 0);
> +
> +bail:
> +	return ret;
> +}
> +
> +static int remove_slot_iterate(struct ocfs2_dir_entry *dirent, int offset,
> +			       int blocksize, char *buf, void *priv_data)
> +{
> +	struct remove_slot_ctxt *ctxt = (struct remove_slot_ctxt *)priv_data;
> +	int ret_flags = 0;
> +	errcode_t ret;
> +	char fname[SYSTEM_FILE_NAME_MAX];
> +
> +	sprintf(fname, "%04d", ctxt->removed_slot);
> +
> +	if (strstr(dirent->name, fname)) {
> +
> +		ret = ocfs2_delete_inode(ctxt->fs, dirent->inode);
> +		if (ret) {
> +			ret_flags |= OCFS2_DIRENT_ERROR;
> +			ctxt->errcode = ret;
> +			goto out;
> +		}
> +
> +		dirent->inode = 0;
> +		ret_flags |= OCFS2_DIRENT_CHANGED;
> +	}
> +
> +out:
> +	return ret_flags;
> +}
> +
> +/* Remove all the system files which belong to the removed slot. */
> +static errcode_t remove_slot_entry(ocfs2_filesys *fs, uint16_t removed_slot)
> +{
> +	struct remove_slot_ctxt ctxt = {
> +		.fs = fs,
> +		.removed_slot = removed_slot,
> +		.errcode = 0
> +	};
> +
> +	ocfs2_dir_iterate(fs, fs->fs_sysdir_blkno,
> +			  OCFS2_DIRENT_FLAG_EXCLUDE_DOTS, NULL,
> +			  remove_slot_iterate, &ctxt);
> +
> +	return ctxt.errcode;
> +}
> +
> +/* Decrease the i_links_count of the inode "blkno". */
> +static errcode_t decrease_link_count(ocfs2_filesys *fs, uint64_t blkno)
> +{
> +	errcode_t ret;
> +	char *buf = NULL;
> +	struct ocfs2_dinode *di  = NULL;
> +
> +	ret = ocfs2_malloc_block(fs->fs_io, &buf);
> +	if (ret)
> +		goto bail;
> +
> +	ret = ocfs2_read_inode(fs, blkno, buf);
> +	if (ret)
> +		goto bail;
> +
> +	di = (struct ocfs2_dinode *)buf;
> +
> +	if (di->i_links_count > 0)
> +		di->i_links_count--;
> +	else {
> +		ret = OCFS2_ET_INODE_NOT_VALID;
> +		goto bail;
> +	}
> +
> +	ret = ocfs2_write_inode(fs, blkno, buf);
> +bail:
> +	if (buf)
> +		ocfs2_free(&buf);
> +	return ret;
> +}
> +
> +errcode_t remove_slots(ocfs2_filesys *fs)
> +{
> +	errcode_t ret = 0;
> +	uint16_t old_num = OCFS2_RAW_SB(fs->fs_super)->s_max_slots;
> +	uint16_t removed_slot = old_num - 1;
> +
> +	/* we will remove the slots one at a time so that fsck.ocfs2 can work
> +	 * well and we can continue our work easily in case of any panic.
> +	 */
> +	while (removed_slot >= opts.num_slots) {
> +		/* Link the specified extent alloc file to others. */
> +		ret = relink_system_alloc(fs, removed_slot,
> +					  RELINK_EXTENT_ALLOC);
> +		if (ret)
> +			goto bail;
> +
> +		/* Link the specified inode alloc file to others. */
> +		ret = relink_system_alloc(fs, removed_slot,
> +					  RELINK_INODE_ALLOC);
> +		if (ret)
> +			goto bail;
> +
> +		/* Truncate the journal and orphan dir to release their
> +		 * clusters to the global bitmap.
> +		 */
> +		ret = truncate_journal_orphan_dir(fs, removed_slot);
> +		if (ret)
> +			goto bail;
> +
> +		/* Decrease max_slots first and then remove the slot's
> +		 * system dir entries, because:
> +		 *
> +		 * 1. ocfs2_lock_down_clusters needs to lock all the journal
> +		 * files, so if we deleted the journal entry first and then
> +		 * failed to decrease max_slots, the whole cluster could no
> +		 * longer be locked because of the missing journals.
> +		 *
> +		 * 2. By now all the resources except the inodes are freed,
> +		 * so it is safe to decrease the slots first. If a panic
> +		 * happens after we decrease the slots, the leftover inodes
> +		 * can be ignored, and they can be reused if the slot count
> +		 * is increased in the future.
> +		 */
> +
> +		/* Update the slot count in the super block. */
> +		OCFS2_RAW_SB(fs->fs_super)->s_max_slots--;
> +		ret = ocfs2_write_super(fs);
> +		if (ret)
> +			goto bail;
> +
> +		/* The extra system dir entries should be removed. */
> +		ret = remove_slot_entry(fs, removed_slot);
> +		if (ret)
> +			goto bail;
> +
> +		/* Decrease the i_links_count in system file directory
> +		 * since the orphan_dir is removed.
> +		 */
> +		ret = decrease_link_count(fs, fs->fs_sysdir_blkno);
> +		if (ret)
> +			goto bail;
> +
> +		removed_slot--;
> +	}
> +
> +bail:
> +	return ret;
> +}
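
The countdown in remove_slots() can be modelled in isolation. The sketch
below is not part of the patch: slots_to_remove() and its parameters are
hypothetical names, used only to illustrate the order in which slots go
away (highest first, one per pass) and why a target of 0 has to be
rejected when the counter is an unsigned 16-bit value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of the removal order, not code from the patch.
 * Fills "out" with the slot numbers that would be removed when going
 * from old_num slots down to new_num slots, and returns how many
 * entries were written.
 */
static size_t slots_to_remove(uint16_t old_num, uint16_t new_num,
			      uint16_t *out, size_t out_len)
{
	size_t n = 0;
	uint16_t slot;

	/*
	 * With an unsigned counter, a test such as "slot >= 0" is always
	 * true, so a target of 0 would never terminate. Reject it here.
	 */
	assert(new_num >= 1 && new_num <= old_num);

	/* Highest slot first; "slot - 1" is the slot being removed. */
	for (slot = old_num; slot > new_num && n < out_len; slot--)
		out[n++] = (uint16_t)(slot - 1);

	return n;
}
```

Shrinking from 8 to 4 slots in this model yields slots 7, 6, 5 and 4, in
that order.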
> +
> +static int orphan_iterate(struct ocfs2_dir_entry *dirent, int offset,
> +			  int blocksize, char *buf, void *priv_data)
> +{
> +	int *has_orphan = (int *)priv_data;
> +
> +	*has_orphan = 1;
> +
> +	/* We have found a file or directory in the orphan_dir,
> +	 * so there is no need to continue the iteration.
> +	 */
> +	return OCFS2_DIRENT_ABORT;
> +}
> +
> +static errcode_t orphan_dir_check(ocfs2_filesys *fs,
> +				  int *has_orphan)
> +{
> +	errcode_t ret = 0;
> +	uint64_t blkno;
> +	int i;
> +	uint16_t max_slots = OCFS2_RAW_SB(fs->fs_super)->s_max_slots;
> +
> +	for (i = opts.num_slots ; i < max_slots; ++i) {
> +		ret = ocfs2_lookup_system_inode(fs, ORPHAN_DIR_SYSTEM_INODE,
> +						i, &blkno);
> +		if (ret) {
> +			com_err(opts.progname, ret, "while looking up "
> +				"orphan dir for slot %u during orphan dir "
> +				"check", i);
> +			goto bail;
> +		}
> +
> +		ret = ocfs2_dir_iterate(fs, blkno,
> +					OCFS2_DIRENT_FLAG_EXCLUDE_DOTS, NULL,
> +					orphan_iterate, has_orphan);
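> +		if (ret) {
> +			com_err(opts.progname, ret, "while iterating "
> +				"orphan dir for slot %u during orphan dir "
> +				"check", i);
> +			goto bail;
> +		}
> +		if (*has_orphan) {
> +			com_err(opts.progname, 0, "orphan dir for slot %u "
> +				"has entries", i);
> +			goto bail;
> +		}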
> +	}
> +
> +bail:
> +	return ret;
> +}
> +
> +static errcode_t local_alloc_check(ocfs2_filesys *fs,
> +				  int *has_local_alloc)
> +{
> +	errcode_t ret = 0;
> +	uint16_t i;
> +	uint64_t blkno;
> +	char *buf = NULL;
> +	struct ocfs2_dinode *di = NULL;
> +	uint16_t max_slots = OCFS2_RAW_SB(fs->fs_super)->s_max_slots;
> +
> +	ret = ocfs2_malloc_block(fs->fs_io, &buf);
> +	if (ret) {
> +		com_err(opts.progname, ret, "while allocating a block "
> +			"during local alloc check");
> +		goto bail;
> +	}
> +
> +	for (i = opts.num_slots ; i < max_slots; ++i) {
> +		ret = ocfs2_lookup_system_inode(fs, LOCAL_ALLOC_SYSTEM_INODE,
> +						i, &blkno);
> +		if (ret) {
> +			com_err(opts.progname, ret, "while looking up "
> +				"local alloc for slot %u during local alloc "
> +				"check", i);
> +			goto bail;
> +		}
> +
> +		ret = ocfs2_read_inode(fs, blkno, buf);
> +		if (ret) {
> +			com_err(opts.progname, ret, "while reading inode "
> +				"%"PRIu64" during local alloc check", blkno);
> +			goto bail;
> +		}
> +
> +		di = (struct ocfs2_dinode *)buf;
> +
> +		if (di->id1.bitmap1.i_total > 0) {
> +			*has_local_alloc = 1;
> +			com_err(opts.progname, 0, "local alloc for slot %u "
> +				"isn't empty", i);
> +			goto bail;
> +		}
> +	}
> +
> +bail:
> +	if (buf)
> +		ocfs2_free(&buf);
> +	return ret;
> +}
> +
> +static errcode_t truncate_log_check(ocfs2_filesys *fs,
> +				    int *has_truncate_log)
> +{
> +	errcode_t ret = 0;
> +	uint16_t i;
> +	uint64_t blkno;
> +	char *buf = NULL;
> +	struct ocfs2_dinode *di = NULL;
> +	uint16_t max_slots = OCFS2_RAW_SB(fs->fs_super)->s_max_slots;
> +
> +	ret = ocfs2_malloc_block(fs->fs_io, &buf);
> +	if (ret) {
> +		com_err(opts.progname, ret, "while allocating a block "
> +			"during truncate log check");
> +		goto bail;
> +	}
> +
> +	for (i = opts.num_slots ; i < max_slots; ++i) {
> +		ret = ocfs2_lookup_system_inode(fs, TRUNCATE_LOG_SYSTEM_INODE,
> +						i, &blkno);
> +		if (ret) {
> +			com_err(opts.progname, ret, "while looking up "
> +				"truncate log for slot %u during truncate log "
> +				"check", i);
> +			goto bail;
> +		}
> +
> +		ret = ocfs2_read_inode(fs, blkno, buf);
> +		if (ret) {
> +			com_err(opts.progname, ret, "while reading inode "
> +				"%"PRIu64" during truncate log check", blkno);
> +			goto bail;
> +		}
> +
> +		di = (struct ocfs2_dinode *)buf;
> +
> +		if (di->id2.i_dealloc.tl_used > 0) {
> +			*has_truncate_log = 1;
> +			com_err(opts.progname, 0, "truncate log for slot %u "
> +				"isn't empty", i);
> +			goto bail;
> +		}
> +	}
> +
> +bail:
> +	if (buf)
> +		ocfs2_free(&buf);
> +	return ret;
> +}
> +
> +errcode_t remove_slot_check(ocfs2_filesys *fs)
> +{
> +	errcode_t ret;
> +	int has_orphan = 0, has_truncate_log = 0, has_local_alloc = 0;
> +
> +	/* we don't allow remove_slot to coexist with other tunefs
> +	 * options to keep things simple.
> +	 */
> +	if (opts.backup_super || opts.vol_label ||
> +	    opts.mount || opts.jrnl_size || opts.num_blocks) {
> +		com_err(opts.progname, 0, "Cannot remove slot"
> +			" along with other tasks");
> +		exit(1);
> +	}
> +
> +	ret = orphan_dir_check(fs, &has_orphan);
> +	if (ret || has_orphan) {
> +		ret = 1;
> +		goto bail;
> +	}
> +
> +	ret = local_alloc_check(fs, &has_local_alloc);
> +	if (ret || has_local_alloc) {
> +		ret = 1;
> +		goto bail;
> +	}
> +
> +	ret = truncate_log_check(fs, &has_truncate_log);
> +	if (ret || has_truncate_log) {
> +		ret = 1;
> +		goto bail;
> +	}
> +bail:
> +	return ret;
> +}
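
The three checks above boil down to one question per slot: is anything
still parked in it? A minimal sketch of that predicate follows. It is an
illustration only: struct example_slot_state and first_busy_slot() are
hypothetical and stand in for the on-disk lookups (orphan dir entries,
local alloc i_total, truncate log tl_used) that the patch performs via
libocfs2.

```c
#include <stdint.h>

/* Hypothetical in-memory summary of one slot; not an ocfs2 structure. */
struct example_slot_state {
	uint32_t orphan_entries;	/* entries left in the orphan dir */
	uint32_t local_alloc_total;	/* models id1.bitmap1.i_total */
	uint16_t truncate_log_used;	/* models id2.i_dealloc.tl_used */
};

/*
 * Scan the slots that would be removed (new_slots .. max_slots - 1)
 * and return the first one that still holds state, or -1 if every
 * one of them is clean and the removal may proceed.
 */
static int first_busy_slot(const struct example_slot_state *slots,
			   uint16_t max_slots, uint16_t new_slots)
{
	uint16_t i;

	for (i = new_slots; i < max_slots; i++) {
		if (slots[i].orphan_entries ||
		    slots[i].local_alloc_total ||
		    slots[i].truncate_log_used)
			return i;
	}

	return -1;
}
```

Slots below new_slots are never inspected, which matches the loops above
that start at opts.num_slots.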
> Index: new.ocfs2-tools/tunefs.ocfs2/tunefs.h
> ===================================================================
> --- new.ocfs2-tools.orig/tunefs.ocfs2/tunefs.h	2007-06-06 11:13:36.000000000 -0400
> +++ new.ocfs2-tools/tunefs.ocfs2/tunefs.h	2007-06-06 11:20:06.000000000 -0400
> @@ -92,3 +92,5 @@ typedef struct _ocfs2_tune_opts {
>  
>  void print_query(char *queryfmt);
>  
> +errcode_t remove_slots(ocfs2_filesys *fs);
> +errcode_t remove_slot_check(ocfs2_filesys *fs);
>   
> ------------------------------------------------------------------------
>
> ocfs2_head_1.patch 
> debugfs_1.patch 
> remove_slot_1.patch 
> tunefs_1.patch 
> group_check.patch 
> orphan_check.patch 
> journal_check.patch 
>   
> ------------------------------------------------------------------------
>
> Index: new.ocfs2-tools/tunefs.ocfs2/tunefs.c
> ===================================================================
> --- new.ocfs2-tools.orig/tunefs.ocfs2/tunefs.c	2007-06-06 11:13:36.000000000 -0400
> +++ new.ocfs2-tools/tunefs.ocfs2/tunefs.c	2007-06-06 11:20:06.000000000 -0400
> @@ -863,7 +863,10 @@ static errcode_t update_slots(ocfs2_file
>  	errcode_t ret = 0;
>  
>  	block_signals(SIG_BLOCK);
> -	ret = add_slots(fs);
> +	if (opts.num_slots > OCFS2_RAW_SB(fs->fs_super)->s_max_slots)
> +		ret = add_slots(fs);
> +	else
> +		ret = remove_slots(fs);
>  	block_signals(SIG_UNBLOCK);
>  	if (ret)
>  		return ret;
> @@ -1253,7 +1256,7 @@ int main(int argc, char **argv)
>  	int upd_incompat = 0;
>  	int upd_backup_super = 0;
>  	char *tmpstr;
> -	uint16_t tmp;
> +	uint16_t max_slots;
>  	uint64_t def_jrnl_size = 0;
>  	uint64_t num_clusters;
>  	int dirty = 0;
> @@ -1377,15 +1380,23 @@ int main(int argc, char **argv)
>  	}
>  
>  	/* validate num slots */
> +	max_slots = OCFS2_RAW_SB(fs->fs_super)->s_max_slots;
>  	if (opts.num_slots) {
> -		tmp = OCFS2_RAW_SB(fs->fs_super)->s_max_slots;
> -		if (opts.num_slots > tmp) {
> +		if (opts.num_slots < max_slots) {
> +			ret = remove_slot_check(fs);
> +			if (ret) {
> +				com_err(opts.progname, 0,
> +					"remove slot check failed");
> +				goto unlock;
> +			}
> +		}
> +		if (opts.num_slots != max_slots) {
>  			printf("Changing number of node slots from %d to %d\n",
> -			       tmp, opts.num_slots);
> +			       max_slots, opts.num_slots);
>  		} else {
>  			com_err(opts.progname, 0, "Node slots (%d) has to be "
> -				"more than the configured node slots (%d)",
> -			       opts.num_slots, tmp);
> +				"different from the configured node slots",
> +			       opts.num_slots);
>  			goto unlock;
>  		}
>  
> @@ -1445,6 +1456,19 @@ int main(int argc, char **argv)
>  		upd_incompat = 1;
>  	}
>  
> +	/* Set remove slots incompat flag on superblock */
> +	if (opts.num_slots < max_slots) {
> +		OCFS2_RAW_SB(fs->fs_super)->s_feature_incompat |=
> +			OCFS2_FEATURE_INCOMPAT_REMOVE_SLOT_INPROG;
> +		ret = ocfs2_write_super(fs);
> +		if (ret) {
> +			com_err(opts.progname, ret,
> +				"while writing remove slot incompat flag");
> +			goto unlock;
> +		}
> +		upd_incompat = 1;
> +	}
> +
>  	/* update volume label */
>  	if (opts.vol_label) {
>  		update_volume_label(fs, &upd_label);
> @@ -1467,8 +1491,12 @@ int main(int argc, char **argv)
>  				"while updating node slots");
>  			goto unlock;
>  		}
> +		/* Clear remove slot incompat flag on superblock */
> +		if (opts.num_slots < max_slots)
> +			OCFS2_RAW_SB(fs->fs_super)->s_feature_incompat &=
> +				~OCFS2_FEATURE_INCOMPAT_REMOVE_SLOT_INPROG;
>  		if (upd_slots)
> -			printf("Added node slots\n");
> +			printf("Changed node slots\n");
>  	}
>  
>  	/* change mount type */
>   
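
The set-flag, do-work, clear-flag sequence in main() is what lets a later
run notice an interrupted removal. The sketch below models it under
stated assumptions: the bit value, struct example_super and
example_shrink() are all hypothetical, and the real code writes the
superblock to disk after setting the flag and after each slot is dropped.

```c
#include <stdint.h>

/* Hypothetical bit value, for illustration only. */
#define EXAMPLE_REMOVE_SLOT_INPROG 0x8000u

/* Hypothetical stand-in for the two superblock fields involved. */
struct example_super {
	uint32_t feature_incompat;
	uint16_t max_slots;
};

/*
 * Shrink to new_slots, one slot per pass. fail_at_slot simulates a
 * failure while removing that slot (-1 means no failure). On failure
 * the in-progress bit is deliberately left set, so a later run can
 * tell that the removal did not finish.
 */
static int example_shrink(struct example_super *sb, uint16_t new_slots,
			  int fail_at_slot)
{
	sb->feature_incompat |= EXAMPLE_REMOVE_SLOT_INPROG;

	while (sb->max_slots > new_slots) {
		if (sb->max_slots - 1 == fail_at_slot)
			return -1;
		sb->max_slots--;
	}

	/* Only a complete removal clears the bit; other bits survive. */
	sb->feature_incompat &= ~EXAMPLE_REMOVE_SLOT_INPROG;
	return 0;
}
```

A run that fails part-way leaves both the reduced slot count and the
in-progress bit behind, which is the state a subsequent tunefs or
fsck.ocfs2 run would need to detect.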
> ------------------------------------------------------------------------
>
> _______________________________________________
> Ocfs2-tools-devel mailing list
> Ocfs2-tools-devel at oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-tools-devel



