[Ocfs2-test-devel] [PATCH 2/9] ocfs2-test: buildkernel - Converted from LAM/MPI to OpenMPI.
tristan.ye
tristan.ye at oracle.com
Tue Feb 17 17:54:37 PST 2009
On Tue, 2009-02-17 at 14:38 -0800, Marcos Matsunaga wrote:
> Signed-off-by: Marcos Matsunaga <Marcos.Matsunaga at oracle.com>
> ---
> programs/buildkernel/run_buildkernel.py | 31 +++++++++++++++++++++----------
> 1 files changed, 21 insertions(+), 10 deletions(-)
>
> diff --git a/programs/buildkernel/run_buildkernel.py b/programs/buildkernel/run_buildkernel.py
> index ba73405..351f1e1 100755
> --- a/programs/buildkernel/run_buildkernel.py
> +++ b/programs/buildkernel/run_buildkernel.py
> @@ -84,22 +84,26 @@ def Initialize():
> 'Initialize the directories (remove and extract)'
> #
> o2tf.printlog('Cleaning up directories.', logfile, 0, '')
> - o2tf.StartMPI(DEBUGON, options.nodelist, logfile)
> - o2tf.lamexec(DEBUGON, nproc, config.WAIT, str('%s -c -d %s -l %s' % \
> + o2tf.OpenMPIInit(DEBUGON, options.nodelist, logfile, 'ssh')
> + o2tf.openmpi_run(DEBUGON, nproc, str('%s -c -d %s -l %s' % \
> (buildcmd,
> options.dirlist,
> options.logfile) ),
> options.nodelist,
> - options.logfile )
> + 'ssh',
> + options.logfile,
> + 'WAIT')
> #
> o2tf.printlog('Extracting tar file into directories.', logfile, 0, '')
> - o2tf.lamexec(DEBUGON, nproc, config.WAIT, str('%s -e -d %s -l %s -t %s' % \
> + o2tf.openmpi_run(DEBUGON, nproc, str('%s -e -d %s -l %s -t %s' % \
> (buildcmd,
> options.dirlist,
> options.logfile,
> tarfile) ),
> options.nodelist,
> - options.logfile )
> + 'ssh',
> + options.logfile,
> + 'WAIT')
> o2tf.printlog('Directories initialization completed.', logfile, 0, '')
> #
> Usage = 'Usage: %prog [-c|--count count] \
> @@ -205,13 +209,20 @@ for i in range(options.count):
> r = i+1
> o2tf.printlog('run_buildkernel: Starting RUN# %s of %s' % (r, options.count),
> logfile, 3, '=')
> - o2tf.StartMPI(DEBUGON, options.nodelist, logfile)
> - o2tf.lamexec(DEBUGON, nproc, config.WAIT, str('%s -d %s -l %s -n %s' % \
> + o2tf.OpenMPIInit(DEBUGON, options.nodelist, logfile, 'ssh')
> + ret = o2tf.openmpi_run(DEBUGON, nproc, str('%s -d %s -l %s -n %s' % \
> (buildcmd,
> options.dirlist,
> options.logfile,
> options.nodelist) ),
> options.nodelist,
> - options.logfile )
> -o2tf.printlog('run_buildkernel: Test completed successfully.',
> - logfile, 3, '=')
> + 'ssh',
> + options.logfile,
> + 'WAIT' )
> +if not ret:
> + o2tf.printlog('run_buildkernel: main - execution successful.',
> + logfile, 0, '')
> +else:
> + o2tf.printlog('run_buildkernel: main - execution failed.',
> + logfile, 0, '')
> +sys.exit(ret)
Sometimes Python launchers like the one above need to be invoked from a
shell script (e.g. mutiple_run.sh). In that case, if the 'ret' we pass
up in run_buildkernel.py here is 256, any shell caller will
unfortunately get a return code of ZERO instead of 256. Just a wild
guess: the shell may treat the return code ($?) as one byte, so it only
accepts values in the range [0-255].
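The one-byte guess matches POSIX behaviour: only the low 8 bits of an exit status reach the parent process, so 256 is seen as 0. A minimal sketch that shows this, assuming a POSIX system and Python 3 (the helper name observed_exit_status is made up for illustration and is not part of ocfs2-test):

```python
import subprocess
import sys


def observed_exit_status(code):
    """Run a child Python that calls sys.exit(code) and return the
    exit status the parent actually observes."""
    child = subprocess.run(
        [sys.executable, '-c', 'import sys; sys.exit(%d)' % code])
    return child.returncode


if __name__ == '__main__':
    # On POSIX the status is truncated to 8 bits (code & 0xFF),
    # so a "failure" value of 256 looks like success to the caller.
    for code in (0, 1, 255, 256, 257):
        print('sys.exit(%d) is seen as %d'
              % (code, observed_exit_status(code)))
```

A bash caller checking $? would see the same truncated values.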
Actually, all Python launchers that use os.spawn() to run the openmpi
binary are quite likely to get a return code of 256 when mpirun fails,
and that will really fool our bash caller.
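One common way a launcher ends up holding 256 is a wait()-style status, where an exit code of 1 is stored in the high byte (1 << 8). On Unix, os.system() returns such a status. Whether o2tf.openmpi_run does the same is an assumption here, not something this thread confirms. A rough sketch of decoding the status before passing it on (exit_code_from_wait_status is a hypothetical helper):

```python
import os
import sys


def exit_code_from_wait_status(status):
    """Convert a wait()-style status (as returned by os.system on Unix)
    into an exit code that is safe to hand to sys.exit()."""
    if os.WIFEXITED(status):
        # Normal termination: the real exit code is in the high byte.
        return os.WEXITSTATUS(status)
    # Killed by a signal or otherwise abnormal: report generic failure.
    return 1


if __name__ == '__main__':
    # A child that exits with 1 is reported by os.system as 256 on Unix.
    status = os.system('"%s" -c "import sys; sys.exit(1)"' % sys.executable)
    print('raw status:', status)
    print('decoded exit code:', exit_code_from_wait_status(status))
```

Decoding keeps the original exit code where one exists; simply exiting with 1 on any non-zero value is the blunter alternative.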
Therefore, I suggest you do not pass the return code up directly with
'sys.exit(ret)'; you can simply do the following instead:
    if not ret:
        o2tf.printlog('run_buildkernel: main - execution successful.',
                      logfile, 0, '')
    else:
        sys.exit(1)  # it's enough to mark failure here.
The remaining patches in this series need to address the same problem.