[Ocfs2-tools-devel] [PATCH 3/4] Ocfs2-test: Add StartOpenMPI and openmpi_run apis for o2tf package.

Marcos E. Matsunaga Marcos.Matsunaga at oracle.com
Thu Sep 11 06:36:38 PDT 2008


Tristan,

If I'm not wrong, OpenMPI doesn't have lamboot, as it doesn't need a
control process running on each node. I think it would be fair to
rename StartOpenMPI to some other name, since it doesn't actually start
anything but only builds the mpihosts file. Maybe SetOpenMPI would be a
more suitable name. I think it is all handled by mpirun now, with some
minor changes in the syntax.
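For what it's worth, a minimal sketch of what such a SetOpenMPI could look like (the function name, the default mpirun path, and the hostfile path are illustrative assumptions, not the actual o2tf code): it only verifies that mpirun exists and rewrites the hostfile, since Open MPI has nothing to boot per node.

```python
import os
import sys

def SetOpenMPI(nodes, mpirun='/usr/bin/mpirun', hostfile='/tmp/mpihosts'):
    # Open MPI has no lamboot equivalent: there is no per-node daemon
    # to start. Just sanity-check mpirun and (re)write the hostfile.
    if not os.access(mpirun, os.F_OK):
        sys.stderr.write('SetOpenMPI: %s not found\n' % mpirun)
        return False
    with open(hostfile, 'w') as fd:
        for node in nodes.split(','):
            fd.write(node + '\n')
    return True
```

After that, a single `mpirun --hostfile <hostfile> -np N <cmd>` invocation does the actual launch; no teardown step is needed either.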

One thing to note is that we should use the OpenMPI syntax for mpirun
and not keep the LAM-compatible syntax. That could create some confusion.

Regards,

Marcos Eduardo Matsunaga

Oracle USA
Linux Engineering

“The statements and opinions expressed here are my own and do not
necessarily represent those of Oracle Corporation.”



Tristan Ye wrote:
> Here we just add StartOpenMPI and openmpi_run for OpenMPI usage,
> and keep the old StartMPI and mpi_run so that all existing MPI
> binaries still work.
>
> In the future, we may replace the existing StartMPI and mpi_run with
> OpenMPI versions, since our test suite may migrate to the OpenMPI platform.
>
> Porting from LAM/MPI to OpenMPI does not require much work. Besides
> rewriting the generic Python APIs such as StartMPI and mpi_run, we only
> need minor modifications to our Python launchers (such as run_create_racer.py);
> all existing MPI binaries (such as create_racer.c), as well as the Makefile
> and the compilation, can be kept the same as before without any modification.
>
> Signed-off-by: Tristan Ye <tristan.ye at oracle.com>
> ---
>  programs/python_common/o2tf.py |   77 ++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 77 insertions(+), 0 deletions(-)
>
> diff --git a/programs/python_common/o2tf.py b/programs/python_common/o2tf.py
> index ec8d613..47f452c 100644
> --- a/programs/python_common/o2tf.py
> +++ b/programs/python_common/o2tf.py
> @@ -196,6 +196,39 @@ def untar(DEBUGON, destdir, tarfile, logfile):
>  			printlog('o2tf.untar: Extraction ended.', 
>  				logfile, 0, '')
>  #
> +# StartOpenMPI: the OpenMPI counterpart of StartMPI
> +#
> +def StartOpenMPI(DEBUGON, nodes, logfile):
> +	"""
> +	Open MPI needs no startup until execution is issued,
> +	so just do a sanity check here.
> +	"""
> +	from os import access,F_OK
> +	if os.access(config.MPIRUN, F_OK) == 0:
> +		printlog('o2tf.StartOpenMPI: mpirun not found',
> +			logfile, 0, '')
> +		sys.exit(1)
> +	if os.access(config.MPIHOSTS, F_OK) == 1:
> +		os.system('rm -f ' + config.MPIHOSTS)
> +	nodelist = string.split(nodes,',')
> +	nodelen = len(nodelist)
> +	fd = open(config.MPIHOSTS,'w',0)
> +	for i in range(nodelen):
> +		fd.write(nodelist[i] + '\n')
> +	fd.close()
> +	try:
> +		if DEBUGON:
> +			printlog('o2tf.StartOpenMPI: Trying to execute %s with \
> +				 a simple command:hostname among (%s)' \
> +				  % (config.MPIRUN, nodes),
> +				logfile, 0, '')
> +		os.system('%s  --hostfile %s %s' % (config.MPIRUN,
> +			  config.MPIHOSTS, 'hostname'))
> +
> +	except os.error,inst:
> +		printlog(str(inst), logfile, 0, '')
> +		pass
> +#
>  # StartMPI is used by :
>  #   - o2tf.py
>  def StartMPI(DEBUGON, nodes, logfile):
> @@ -261,6 +294,50 @@ def mpi_runparts(DEBUGON, nproc, cmd, nodes, logfile):
>  		os.waitpid(pid,0)
>  	except os.error:
>  		pass
> +
> +#
> +# Calls mpirun from openmpi
> +#
> +def openmpi_run(DEBUGON, nproc, cmd, nodes, remote_sh, logfile):
> +	"""
> +	Execute commands in parallel using OpenMPI.
> +	"""
> +	from os import access,F_OK
> +	found = 0
> +	uname = os.uname()
> +	nodelen = len(string.split(nodes,','))
> +	if nproc == 'C':
> +		nprocopt=''
> +	else:
> +		nprocopt='-np ' + str(nproc)
> +
> +	if remote_sh == '' or remote_sh == 'ssh':
> +		shopt = '-mca pls_rsh_agent ssh:rsh'
> +	else:
> +		shopt = '-mca pls_rsh_agent rsh:ssh'
> +	try:
> +		if DEBUGON:
> +			printlog('o2tf.openmpi_run: MPIRUN = %s' % config.MPIRUN,
> +				logfile, 0, '')
> +			printlog('o2tf.openmpi_run: nproc = %s' % nproc,
> +				logfile, 0, '')
> +			printlog('o2tf.openmpi_run: nodelen = %d' % nodelen,
> +				logfile, 0, '')
> +			printlog('o2tf.openmpi_run: shopt = %s' % shopt,
> +				logfile, 0, '')
> +			printlog('o2tf.openmpi_run: cmd = %s' % cmd,
> +				logfile, 0, '')
> +		return os.spawnv(os.P_NOWAIT,
> +			'/bin/bash', ['bash', '-xc',
> +			config.MPIRUN + ' -mca btl tcp,self -mca btl_tcp_if_include eth0  %s %s --host %s %s' % \
> +			( shopt, nprocopt, nodes, cmd)])
> +	except os.error:
> +		pass
> +
> +#
> +# lamexec is used by :
> +#   - 
> +
>  #
>  # Calls mpirun (Original from the LAM/MPI Package)
>  # mpi_run is used by :
>   