[Ocfs2-tools-devel] [PATCH 3/4] Ocfs2-test: Add StartOpenMPI and openmpi_run apis for o2tf package.

tristan.ye tristan.ye at oracle.com
Thu Sep 11 18:54:32 PDT 2008


Marcos,

Thanks very much for your comments.

You're right, openmpi does not need a separate startup step before its
execution (mpirun itself brings up the MPI run-time environment, the
way lamboot did for lammpi). StartOpenMPI just does some preparation
and a simple sanity check for openmpi; it can be used to prevent our
more important parallel tasks from being launched when the environment
has not been set up properly.
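
For example (just a sketch with made-up node names and paths, not part
of the patch), a launcher could call it right before the real job:

    import o2tf
    import os

    nodes = 'node1,node2'               # made-up node list
    logfile = '/tmp/openmpi_check.log'  # made-up logfile

    # Sanity check only: verifies mpirun is present, rewrites the
    # hostfile and runs a trivial 'hostname' across the nodes.
    o2tf.StartOpenMPI(0, nodes, logfile)

    # Then launch the real workload; openmpi_run returns the spawned
    # pid, which we wait on the same way mpi_runparts does.
    pid = o2tf.openmpi_run(0, 'C', '/usr/bin/hostname', nodes, 'ssh',
                           logfile)
    os.waitpid(pid, 0)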

As for compatibility, since I'm not sure whether you're still using
lammpi for testing at the moment, you can take this as a workaround;
it will not hurt your testing if you are still working with a lammpi
environment.

Once we've made an explicit decision to move all testcases to openmpi,
everything lammpi-related should be replaced by openmpi. That includes
the generic APIs in o2tf and the scripts that launch the mpi binaries.
As for the naming convention, maybe we should simply use 'mpi' as the
prefix; details like lammpi or openmpi should be dropped from the
names. Let me know if now is the right time, and I'll handle the
porting work. :)

Btw, openmpi really has no restriction like lammpi's that forces you to
run mpi jobs as a non-root user, so we can enforce such a restriction
in the scripts ourselves for the sake of security.
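
For instance, each launcher could refuse to start as root before
calling mpirun; a rough sketch of what I mean (illustrative only, not
part of this patch):

    import os
    import sys

    # openmpi itself would happily run as root, so enforce the
    # non-root policy in our own launcher scripts instead.
    if os.getuid() == 0:
        sys.stderr.write('Please run the MPI tests as a non-root user.\n')
        sys.exit(1)
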
Is that OK?


Regards,

Tristan.




On Thu, 2008-09-11 at 09:36 -0400, Marcos E. Matsunaga wrote:
> Tristan,
> 
> If I'm not wrong, OpenMpi doesn't have lamboot as it doesn't need a
> control process running on each node. I think it would be fair to
> replace StartOpenMPI by some other name since it doesn't actually
> start anything, but builds the mpihosts file. Maybe SetOpenMPI would be
> a more suitable name. I think it is all handled by mpirun now with
> some minor changes in the syntax.
> 
> One thing to note is that we should use OpenMPI syntax for mpirun and
> not keep the compatible syntax. That could create some confusion.
> Regards,
> 
> Marcos Eduardo Matsunaga
> 
> Oracle USA
> Linux Engineering
> 
> “The statements and opinions expressed here are my own and do not
> necessarily represent those of Oracle Corporation.”
> 
> 
> Tristan Ye wrote: 
> > Here we just add StartOpenMPI and openmpi_run for openmpi usage,
> > and keep the old StartMPI and mpi_run so that all existing mpi
> > binaries still work.
> > 
> > In the future, we may replace the existing StartMPI and mpi_run with openmpi
> > versions, since our test suite may migrate to the openmpi platform.
> > 
> > Porting from lammpi to openmpi will not require much work.
> > Besides rewriting the generic python APIs such as StartMPI and mpi_run, we only
> > need to make minor modifications to our python launchers (such as run_create_racer.py);
> > all existing MPI binaries (such as create_racer.c) can be kept exactly as they
> > are, as can the Makefile and the build process.
> > 
> > Signed-off-by: Tristan Ye <tristan.ye at oracle.com>
> > ---
> >  programs/python_common/o2tf.py |   77 ++++++++++++++++++++++++++++++++++++++++
> >  1 files changed, 77 insertions(+), 0 deletions(-)
> > 
> > diff --git a/programs/python_common/o2tf.py b/programs/python_common/o2tf.py
> > index ec8d613..47f452c 100644
> > --- a/programs/python_common/o2tf.py
> > +++ b/programs/python_common/o2tf.py
> > @@ -196,6 +196,39 @@ def untar(DEBUGON, destdir, tarfile, logfile):
> >  			printlog('o2tf.untar: Extraction ended.', 
> >  				logfile, 0, '')
> >  #
> > +# StartOpenMPI for openmpi
> > +#
> > +def StartOpenMPI(DEBUGON, nodes, logfile):
> > +	"""
> > +	Since openmpi does not need a startup step before execution,
> > +	just do a sanity check here.
> > +	"""
> > +	from os import access,F_OK
> > +	if os.access(config.MPIRUN, F_OK) == 0:
> > +		printlog('o2tf.StartOpenMPI: mpirun not found',
> > +			logfile, 0, '')
> > +		sys.exit(1)
> > +	if os.access(config.MPIHOSTS, F_OK) == 1:
> > +		os.system('rm -f ' + config.MPIHOSTS)
> > +	nodelist = string.split(nodes,',')
> > +	nodelen = len(nodelist)
> > +	fd = open(config.MPIHOSTS,'w',0)
> > +	for i in range(nodelen):
> > +		fd.write(nodelist[i] + '\n')
> > +	fd.close()
> > +	try:
> > +		if DEBUGON:
> > +			printlog('o2tf.StartOpenMPI: Trying to execute %s with \
> > +				 a simple command:hostname among (%s)' \
> > +				  % (config.MPIRUN, nodes),
> > +				logfile, 0, '')
> > +		os.system('%s  --hostfile %s %s' % (config.MPIRUN,
> > +			  config.MPIHOSTS, 'hostname'))
> > +
> > +	except os.error,inst:
> > +		printlog(str(inst), logfile, 0, '')
> > +		pass
> > +#
> >  # StartMPI is used by :
> >  #   - o2tf.py
> >  def StartMPI(DEBUGON, nodes, logfile):
> > @@ -261,6 +294,50 @@ def mpi_runparts(DEBUGON, nproc, cmd, nodes, logfile):
> >  		os.waitpid(pid,0)
> >  	except os.error:
> >  		pass
> > +
> > +#
> > +# Calls mpirun from openmpi
> > +#
> > +def openmpi_run(DEBUGON, nproc, cmd, nodes, remote_sh, logfile):
> > +	"""
> > +	Execute commands in parallel using OpenMPI.
> > +	"""
> > +	from os import access,F_OK
> > +	found = 0
> > +	uname = os.uname()
> > +	nodelen = len(string.split(nodes,','))
> > +	if nproc == 'C':
> > +		nprocopt=''
> > +	else:
> > +		nprocopt='-np ' + str(nproc)
> > +
> > +	if remote_sh == '' or remote_sh == 'ssh':
> > +		shopt = '-mca pls_rsh_agent ssh:rsh'
> > +	else:
> > +		shopt = '-mca pls_rsh_agent rsh:ssh'
> > +	try:
> > +		if DEBUGON:
> > +			printlog('o2tf.openmpi_run: MPIRUN = %s' % config.MPIRUN,
> > +				logfile, 0, '')
> > +			printlog('o2tf.openmpi_run: nproc = %s' % nproc,
> > +				logfile, 0, '')
> > +			printlog('o2tf.openmpi_run: nodelen = %d' % nodelen,
> > +				logfile, 0, '')
> > +			printlog('o2tf.openmpi_run: shopt = %s' % shopt,
> > +				logfile, 0, '')
> > +			printlog('o2tf.openmpi_run: cmd = %s' % cmd,
> > +				logfile, 0, '')
> > +		return os.spawnv(os.P_NOWAIT,
> > +			'/bin/bash', ['bash', '-xc',
> > +			config.MPIRUN + ' -mca btl tcp,self -mca btl_tcp_if_include eth0  %s %s --host %s %s' % \
> > +			( shopt, nprocopt, nodes, cmd)])
> > +	except os.error:
> > +		pass
> > +
> > +#
> > +# openmpi_run is used by :
> > +#   - 
> > +
> >  #
> >  # Calls mpirun (Original from the LAM/MPI Package)
> >  # mpi_run is used by :
> >   



