
OCFS2/OpenMPI-HOWTO

OCFS2 TEST OPENMPI HOWTO

This page documents how OpenMPI is used in ocfs2-test.

Introduction

The OpenMPI Project is an open source MPI-2 implementation developed and maintained by a consortium of academic, research, and industry partners. It allows parallel computing and concurrent execution across multiple platforms. In ocfs2-test, parallel execution of multi-node testcases is common and important: it gives us a convenient way to run stress and concurrency tests against a filesystem, especially to exercise locking in a cluster filesystem.

In short, MPI-2 (Message Passing Interface-2) is becoming an industry standard that defines a set of MPI APIs for parallel computing, and OpenMPI is arguably among the best library implementations of its kind. That is also why we moved our tests from LAM/MPI to OpenMPI.

Steps

OpenMPI is designed to be more convenient and efficient to use than LAM/MPI. The following steps show how we use OpenMPI in ocfs2-test:

shell$ gunzip -c openmpi-1.3.3.tar.gz | tar xf -
shell$ cd openmpi-1.3.3
shell$ ./configure --prefix=/usr/local
<...lots of output...>
shell$ make all install

export PATH=$PATH:/usr/local/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
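
The environment setup above can be made safe to run more than once, for example from a login script that is sourced repeatedly. The following is a minimal sketch, assuming OpenMPI was installed with --prefix=/usr/local so that its executables are in $PREFIX/bin and its libraries in $PREFIX/lib. The names path_append and PREFIX are hypothetical and are not part of OpenMPI.

```shell
# path_append is a hypothetical helper: it prints the colon-separated list $1
# with directory $2 appended, unless $2 is already an element of the list.
path_append() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;           # already present, unchanged
        *)
            if [ -n "$1" ]; then
                printf '%s\n' "$1:$2"             # append to a non-empty list
            else
                printf '%s\n' "$2"                # list was empty
            fi
            ;;
    esac
}

PREFIX=/usr/local   # assumption: same value as given to ./configure --prefix
PATH=$(path_append "$PATH" "$PREFIX/bin")
LD_LIBRARY_PATH=$(path_append "${LD_LIBRARY_PATH:-}" "$PREFIX/lib")
export PATH LD_LIBRARY_PATH
```

Running the snippet a second time leaves both variables unchanged, which keeps them from growing on every login.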

#Defaults    requiretty

1. pls_rsh_agent
2. btl
3. btl_tcp_if_include

pls_rsh_agent specifies which remote connection method (rsh or ssh) is used for cross-node execution; btl specifies which network transport is used for communication (e.g. tcp or infiniband); btl_tcp_if_include specifies which network interface to use (eth0, eth1, etc.). The mpirun command after the note below shows how they are used in ocfs2-test in practice.

NOTE: pls_rsh_agent is deprecated in version 1.3.0 and later; use plm_rsh_agent instead. In addition, btl_tcp_if_include no longer needs to be set to select a specific Ethernet interface in 1.3.0 and later.
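
A wrapper script that has to work with both old and new OpenMPI versions could pick the parameter name from the version number, following the rename described in the note. This is only a sketch: mca_rsh_param is a hypothetical helper, and it assumes the version string is passed in by hand and consists of plain numeric components such as 1.2.8 or 1.3.3.

```shell
# mca_rsh_param is a hypothetical helper: given an OpenMPI version string,
# it prints the name of the MCA parameter that selects rsh or ssh.
# Assumption: the version has purely numeric major and minor components.
mca_rsh_param() {
    _major=${1%%.*}          # text before the first dot
    _rest=${1#*.}            # text after the first dot
    _minor=${_rest%%.*}      # second component
    if [ "$_major" -gt 1 ] || { [ "$_major" -eq 1 ] && [ "$_minor" -ge 3 ]; }; then
        echo plm_rsh_agent   # 1.3.0 and later
    else
        echo pls_rsh_agent   # before 1.3.0
    fi
}

# Example use in a wrapper (not executed here):
#   mpirun -mca "$(mca_rsh_param 1.3.3)" ssh:rsh -mca btl tcp,self ...
```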

[shell #]mpirun -mca pls_rsh_agent rsh:ssh -mca btl tcp,self -mca btl_tcp_if_include eth0 -np 4 --hostfile my_hosts ./xattr_multinode_test -i 2 -x 100 -n user -t normal -l 255 -s 65536   /storage/work/
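
The command above reads its node list from a hostfile named my_hosts, whose contents are not shown on this page. The following sketch writes one possible version of that file. The node names are borrowed from the examples further down this page, and the slots=2 values (two processes per node, four in total to match -np 4) are an assumption.

```shell
# Write a hostfile with one line per node.
# Assumptions: node names taken from this page's later examples; slots=2 so
# that the two nodes together provide the four slots requested by -np 4.
HOSTFILE=${HOSTFILE:-my_hosts}
cat > "$HOSTFILE" <<'EOF'
ocfs2-test4 slots=2
ocfs2-test5 slots=2
EOF
```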

1. Test by running a simple shell command:
[shell #]mpirun -mca pls_rsh_agent ssh:rsh -mca btl tcp,self -mca btl_tcp_if_include eth0 -np 4 --host ocfs2-test4,ocfs2-test5 hostname
Expected output should be:
ocfs2-test4.cn.oracle.com
ocfs2-test4.cn.oracle.com
ocfs2-test5.cn.oracle.com
ocfs2-test5.cn.oracle.com
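
When this check is run from a script rather than by eye, the output can be verified automatically. The sketch below assumes, as in the expected output above, that every node should print its hostname the same number of times. check_host_counts is a hypothetical helper, not an OpenMPI tool.

```shell
# check_host_counts is a hypothetical helper: it reads hostnames on stdin and
# succeeds only if there is at least one line and every distinct hostname
# appears exactly $1 times.
check_host_counts() {
    sort | uniq -c | awk -v n="$1" '
        $1 != n { bad = 1 }                      # a host with the wrong count
        END { if (bad || NR == 0) exit 1 }       # fail on mismatch or no output
    '
}

# Example use (not executed here):
#   mpirun ... -np 4 --host ocfs2-test4,ocfs2-test5 hostname | check_host_counts 2
```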

2. Test by running an MPI binary across nodes. Get the MPI source of the test binary, then compile it:
[shell #]mpicc -o mpmd mpmd.c
Run it like:
[shell #]mpirun -mca pls_rsh_agent ssh:rsh -mca btl tcp,self -mca btl_tcp_if_include eth0 -np 4 --host ocfs2-test4,ocfs2-test5 ./mpmd
Expected output should be like:
I'm Rank 0 out of 4 Ranks on Node[ocfs2-test4.cn.oracle.com].
I'm Rank 2 out of 4 Ranks on Node[ocfs2-test4.cn.oracle.com].
I'm Rank 1 out of 4 Ranks on Node[ocfs2-test5.cn.oracle.com].
I'm Rank 3 out of 4 Ranks on Node[ocfs2-test5.cn.oracle.com].
Rank 0 Sent msg(ocfs2-test4.cn.oracle.com) to Rank 1.
Rank 0 Sent msg(ocfs2-test4.cn.oracle.com) to Rank 2.
Rank 2 Received msg(ocfs2-test4.cn.oracle.com) from Rank 0.
Rank 1 Received msg(ocfs2-test4.cn.oracle.com) from Rank 0.
Rank 0 Sent msg(ocfs2-test4.cn.oracle.com) to Rank 3.
Now Rank 0 leave...
Now Rank 2 leave...
Rank 3 Received msg(ocfs2-test4.cn.oracle.com) from Rank 0.
Now Rank 1 leave...
Now Rank 3 leave...

If the above tests pass, the setup should also be ready for ocfs2-test's multi-node tests.

[shell #]./configure --enable-heterogeneous

Links

Openmpi Official Site

Linux Test Project

Ocfs2 Test Cook Book


2011-12-23 01:01