OCFS2 TEST OPENMPI HOWTO
This page describes how OpenMPI is used in ocfs2-test.
Introduction
The OpenMPI Project is an open source MPI-2 implementation developed and maintained by a consortium of academic, research, and industry partners. It supports parallel computing and concurrent execution across multiple platforms. In ocfs2-test, multi-node test cases that run in parallel are common and important: they give us a convenient way to run stress and concurrency tests against a filesystem, and especially to test locking in a cluster filesystem.
In short, MPI-2 (Message Passing Interface, version 2) is becoming an industry standard that defines a set of MPI APIs for parallel computing, and OpenMPI is among the best library implementations of it. That is also why we moved our tests from LAM/MPI (lammpi) to OpenMPI.
Steps
OpenMPI is designed to be more convenient and efficient to use than LAM/MPI. The following steps show how we use OpenMPI in ocfs2-test:
Get the OpenMPI source tarball or RPM package (version 1.2.5 or later) from its official website.
- In general, all you need to do to build from source is expand the tarball, run the provided configure script, and then run "make all install". For example:
shell$ gunzip -c openmpi-1.3.3.tar.gz | tar xf -
shell$ cd openmpi-1.3.3
shell$ ./configure --prefix=/usr/local
<...lots of output...>
shell$ make all install
- With the prefix above, OpenMPI is installed under /usr/local/lib and /usr/local/bin. Add these directories to the LD_LIBRARY_PATH and PATH environment variables, for example by adding the following to your ~/.bashrc:
export PATH=$PATH:/usr/local/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
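After editing ~/.bashrc and opening a new shell, a small check along the following lines can confirm that the OpenMPI directories are really on the search paths. This is only a sketch: the helper name path_contains and the OMPI_PREFIX variable are made up for illustration, and /usr/local is assumed to be the prefix passed to configure above.

```shell
# path_contains LIST DIR: succeed if DIR is one of the colon-separated
# entries of LIST (hypothetical helper, not part of OpenMPI).
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Assumed install prefix, matching the configure example above.
OMPI_PREFIX=${OMPI_PREFIX:-/usr/local}

path_contains "$PATH" "$OMPI_PREFIX/bin" \
    || echo "warning: $OMPI_PREFIX/bin is not in PATH"
path_contains "${LD_LIBRARY_PATH:-}" "$OMPI_PREFIX/lib" \
    || echo "warning: $OMPI_PREFIX/lib is not in LD_LIBRARY_PATH"

# Show which mpirun (if any) the shell would pick up.
command -v mpirun || echo "warning: mpirun not found in PATH"
```

The same check is worth repeating on every node, since mpirun starts remote processes in a non-interactive shell that may read a different startup file.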
- OpenMPI uses rsh or ssh to start processes on remote nodes, so passwordless rsh or ssh connections must be set up between the nodes. For brevity, that setup is not described in detail in this document.
- You also need to disable the 'requiretty' option in the sudo configuration. Run 'visudo' and comment out the line below; otherwise a remote sudo execution will fail because no tty is available:
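Once passwordless ssh is in place, it can be verified before involving mpirun at all. The sketch below assumes ssh (not rsh) is the agent in use; the function name check_passwordless_ssh is hypothetical. It relies on the standard ssh options BatchMode and ConnectTimeout so that a node still asking for a password fails quickly instead of prompting.

```shell
# check_passwordless_ssh HOST...: run a no-op command on each host without
# allowing a password prompt, and report which hosts fail.
# (Hypothetical helper; the caller supplies the node names.)
check_passwordless_ssh() {
    failed=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of asking for a password.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true </dev/null; then
            echo "ok: $host"
        else
            echo "FAILED: $host"
            failed=1
        fi
    done
    return "$failed"
}

# Example, using the node names from the test runs later on this page:
# check_passwordless_ssh ocfs2-test4 ocfs2-test5
```

The function returns non-zero if any node fails, so it can be used as a guard in a wrapper script.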
#Defaults requiretty
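To confirm the change took effect, the sudoers file can be scanned for an active requiretty line. The helper below is a minimal sketch with a made-up name (has_requiretty); it only recognises the plain "Defaults requiretty" form shown above, not per-user or per-host Defaults entries.

```shell
# has_requiretty FILE: succeed if FILE contains an active (uncommented)
# "Defaults requiretty" line. Hypothetical helper for illustration only.
has_requiretty() {
    grep -Eq '^[[:space:]]*Defaults[[:space:]]+requiretty([[:space:]]|$)' "$1"
}

# Example (reading /etc/sudoers normally requires root):
# if has_requiretty /etc/sudoers; then
#     echo "requiretty is still enabled"
# fi
```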
- OpenMPI has many run-time parameters, called MCA parameters. They can be given on the mpirun command line at run time, or set in a configuration file so that they take effect permanently. For ocfs2-test we mostly use the default settings, but several important MCA parameters deserve attention:
1. pls_rsh_agent
2. btl
3. btl_tcp_if_include
pls_rsh_agent specifies which remote connection method is used for cross-node execution (rsh or ssh), btl specifies which transport is used for communication (e.g. tcp or infiniband), and btl_tcp_if_include specifies which network interface to use (eth0, eth1, etc.). The following is an example of how they are used in ocfs2-test:
NOTE: pls_rsh_agent is deprecated in version 1.3.0 and later; use plm_rsh_agent instead. In addition, btl_tcp_if_include no longer needs to be set to select a specific ethernet interface in 1.3.0 and later.
[shell #]mpirun -mca pls_rsh_agent rsh:ssh -mca btl tcp,self -mca btl_tcp_if_include eth0 -np 4 --hostfile my_hosts ./xattr_multinode_test -i 2 -x 100 -n user -t normal -l 255 -s 65536 /storage/work/
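For the permanent variant mentioned above, OpenMPI reads per-user defaults from $HOME/.openmpi/mca-params.conf, one "name = value" pair per line. The sketch below writes such a file with the btl settings from the command above; the function name write_mca_params is hypothetical, and the rsh agent parameter is left out on purpose because its name depends on the OpenMPI version (see the NOTE).

```shell
# write_mca_params FILE: write a small MCA parameter file to FILE,
# creating the parent directory if needed. Hypothetical helper.
write_mca_params() {
    mkdir -p "$(dirname "$1")"
    cat > "$1" <<'EOF'
# Per-user OpenMPI MCA defaults (one "name = value" pair per line).
btl = tcp,self
btl_tcp_if_include = eth0
EOF
}

# Example: install it as the per-user defaults file.
# write_mca_params "$HOME/.openmpi/mca-params.conf"

# Alternative for a single shell session: OMPI_MCA_<name> variables.
# export OMPI_MCA_btl=tcp,self
```

Values given with -mca on the mpirun command line still override whatever the file or the environment sets.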
After all the steps above are done, we are ready to test whether OpenMPI works. An example test program, mpmd.c, can be used for this.
1. Test by running an ordinary shell command:
[shell #]mpirun -mca pls_rsh_agent ssh:rsh -mca btl tcp,self -mca btl_tcp_if_include eth0 -np 4 --host ocfs2-test4,ocfs2-test5 hostname
The expected output is (the order of the lines may vary):
ocfs2-test4.cn.oracle.com
ocfs2-test4.cn.oracle.com
ocfs2-test5.cn.oracle.com
ocfs2-test5.cn.oracle.com
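Since the lines can arrive in any order, it is easier to compare per-host counts than raw output. The following is a minimal sketch of that idea; the helper name count_hosts is made up for illustration.

```shell
# count_hosts: read one hostname per line on stdin and print
# "<hostname> <count>" for each distinct host, sorted by hostname.
count_hosts() {
    sort | uniq -c | awk '{ print $2, $1 }'
}

# Example: pipe the output of the mpirun hostname test above into it.
# printf '%s\n' ocfs2-test4.cn.oracle.com ocfs2-test5.cn.oracle.com \
#     ocfs2-test4.cn.oracle.com ocfs2-test5.cn.oracle.com | count_hosts
```

For the four-process run above, each of the two nodes should be reported with a count of 2.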
2. Test by running an MPI binary across nodes. Get the source of the test program, then compile it:
[shell #]mpicc -o mpmd mpmd.c
Run it like:
[shell #]mpirun -mca pls_rsh_agent ssh:rsh -mca btl tcp,self -mca btl_tcp_if_include eth0 -np 4 --host ocfs2-test4,ocfs2-test5 ./mpmd
The expected output looks like this (the order of the lines may vary between runs):
I'm Rank 0 out of 4 Ranks on Node[ocfs2-test4.cn.oracle.com].
I'm Rank 2 out of 4 Ranks on Node[ocfs2-test4.cn.oracle.com].
I'm Rank 1 out of 4 Ranks on Node[ocfs2-test5.cn.oracle.com].
I'm Rank 3 out of 4 Ranks on Node[ocfs2-test5.cn.oracle.com].
Rank 0 Sent msg(ocfs2-test4.cn.oracle.com) to Rank 1.
Rank 0 Sent msg(ocfs2-test4.cn.oracle.com) to Rank 2.
Rank 2 Received msg(ocfs2-test4.cn.oracle.com) from Rank 0.
Rank 1 Received msg(ocfs2-test4.cn.oracle.com) from Rank 0.
Rank 0 Sent msg(ocfs2-test4.cn.oracle.com) to Rank 3.
Now Rank 0 leave...
Now Rank 2 leave...
Rank 3 Received msg(ocfs2-test4.cn.oracle.com) from Rank 0.
Now Rank 1 leave...
Now Rank 3 leave...
If the tests above pass, the setup is also ready for ocfs2-test's multi-node tests.
- Please note that the source of an MPI program is platform-independent, while its binary is architecture-dependent. You should therefore compile your MPI binaries/tests separately on each node, and make sure the binaries are placed at the same absolute path on every node. For example, to run concurrent tests on node1~node4, compile the tests on each node, put them in the same location such as /work/testplace/xxx-tests/, and keep all of them at the same version.
- One more note: heterogeneous platform support is a configure option that is disabled by default. When it is enabled, tests can run across multiple architectures. It can be enabled at build time as follows, and we recommend enabling it in a normal build as well.
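The "same absolute path on every node" requirement can be checked up front with a loop over the nodes. This is a sketch under a few assumptions: passwordless ssh is already working, the function name check_binary_on_nodes is hypothetical, and the path contains no spaces (ssh passes the command to the remote shell as a single string).

```shell
# check_binary_on_nodes PATH HOST...: verify that PATH exists and is
# executable on every listed host. Hypothetical helper.
check_binary_on_nodes() {
    bin=$1
    shift
    missing=0
    for host in "$@"; do
        if ssh -o BatchMode=yes "$host" test -x "$bin" </dev/null; then
            echo "ok: $host:$bin"
        else
            echo "MISSING: $host:$bin"
            missing=1
        fi
    done
    return "$missing"
}

# Example, using the placeholder location and node names from the note above:
# check_binary_on_nodes /work/testplace/xxx-tests/mpmd node1 node2 node3 node4
```

This only checks presence and the executable bit; it does not compare versions, which still has to be done by building every node from the same source.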
[shell #]./configure --enable-heterogeneous
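Whether an installed build has this option enabled can be read from the ompi_info tool that ships with OpenMPI. The sketch below assumes ompi_info prints a line of the form "Heterogeneous support: yes" or "Heterogeneous support: no"; the exact wording may differ between versions, and the function name heterogeneous_enabled is made up for illustration.

```shell
# heterogeneous_enabled: read ompi_info output on stdin and succeed if it
# reports heterogeneous support as enabled.
# Assumes a "Heterogeneous support: yes|no" line in the output.
heterogeneous_enabled() {
    grep -Eiq 'heterogeneous support:[[:space:]]*yes'
}

# Example:
# if ompi_info | heterogeneous_enabled; then
#     echo "this OpenMPI build supports heterogeneous runs"
# else
#     echo "rebuild with ./configure --enable-heterogeneous"
# fi
```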