OpenMPI is available to install from the AppStream repo in a CentOS 9 (RHEL) environment, and the EPEL repo is also best enabled for its associated dependency packages. The steps below assume a cluster of identically installed compute nodes.
Install OpenMPI with:
dnf -y install openmpi
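If EPEL is not already enabled, it can be added first; on CentOS Stream 9 the release package is available directly (on RHEL 9 the epel-release RPM is instead installed from the Fedora project):
dnf -y install epel-release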
The RHEL AppStream package installs OpenMPI and its shell module file onto the system. OpenMPI is a runtime environment rather than a daemon, so there is no server to run; the OpenMPI command line toolkit uses SSH to connect to the nodes in the OpenMPI cluster of hosts.
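The mpirun and related binaries are installed under /usr/lib64/openmpi/bin rather than on the default PATH; one way to add them to the PATH is to load the shipped module file, assuming the typical module name on an x86_64 install:
module load mpi/openmpi-x86_64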
Ensure that the users who will be running OpenMPI commands have SSH key-based authentication configured on all hosts in the cluster, i.e. populate the ~/.ssh/authorized_keys file on each relevant host with the user's SSH public key.
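As a sketch, using the three node hostnames from later in this article, the key can be generated once and copied out to each node with:
ssh-keygen -t ed25519
for h in node1 node2 node3; do ssh-copy-id "$h"; done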
When OpenMPI is invoked on a host in the cluster, the OpenMPI processes running on each node communicate over dynamically chosen network ports. In a RHEL FirewallD environment, OpenMPI therefore needs to be configured to use a known range of ports, which can then be opened via firewall-cmd commands.
In this example we'll configure OpenMPI to use ports 50000 to 51999, and these ports are opened using the following FirewallD command on each node in the cluster:
firewall-cmd --permanent --zone=public --add-rich-rule='rule family="ipv4" source address="10.0.0.0/24" port protocol="tcp" port="50000-51999" accept' && firewall-cmd --reload
Note: this example assumes the nodes are on the 10.0.0.0/24 network.
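The rule can be confirmed on each node with:
firewall-cmd --zone=public --list-rich-rules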
These port ranges can then be used on the OpenMPI “mpirun” command line with the following parameters, which confine the MPI point-to-point (BTL TCP) traffic to ports 50001-50030 and the out-of-band (OOB) traffic to ports 51001-51031, both within the range opened above:
--mca btl_tcp_port_min_v4 50001 --mca btl_tcp_port_range_v4 30 --mca oob_tcp_dynamic_ipv4_ports 51001-51031
The shell user who will run the OpenMPI commands needs an MPI hosts file listing the hostname and settings of each node in the OpenMPI cluster; an example file could be named “mpi_hosts” in the user’s home directory. Each node’s hostname and IP address must be resolvable via DNS or listed in every host’s /etc/hosts file (see the sample /etc/hosts entries after the list below). A sample hosts file for a 3 node cluster is:
node1 slots=2
node2 slots=2
node3 slots=2
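If DNS is not in use, entries like the following would go into /etc/hosts on every node (the IP addresses here are placeholders within the 10.0.0.0/24 network assumed above):
10.0.0.11 node1
10.0.0.12 node2
10.0.0.13 node3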
An example complete mpirun command line to run the “hostname” command 6 times across a 3 node cluster is:
/usr/lib64/openmpi/bin/mpirun --mca btl_tcp_port_min_v4 50001 --mca btl_tcp_port_range_v4 30 --mca oob_tcp_dynamic_ipv4_ports 51001-51031 --hostfile ~/mpi_hosts --path /usr/lib64/openmpi/bin -np 6 hostname
Note: the --path option is required so the OpenMPI binaries can be found on each node.
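One way to confirm the OpenMPI binaries are present on the remote nodes is to check the version over SSH, using the node names from the hosts file above:
for h in node1 node2 node3; do ssh "$h" /usr/lib64/openmpi/bin/mpirun --version; done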
The above command results in the hostname command being run 6 times across the 3 nodes, with sample output looking like:
node1
node3
node2
node1
node2
node3
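To confirm how the processes were spread across the nodes, the same command can be piped through sort and uniq; with the hosts file above and -np 6 the counts should come out as 2 per node:
/usr/lib64/openmpi/bin/mpirun --mca btl_tcp_port_min_v4 50001 --mca btl_tcp_port_range_v4 30 --mca oob_tcp_dynamic_ipv4_ports 51001-51031 --hostfile ~/mpi_hosts --path /usr/lib64/openmpi/bin -np 6 hostname | sort | uniq -c
      2 node1
      2 node2
      2 node3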