BUILDING YOUR OWN CENTOS COMPUTER CLUSTER

This document contains the instructions to build a cluster of PCs running the CentOS Linux system with a shared network file system (NFS). The modus operandi of the cluster mimics that of the Rocks Cluster (see http://www.rocksclusters.org/). The cluster contains a frontend node plus a number of compute nodes. All nodes are connected via a 100/1000 Mb/s switch. In principle this document can also be applied to other Linux OSes, with or without modification. The 'cluster' is not one that is defined in a strict sense (like the Rocks Clusters), as ours is much less complex.

If you want to use this manual to set up your own CentOS cluster in your institution, you need to modify only the following information, which is specific to your network environment. The default values used in this document are given below. You will need the following information to build the cluster: HOSTNAME, IPADDR, DOMAIN, GATEWAY, DNS1 and DNS2. Ask your network admin for this information. For my case, these are

HOSTNAME=chakra
IPADDR=10.205.18.133
DOMAIN=usm.my
GATEWAY=10.205.19.254
DNS1=10.202.1.27
DNS2=10.202.1.6

Hardware requirements:
1. One frontend PC + at least one compute node PC.
2. Two hard disks on the frontend (one of small capacity and the other large, e.g., 500 GiB + 1 TiB).
3. LAN cables and a 100/1000 Mb/s switch.
4. The frontend node has to be equipped with two LAN cards: a built-in one and an externally plugged-in one.
5. All compute nodes must be equipped with a minimum of one LAN card (either built-in or external).

Naming convention:
We shall name these PCs, respectively: $HOSTNAME (frontend node), node1, node2, node3, .... For the frontend node, we shall denote the built-in LAN card eth0 and the external LAN card eth1. We shall denote the LAN card in each compute node that is connected to the switch eth0.

Important IPs to take note of:
The IP for the frontend at eth0 is by default set to 192.168.1.10. The IPs for node1, node2, node3, etc. at their respective eth0 are by default set to (in sequential order): 192.168.1.21, 192.168.1.22, 192.168.1.23, ...

To build a CentOS cluster, follow the procedure below step by step, in sequential order.

(0) Use Rufus or other software to burn the latest CentOS ISO onto a bootable thumb drive. This manuscript is prepared based on CentOS-7-x86_64-DVD-1804, but in principle it should also work for other versions of CentOS (version 6 or above).

(1) Connect the eth0 ports of all the PCs in the cluster to a 100/1000 Mb/s switch. eth1 of the frontend is to be connected to the internet network (e.g., in USM, it is the usm.my network).

(2) Install CentOS using the bootable thumb drive onto the frontend's smaller hard disk. Leave the larger hard disk as it is for the moment. Install CentOS on all other compute nodes. All nodes (including the frontend) should use the installation option 'Development and Creative Workstation'. Choose to add all software packages offered at the installation prompt. Use a common root password for the frontend as well as all compute nodes.

(2.5) Check the labels of the hard disks in the frontend using fdisk -l. Say the larger hard disk is labelled /dev/sdb. Mount this hard disk on the folder /export in the root directory of the frontend by typing the following commands (as su) in the terminal:

mkdir /export
chmod -R 777 /export
mount -t xfs /dev/sdb /export
cp /etc/fstab /etc/fstab.orig

Mount this hard disk permanently by adding the line

/dev/sdb /export xfs rw 2 2

to /etc/fstab.
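For convenience, the commands of step (2.5) can be collected into a single script. The sketch below is only an illustration of this step, assuming the large disk is /dev/sdb; adapt it to whatever label fdisk -l reports on your machine.

  #!/bin/bash
  # Sketch of step (2.5): mount the large disk on /export and make it permanent.
  mkdir -p /export
  chmod -R 777 /export
  mount -t xfs /dev/sdb /export      # assumes /dev/sdb already carries an XFS filesystem
  cp /etc/fstab /etc/fstab.orig      # keep a backup before editing fstab
  echo '/dev/sdb /export xfs rw 2 2' >> /etc/fstab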
The permanent mounting of /dev/sdb on /export will take place after a reboot. In case the hard disk is not formatted properly, it may refuse to be mounted. The hard disk can be forcefully formatted into XFS format using

mkfs.xfs -f /dev/sdb

The tasks of formatting and mounting the external hard disk onto the CentOS root directory can also be executed by running the script

sh mount_export.txt

The script mount_export.txt is downloadable from http://comsics.usm.my/tlyoon/configrepo/howto/customise_centos/Centos_cluster/

(3) Install CentOS on all nodes using the same options as those used in setting up the frontend.

(4) Download the following script: http://comsics.usm.my/tlyoon/configrepo/howto/customise_centos/Centos_cluster/centos_frontend-1of2.txt. Customise the following variables in the centos_frontend-1of2.txt script to suit your case:

HOSTNAME=
IPADDR=
DOMAIN=
GATEWAY=
DNS1=
DNS2=

Then execute sh centos_frontend-1of2.txt in the frontend as su. centos_frontend-1of2.txt will do the necessary preparatory configuration for the frontend. This includes (i) setting SELINUX to permissive in /etc/sysconfig/selinux, (ii) activating sshd.service so that ssh can work, and (iii) setting the IP of the frontend by making changes to its network configuration, such as assigning the IP address, setting up the DNS servers, etc. (The instructions are based on http://www.techkaki.com/2011/08/how-to-configure-static-ip-address-on-centos-6/.) The frontend will reboot at the end of centos_frontend-1of2.txt.

(5) After rebooting the frontend, log in as root. Check the frontend to see if the following functions were configured successfully by centos_frontend-1of2.txt.

(a) Check the mode of SELINUX using the two commands:

getenforce
sestatus

SELINUX should be set to permissive. This can be confirmed manually by checking that the file /etc/sysconfig/selinux has a line stating 'SELINUX=permissive'.

(b) Check manually that ssh is enabled.

(c) Check manually that eth0 and eth1 are connected to the switch and the internet, respectively. This can be done by issuing the following commands one by one:

(i) ping google.com
(ii) ssh into the frontend from a third-party terminal
(iii) ifconfig
(iv) cat /etc/sysconfig/network-scripts/ifcfg-$nc0
(v) cat /etc/sysconfig/network-scripts/ifcfg-$nc1

where

nc0=$(lshw -class network | grep -E 'Ethernet interface|logical name' | grep 'logical name' | awk '{print $3}' | awk '!/vir/' | tail -n1)
nc1=$(lshw -class network | grep -E 'Ethernet interface|logical name' | grep 'logical name' | awk '{print $3}' | awk '!/vir/' | awk 'NR==1{print}')

These two commands extract the logical device names of the physical (non-virtual) Ethernet interfaces reported by lshw: $nc1 is the first interface listed, while $nc0 is the last.

(6) Check that the HWADDR for both $nc0 and $nc1 are explicitly specified. Ensure that

(a) DOMAIN=local, DNS1=127.0.0.1 for $nc0
(b) DOMAIN=usm.my, DNS1=10.202.1.27, DNS2=10.202.1.6 for $nc1

For your case, the values of DOMAIN, DNS1 and DNS2 for $nc1 have to be the values of the variables you set in centos_frontend-1of2.txt. The required network configuration (as stated above) can also be configured manually, if needed, by tweaking the NetworkManager icon sitting at the upper right-hand side of the screen. An illustration of what such a configuration file may look like is given below.
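The following is only an illustrative sketch of an ifcfg file for $nc0 (the card facing the switch), using the default values of this document. The actual file written by centos_frontend-1of2.txt may differ; the device name and HWADDR below are placeholders.

  # /etc/sysconfig/network-scripts/ifcfg-$nc0 (illustrative sketch only)
  TYPE=Ethernet
  BOOTPROTO=static
  DEVICE=eth0                 # placeholder; use your actual $nc0
  NAME=eth0
  ONBOOT=yes
  HWADDR=aa:bb:cc:dd:ee:ff    # placeholder; use the card's real MAC address
  IPADDR=192.168.1.10
  NETMASK=255.255.255.0
  DOMAIN=local
  DNS1=127.0.0.1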
In case the NetworkManager icon fails to appear, type

systemctl restart NetworkManager

or

service NetworkManager restart

(try out which one works).

(7) Alternatively, the required network configuration (as stated in item (6)) can be configured manually, if needed, by editing the files

/etc/sysconfig/network-scripts/ifcfg-$nc0
/etc/sysconfig/network-scripts/ifcfg-$nc1

(8) In any case, both network cards must be active and connected before proceeding to the next step. Reboot the frontend if you have done any manual configuration. Often both cards will be connected after rebooting. Be reminded that the network card $nc0 is to be connected to the switch, while $nc1 is to be connected to the internet.

(9) Download the following script: http://comsics.usm.my/tlyoon/configrepo/howto/customise_centos/Centos_cluster/centos_frontend-2of2.txt. Execute sh centos_frontend-2of2.txt in the frontend as root. The frontend will reboot at the end of the script. All the steps described up to this stage must be completed before initiating the following steps.

(10) Download the following script: http://comsics.usm.my/tlyoon/configrepo/howto/customise_centos/Centos_cluster/centos_node-1of2.txt. Save centos_node-1of2.txt onto a pendrive and plug it into each node manually, as at the present stage the nodes may still be unable to go online.

(10.5) This is a manual process. For each individual node, edit the value of $ipaddlastnumber in IPADDR=192.168.1.$ipaddlastnumber in the centos_node-1of2.txt script. Assign a unique value of $ipaddlastnumber to each node, beginning with the integer 21. For example, when installing the first node, set ipaddlastnumber=21; when installing the second node, set ipaddlastnumber=22.

(10.6) Make sure that a network card in the compute node is connected to the switch. Issue the following commands (as su):

ifconfig
service NetworkManager restart
service network restart

or use the NetworkManager (by clicking the icon in the upper right corner of the CentOS GUI) to make sure that the connection to the switch has been established. This may require some manual tweaking. If this local network connection (between the node and the frontend via the switch) is established, the node should see a positive response when pinging the frontend from the node's terminal:

ping 192.168.1.10

192.168.1.10 is the default local IP of the frontend.

(10.7) Configure a compute node by executing the script sh centos_node-1of2.txt in each node separately, with the corresponding value of $ipaddlastnumber set accordingly. Make sure that each compute node is connected to the switch via the existing LAN port of the compute node, as mentioned in the previous step.

(10.8) Download the following script: http://comsics.usm.my/tlyoon/configrepo/howto/customise_centos/Centos_cluster/centos_node-2of2.txt. Save centos_node-2of2.txt onto a pendrive and plug it into each node manually in case the node is still unable to go online.

(11) After rebooting a node, log in again as root to execute the file centos_node-2of2.txt. As a note, the script centos_node-2of2.txt, which customises the node, is not mandatory.

(11.5) Repeat steps (10.5) - (11) for all other nodes.

(12) To add a new user to the cluster, execute sh useradd_recursive.txt from the frontend as su. The script is downloadable from http://comsics.usm.my/tlyoon/configrepo/howto/customise_centos/Centos_cluster/useradd_recursive.txt. A rough sketch of what such a script may look like is given below.
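The sketch below only illustrates the general idea of adding a user recursively across the cluster; the actual useradd_recursive.txt may differ. It assumes the user's home directory physically resides on the frontend's large disk and is seen as /share/home/$USER on the nodes (see step (14)), and that the node IPs follow the defaults of this document.

  #!/bin/bash
  # Illustrative sketch of a recursive user-add (not the actual script).
  read -p "New username: " NEWUSER
  read -s -p "Password: " NEWPW; echo
  # Create the user on the frontend.
  useradd -d /share/home/$NEWUSER $NEWUSER
  echo "$NEWPW" | passwd --stdin $NEWUSER
  # Create the same user (same UID, so NFS file ownership matches) on every
  # node; extend the IP list to match the number of nodes in your cluster.
  NEWUID=$(id -u $NEWUSER)
  for ip in 192.168.1.21 192.168.1.22 192.168.1.23; do
      ssh root@$ip "useradd -u $NEWUID -d /share/home/$NEWUSER $NEWUSER"
      ssh root@$ip "echo '$NEWPW' | passwd --stdin $NEWUSER"
  done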
(13) After adding a new user with useradd_recursive.txt, it must be followed immediately by manually executing the script sh pwlssh_recursive.txt as the new user from the frontend. This script renders all nodes and the frontend connected via passwordless ssh for the user who executes it. pwlssh_recursive.txt is available from http://comsics.usm.my/tlyoon/configrepo/howto/customise_centos/Centos_cluster/pwlssh_recursive.txt. A sketch of how passwordless ssh is typically set up is given in the appendix at the end of this document.

(14) This finishes the construction of the cluster. Users added in the frontend via useradd_recursive.txt and pwlssh_recursive.txt issued in the frontend can ssh without a password among the nodes and the frontend. A user's home directory is /share/home/$USER on all nodes. The home directory is saved on the frontend's hard disk, not on the local node's. This is a feature of NFS.

Yoon Tiem Leong
Universiti Sains Malaysia
11800 USM
Penang, Malaysia
Date: 7 Aug 2019
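Appendix:
The sketch below only illustrates the standard way passwordless ssh is set up on a cluster with NFS-shared home directories; the actual pwlssh_recursive.txt may work differently. It assumes the new user's home directory is the same NFS-shared directory on the frontend and on all nodes (step (14)), so authorising the key once makes it valid everywhere. Run as the new user on the frontend:

  #!/bin/bash
  # Generate an ssh key pair without a passphrase (skip if one already exists).
  mkdir -p ~/.ssh && chmod 700 ~/.ssh
  ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
  # Authorise our own public key; with an NFS-shared home directory this
  # single authorized_keys file is seen by the frontend and all nodes.
  cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
  chmod 600 ~/.ssh/authorized_keys
  # Accept each node's host key once so later ssh sessions are prompt-free;
  # extend the IP list to match the number of nodes in your cluster.
  for ip in 192.168.1.21 192.168.1.22 192.168.1.23; do
      ssh -o StrictHostKeyChecking=no $USER@$ip true
  done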