BUILD YOUR OWN CENTOS COMPUTER CLUSTER
======================================

This document contains the instructions to build a cluster of PCs running the CentOS Linux
system with a shared network file system (NFS). The modus operandi of the cluster mimics
that of the Rocks Cluster (see http://www.rocksclusters.org/). The cluster contains a
frontend node plus a number of compute nodes. All nodes are connected via a 100/1000 Mbps
switch. In principle this document can also be applied to other Linux distributions with or
without modification. The 'cluster' is not one that is defined in a strict sense (like the
Rocks Clusters), as ours is much less complex.

If you want to use this manual to set up your own CentOS cluster in your institution, you
need to modify only the following information, which is specific to your network
environment. The default values used in this document are listed below. You will need this
information to build the cluster, namely: DOMAIN, IPADDR, GATEWAY, DNS1 and DNS2 (DNS2 is
optional). Ask your network admin for this information. For my case, these are

HOSTNAME=chakra
IPADDR=10.205.18.133
DOMAIN=usm.my
GATEWAY=10.205.19.254
DNS1=10.202.1.27
DNS2=10.202.1.6     ### DNS2 is optional

======================
Hardware requirements:
======================
i).   One frontend PC + at least one compute node PC.
ii).  Two hard disks on the frontend (one of small capacity and the other large, e.g.,
      500 GiB + 1 TiB).
iii). LAN cables, a 100/1000 Mbps switch.
iv).  The frontend node has to be equipped with two network cards, one built-in and one
      externally plugged-in.
v).   All compute nodes must be equipped with a minimum of one network card (either
      built-in or external).

Naming convention:
We shall denote the built-in network card as eth0 and the external network card as eth1.
eth0 is the network card that connects to the switch, while eth1 is the network card that
connects to the internet. Note that it is not essential to stick strictly to this
convention; it is adopted merely for the sake of naming consistency.

Important IPs to take note of:
The IP for the frontend at eth0 is by default set to: 192.168.1.10
The IPs for node1, node2, node3 etc. at their respective eth0 are by default set to
(in sequential order): 192.168.1.21, 192.168.1.22, 192.168.1.23, ...

==========================================================================================
To build a CentOS cluster, follow the procedure below step-by-step in sequential order.
==========================================================================================

(2) Use Rufus or other software to burn the latest CentOS ISO onto a bootable thumb drive.
This manuscript is prepared based on CentOS-7-x86_64-DVD-1804, but in principle it should
also work for other versions of CentOS (version 6 or above).

(4) Connect the eth0 ports of all the PCs in the cluster to a 100/1000 Mbps switch. eth1 of
the frontend is to be connected to the internet network (e.g., in USM, it is the usm.my
network).

(6) Install CentOS using the bootable thumb drive onto the frontend's smaller hard disk.
Leave the larger hard disk as it is for the moment. For the sake of uniformity, you should
use the installation option 'Development and Creative Workstation'. Choose to add all
software packages offered at the installation prompt. Use a common root password for the
frontend as well as all compute nodes.

(8) Check the labels of the hard disks in the frontend using fdisk -l. Say the larger hard
disk is labelled /dev/sdb.
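For instance, a quick way to confirm which device is the larger (data) hard disk before it
is mounted (the device name /dev/sdb is only the example used throughout this manual;
verify it on your own frontend):

# list all block devices with their sizes; the larger one is the data disk
lsblk -d -o NAME,SIZE,MODEL
# or, equivalently, inspect only the summary lines of fdisk
fdisk -l | grep "^Disk /dev/sd"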
Mount this hard disk in the folder /export in the root directory of the frontend CentOS by
typing the following command lines (as su) in the terminal:

mkdir /export
chmod -R 777 /export
mount -t xfs /dev/sdb /export
cp /etc/fstab /etc/fstab.orig

Mount this hard disk permanently by adding the line

/dev/sdb /export xfs rw 2 2

to /etc/fstab. The permanent mounting of /dev/sdb to /export will take place after a
reboot. In case the hard disk is not formatted properly, it may refuse to be mounted. The
hard disk can be forcefully formatted into the XFS format using

mkfs.xfs -f /dev/sdb

The task of formatting and mounting the external hard disk to the CentOS root directory is
well described in the executable instruction file mount_export.txt, downloadable from
http://anicca.usm.my/tlyoon/configrepo/howto/customise_centos/Centos_cluster/mount_export.txt

(9) After the frontend is up, use the NetworkManager GUI to establish a connection to the
internet via the eth1 network card. This is a manual procedure. NetworkManager can be found
in the following manner: Settings -> Network. Under the 'Wired' panel, click on the network
card item. In case you don't see the NetworkManager icon (which could happen when a fresh
copy of CentOS has just been set up), issue the command 'service NetworkManager restart' or
'systemctl restart NetworkManager' (try whichever works) in the terminal to launch it.

=============================================================
(10) Download the following script into /root/
=============================================================
http://anicca.usm.my/configrepo/howto/customise_centos/Centos_cluster/frontend-1of3.txt

Customize the following variables in the frontend-1of3.txt script to suit your case:
HOSTNAME=
IPADDR=
DOMAIN=
GATEWAY=
DNS1=
DNS2=

Then, in the frontend as su,

chmod +x frontend-1of3.txt
./frontend-1of3.txt

frontend-1of3.txt will do the necessary preparatory configuration for the frontend. This
includes (i) setting SELINUX to permissive in /etc/sysconfig/selinux, (ii) activating
sshd.service so that ssh can work, (iii) setting the IP of the frontend by making changes
to its network configuration, such as assigning the IP address, setting up the DNS server,
etc. (the instruction is based on
http://www.techkaki.com/2011/08/how-to-configure-static-ip-address-on-centos-6/),
(iv) creating /state/partition1, (v) etc. The frontend will reboot at the end of
frontend-1of3.txt.
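For reference, after frontend-1of3.txt has run, the configuration file of the switch-facing
network card should contain entries along the following lines. This is only an illustrative
sketch assembled from the values quoted in the introduction and in steps (12)-(14) below;
the actual file is generated by the script, and the device name (eth0 here) may differ on
your hardware:

# /etc/sysconfig/network-scripts/ifcfg-eth0 (illustrative sketch)
DEVICE=eth0
BOOTPROTO=static
ONBOOT=yes
IPADDR=192.168.1.10        # the frontend's IP on the internal switch
NETMASK=255.255.0.0        # same netmask as used for the compute nodes
DOMAIN=local
DNS1=127.0.0.1
HWADDR=xx:xx:xx:xx:xx:xx   # the MAC address of the card, explicitly specified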
(12) After rebooting the frontend, log in as root. Check the frontend to see whether the
following functions have been configured successfully by frontend-1of3.txt.

(a) Check the mode of SELINUX using the following commands:
getenforce
sestatus
SELINUX should be set to permissive. This can be independently confirmed by checking that
the file /etc/sysconfig/selinux has a line stating 'SELINUX=permissive'.

(b) Check manually whether eth0 and eth1 are connected to the switch and the internet
respectively. This can be done by issuing the following commands one-by-one:
(i)   ping google.com
(ii)  ssh into the frontend from a third-party terminal
(iii) ssh into a third-party terminal from the frontend
(iv)  ifconfig
(v)   cat /etc/sysconfig/network-scripts/ifcfg-$nc0
(vi)  cat /etc/sysconfig/network-scripts/ifcfg-$nc1
where
nc0=$(lshw -class network | grep -E 'Ethernet interface|logical name' | grep 'logical name' | awk '{print $3}' | awk '!/vir/' | tail -n1)
nc1=$(lshw -class network | grep -E 'Ethernet interface|logical name' | grep 'logical name' | awk '{print $3}' | awk '!/vir/' | awk 'NR==1{print}')

(14) Check that the HWADDR for both $nc0 and $nc1 is explicitly specified. Ensure that
(a) DOMAIN=local, DNS1=127.0.0.1 for the network card $nc0 (=eth0);
(b) DOMAIN=xxx, DNS1=xxx, DNS2=xxx for the network card $nc1 (=eth1),
where xxx are the values of DOMAIN, DNS1 and DNS2 for $nc1 that you have set in
frontend-1of3.txt. The network configuration that the frontend-1of3.txt script attempts to
establish may fail to work. In such a case, the network configuration can be configured
manually by tweaking NetworkManager.

(15) Alternatively, the required network configuration (as stated in item (14)) can be set
manually, if needed, by editing the files
/etc/sysconfig/network-scripts/ifcfg-$nc0
/etc/sysconfig/network-scripts/ifcfg-$nc1
However, this alternative may result in unexpected glitches. Avoid it if you don't know
exactly what you are doing.

(16) In any case, both network cards must be active and connected before proceeding to the
next step. Reboot the frontend if you have done any manual configuration. Often both cards
will be connected after rebooting. Be reminded that, in our convention, the network card
$nc0 (=eth0) is to be connected to the switch, while $nc1 (=eth1) is connected to the
internet.

===========================================================
(18) Download and execute the following script in /root/
===========================================================
wget http://anicca.usm.my/configrepo/howto/customise_centos/Centos_cluster/frontend-2of3.txt
chmod +x frontend-2of3.txt
./frontend-2of3.txt

CentOS will reboot at the end of the script. All the steps described up to this stage must
be completed before initiating the following steps.

===========================================================================
(20) Download and execute the following script in /share/apps/local/bin
===========================================================================
cd /share/apps/local/bin
wget http://anicca.usm.my/configrepo/howto/customise_centos/Centos_cluster/frontend-3of3.txt
chmod +x frontend-3of3.txt
./frontend-3of3.txt

This will download all required scripts into /share/apps/local/bin. In addition, it will
also execute the basic_packages.txt script. This finishes the part on setting up the
frontend of the cluster. Now proceed to the setting up of the nodes.
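Before moving on to the nodes, it is worth verifying that the shared directories are in
place on the frontend. A minimal check, assuming the frontend scripts have set up the NFS
export of the shared area (the exact export paths are defined by the scripts, not by this
manual):

# confirm the shared directories exist on the frontend
ls -ld /export /share/apps /share/apps/local/bin /state/partition1
# list what the frontend is currently exporting over NFS
exportfs -v
showmount -e localhost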
===========================
Install CentOS in each node
===========================

(22) Install CentOS on all the other compute nodes using the CentOS installation pen drive.
All nodes should use the same installation options as those used in setting up the
frontend.

(24) During the installation of a node, physically connect the network card of the node
(which is referred to as 'eth0' here; by default we choose the built-in/internal network
card as eth0) to the switch.

(24.2) Manually identify the latest value of $ipaddlastnumber. This is the number which
appears in the hostnames of the form compute-0-$ipaddlastnumber. Figure out the right value
by checking the latest value in the list of compute-0-$ipaddlastnumber entries in
/etc/hosts on the frontend. For example,

cat /etc/hosts
10.205.19.225   anicca.usm.my
127.0.0.1       localhost localhost.localdomain localhost4 localhost4.localdomain4
192.168.1.10    anicca.local anicca
192.168.1.22    compute-0-22.local compute-0-22 c22
192.168.1.23    compute-0-23.local compute-0-23 c23
192.168.1.24    compute-0-24.local compute-0-24 c24
192.168.1.25    compute-0-25.local compute-0-25 c25
192.168.1.26    compute-0-26.local compute-0-26 c26
::1             localhost localhost.localdomain localhost6 localhost6.localdomain6

In this example, the latest value of ipaddlastnumber from the existing list is
ipaddlastnumber=26. Hence, to add a new node to the list in /etc/hosts, it has to be
assigned the value ipaddlastnumber=27. This is a very important value used in the following
procedure.
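To make the naming convention concrete, adding the new node of the example above
(ipaddlastnumber=27) to /etc/hosts on the frontend amounts to appending a line such as the
following. Whether this entry is written by hand or by the coc scripts is determined by the
scripts themselves; the line is shown here purely for illustration:

# run as root on the frontend; 27 is the $ipaddlastnumber of the new node
echo "192.168.1.27    compute-0-27.local compute-0-27 c27" >> /etc/hosts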
(24.5) During the installation procedure of a node using the CentOS installation pen drive,
manually set the IP addresses using NetworkManager as follows:

For eth1: Click the IPv4 tab. Set the network card to use DHCP, and make sure to check
'Connect Automatically'.

For eth0: Click the IPv4 tab. Check the 'Manual' option. Use the following setting:
Address  192.168.1.$ipaddlastnumber
Netmask  255.255.0.0
Gateway  192.168.1.10
DNS      192.168.1.10

(25) Note that if a node is equipped with two network cards, one will be used to connect to
the switch (eth0) and the other (eth1) to the internet-accessing network. Having two
network cards in a node is optional but preferred. If only one network card is present, it
should be connected to the switch. This card would then be identified as eth0, and there
will be no eth1 card.

Execute the coc-nodes_manual script
-----------------------------------
(26) Note that at this point the node has not yet established an ssh connection to the
frontend or the internet. Save the script
http://anicca.usm.my/configrepo/howto/customise_centos/Centos_cluster/coc-nodes_manual
to a pen drive. Copy the file into the /root directory (or anywhere) of the freshly
installed node.

(26.5) Edit the file coc-nodes_manual to set the correct value of ipaddlastnumber as
mentioned in step (24.2).

(27) Run the script:
chmod +x coc-nodes_manual
./coc-nodes_manual

(27.5) Among others, the major functions of the script coc-nodes_manual include:

a) Setting the value of $ipaddlastnumber for the new node. Check that, after the
coc-nodes_manual script has executed, the node is connected to the frontend as well as the
internet. If no connection to either is established, it may be due to a wrong guess of the
identities of eth0 and eth1 by the script. If this is the case, manually swap the LAN
cables of the two network cards to see if the expected connection can be established. You
may have to issue the command
service network restart
after swapping the LAN cables of the network cards.

b) Mounting the node to the frontend as an NFS client, setting the hostname of the node to
'compute-0-$ipaddlastnumber', alias c-$ipaddlastnumber. It will create a shared directory
/share/ in the c-$ipaddlastnumber node, which is an NFS directory physically kept on the
hard disk of the frontend. It will set the hostname in /etc/hostname of the new node, as
well as generate the necessary content of the .bashrc file for root in the node (via
gen_bashrc-root). In addition, a local folder /state/partition1 will also be created in the
node.

c) Setting the following item in /etc/ssh/sshd_config and /etc/ssh/ssh_config to 'no',
namely,
GSSAPIAuthentication no

d) It will also execute the basic_packages.txt script.

(28) At the end of the execution of coc-nodes_manual, you will be prompted to establish
passwordless ssh for root from the current node -> the frontend (if ssh has been
established).

(29) After the execution of coc-nodes_manual has completed, reboot the node manually.

(37) After rebooting a node, you will be forced to create a local user. For the sake of
consistency, create a local user 'user'. But the naming of the user is immaterial.

(38) The IP setting can be checked by issuing ifconfig to confirm that the connection to
the switch has been established.

(39) In the case where the node has successfully established a connection to the frontend
via the eth0 card,
(a) the node should see a positive response when pinging the frontend from the terminal of
the node: ping 192.168.1.10;
(b) the node should connect to the internet if there are two network cards;
(c) you should see output similar to the following sample:

eno1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.205.19.29  netmask 255.255.254.0  broadcast 10.205.19.255
        inet6 fe80::c179:dee0:8b19:51b0  prefixlen 64  scopeid 0x20<link>
        ether 54:04:a6:28:c8:0c  txqueuelen 1000  (Ethernet)
        RX packets 34649849  bytes 3348276078 (3.1 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 411108  bytes 34124296 (32.5 MiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
        device interrupt 18  memory 0xfb600000-fb620000

enp8s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.1.21  netmask 255.255.0.0  broadcast 192.168.255.255
        inet6 fe80::a109:26ab:e943:65ae  prefixlen 64  scopeid 0x20<link>
        ether 1c:af:f7:ed:32:d3  txqueuelen 1000  (Ethernet)
        RX packets 217611  bytes 179611464 (171.2 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 171507  bytes 17822576 (16.9 MiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

(40) Check that
(i)   the node indeed has the correct IP address, i.e., 192.168.1.$ipaddlastnumber;
(ii)  it can ssh into the frontend via ssh 192.168.1.10;
(iii) the frontend can ssh into the node via ssh c-$ipaddlastnumber or
      ssh compute-0-$ipaddlastnumber;
(iv)  /share exists (ls -la /share) and contains at least the directory /share/apps/;
(v)   a large file (~200 MB) can be copied to and from the frontend and the node (see the
      sketch after this list). If the eth0 connection via the switch works properly, the
      transfer rate of the file copying process should be of the order of > 20 MB/s (or at
      least larger than ~12 MB/s);
(vi)  it can ssh into other existing nodes via, e.g., ssh c-21 or ssh 192.168.1.21
      (assuming c-21 is not the node itself).
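A simple way to carry out the transfer-rate test of item (v), assuming the node can already
ssh into the frontend (the file name and destination path below are placeholders):

# create a ~200 MB test file on the node and copy it to the frontend;
# scp reports the transfer rate while copying
dd if=/dev/zero of=/tmp/testfile bs=1M count=200
scp /tmp/testfile 192.168.1.10:/tmp/
rm -f /tmp/testfile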
==================================================================
Configure a new node after it has been freshly set up
==================================================================
(52) After successfully executing ./coc-nodes_manual on a new node, execute from the new
node the following script:
coc-after-newnode
It will sync all users in the new node against those that already exist in the frontend.

(48) Repeat steps (22) - (52) to set up all the other nodes.

=============================
Customization of the frontend
=============================
Once the frontend is up and running, install the following three categories of packages
step-by-step.

(56) CUDA packages, to be installed in the root directory. Manual responses are required
when running this installation package. A reboot is required. Issue the command:
cuda-package.txt

(58) packages0, to be installed in the /share/apps directory. Manual responses are required
when running this installation package. It could be time-consuming since it involves large
files. Issue the command:
packages0.txt

(60) packages1, to be installed in the /share/apps directory. Automated installation. Issue
the command:
packages1.txt
Despite being time-consuming, this is an automatic installation process.

==========================
Customization of the nodes
==========================
Once a node is up and running, install the following two categories of packages
step-by-step.

(64) CUDA packages, to be installed in the root directory. Manual responses are required
when running this installation package. A reboot is required. Issue the command:
cuda-package.txt

(66) statepartition1_packages, to be installed in the /state/partition1 directory. Manual
responses are required when running this installation package. It could be time-consuming
since it involves large files. Issue the command:
statepartition1_packages.txt

===========
Maintenance
===========

Adding a user to the cluster
----------------------------
(72) To add a new user to the cluster, issue the following command in a terminal on the
frontend as su:
coc-add_new_user
Root will be prompted for the username to be added. After the username has been provided, a
new file, newuser.dat, will be created as /share/apps/configrepo/users_data/newuser.dat. In
the file newuser.dat, the following one-line information about the user will be created, in
the format
$index $user $uid $passwd
For example,
19 mockuser1 1019 ds!Jw3QXZ
Note that the value of $index is immaterial. The password is generated automatically. The
username suggested at the prompt will be subjected to an automatic check against the
usernames that already exist in the cluster. In case the suggested username has already
been taken, the addition of the user will be rejected; a new username needs to be suggested
and /share/apps/local/bin/coc-add_new_user retried.

(76) The script /share/apps/local/bin/coc-add_new_user will first create the new user in
the frontend, and then ssh into each node in turn to create the new user there. The process
is fully automatic. The password of the added user is generated randomly and is archived in
/share/apps/configrepo/users_data/userpass_xx.dat. The permission of the directory
/share/apps/configrepo/users_data/ is 700, hence it can only be accessed by root. A copy of
the password for the new user is also kept in $HOME/passwd.dat. The permission of this file
is set to 500, so that other group members or users cannot read it.

(78) Towards the end of the /share/apps/local/bin/coc-add_new_user script, the script
coc-pwlssh2 will be invoked automatically. This will create passwordless ssh between the
frontend and each node for the user $user. The script will also customize the .bashrc file
for the user $user.

(80) After the script coc-add_new_user is done, check that passwordless ssh has indeed been
achieved for the user $user by performing some trial ssh, e.g., ssh c-21 to and from the
frontend (see also the sketch below).

End of Adding a user to the cluster
-----------------------------------
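As an additional check after adding a user, one can confirm from the frontend that the new
user exists on every node with a matching uid. A minimal sketch, using node aliases of the
form c-NN as in step (40) and the example username mockuser1 from step (72); adjust the
node list and username to your own cluster:

# run as root (or as the new user) on the frontend
for n in c-21 c-22 c-23; do
    echo "== $n =="
    ssh "$n" id mockuser1
done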
(90) If a new node is later added to the cluster, perform the customization steps (52),
(64) and (66) on it.

(92) To remove a named user globally from the cluster, issue the command
coc-remove-users

(95) The administrator can also execute the following scripts in tandem from time to time
to tidy up any inconsistencies in the cluster (as su in the frontend):
coc-pwlssh-root
coc-sync-users
coc-sync-passwd
coc-sync-pwlssh

(96) The functions of coc-sync-users include:
(a) Users will be added to a node if they exist in the frontend but not in the node.
However, the users so added in a local node have no password; the password has to be set
manually by root.
(b) The uid of a user in a node will be made to match that in the frontend if they differ.
(c) Users that exist in a local node but not in the frontend will not be added to the
frontend. These local users will be referred to as 'orphaned' users. If the uid of an
'orphaned' user clashes with any uid in the frontend, the uid of the 'orphaned' user in the
local node will be modified to avoid the clash.

Yoon Tiem Leong
Universiti Sains Malaysia
11800 USM
Penang, Malaysia

updated 13 Jan 2020