BUILD YOUR OWN CENTOS COMPUTER CLUSTER
======================================

This document contains the instructions to build a cluster of PCs running the CentOS Linux system with network file sharing (NFS). The modus operandi of the cluster mimics that of the Rocks Cluster (see http://www.rocksclusters.org/). The cluster contains a frontend node plus a number of compute nodes. All nodes are connected via a 100/1000 Mbps switch. In principle this document can also be applied to other Linux distributions, with or without modification. The 'cluster' here is not one defined in a strict sense (like the Rocks Clusters), as ours is far simpler.

If you want to use this manual to set up your own CentOS cluster in your institution, you only need to modify the following information, which is specific to your network environment: HOSTNAME, IPADDR, DOMAIN, GATEWAY, DNS1 and DNS2 (DNS2 is optional). Ask your network admin for this information. The default values used in this document (for my case) are:

HOSTNAME=chakra
IPADDR=10.205.18.133
DOMAIN=usm.my
GATEWAY=10.205.19.254
DNS1=10.202.1.27
DNS2=10.202.1.6    ### DNS2 is optional

======================
Hardware requirements:
======================

i).   One frontend PC + at least one compute node PC.
ii).  Two hard disks on the frontend (one of small capacity and the other large, e.g., 500 GiB + 1 TiB).
iii). LAN cables and a 100/1000 Mbps switch.
iv).  The frontend node has to be equipped with two network cards, a built-in one and an external plug-in one.
v).   All compute nodes must be equipped with a minimum of one network card (either built-in or external).

Naming convention: We shall denote the built-in network card as eth0 and the external network card as eth1. eth0 is the network card that connects to the switch, while eth1 is the network card that connects to the internet. Note that it is not essential to stick strictly to this convention; it is adopted merely for the sake of naming consistency.

Important IPs to take note of: The IP for the frontend at eth0 is by default set to 192.168.1.10. The IPs for node1, node2, node3, etc. at their respective eth0 are by default set to (in sequential order) 192.168.1.21, 192.168.1.22, 192.168.1.23, ...

==========================================================================================
To build a CentOS cluster, follow the procedure below step-by-step in sequential order.
==========================================================================================

(2) Use Rufus or other software to burn the latest CentOS iso onto a bootable thumb drive. This manual is prepared based on CentOS-7-x86_64-DVD-1804, but in principle it should also work for other versions of CentOS (version 6 or above).

(4) Connect the eth0 ports of all the PCs in the cluster to a 100/1000 Mbps switch. eth1 of the frontend is to be connected to the internet-facing network (e.g., in USM, it is the usm.my network).

(6) Install CentOS using the bootable thumb drive onto the frontend's smaller hard disk. Leave the larger hard disk as it is for the moment. For the sake of uniformity, you should use the installation option 'Development and Creative Workstation'. Choose to add all software packages offered at the installation prompt. Use a common root password for the frontend as well as all compute nodes.

(8) Check the labels of the hard disks in the frontend using fdisk -l. Say the larger hard disk is labelled /dev/sdb.
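For example, the disks and their sizes can be listed as follows (the device names and output are illustrative only; they will differ from machine to machine):

fdisk -l | grep '^Disk /dev/sd'        # list the disks and their sizes
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT     # show which disk is already used by the OS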
Mount this hard disk at the folder /export in the root directory of the frontend CentOS by typing the following commands (as su) in the terminal:

mkdir /export
chmod -R 777 /export
mount -t xfs /dev/sdb /export
cp /etc/fstab /etc/fstab.orig

Mount this hard disk permanently by adding the line

/dev/sdb    /export    xfs    rw    2 2

to /etc/fstab. The permanent mounting of /dev/sdb to /export will take effect after a reboot. In case the hard disk is not formatted properly, it may refuse to mount. The hard disk can be forcefully formatted as XFS using

mkfs.xfs -f /dev/sdb

The task of formatting and mounting the extra hard disk under the CentOS root directory is described in the executable instruction mount_export.txt, downloadable from

http://anicca.usm.my/tlyoon/configrepo/howto/customise_centos/Centos_cluster/mount_export.txt

(9) After the frontend is up, use the NetworkManager GUI to establish a connection to the internet via the eth1 network card. This is a manual procedure. NetworkManager can be found via Settings -> Network. Under the 'Wired' panel, click on the network card item. In case you don't see the NetworkManager icon (which can happen on a freshly installed copy of CentOS), issue the command 'service NetworkManager restart' or 'systemctl restart NetworkManager' (try out which one works) in the terminal to launch it.

=============================================================
(10) Download the following script into /root/
=============================================================

http://anicca.usm.my/configrepo/howto/customise_centos/Centos_cluster/frontend-1of3.txt

Customize the following variables in the frontend-1of3.txt script to suit your case:

HOSTNAME=
IPADDR=
DOMAIN=
GATEWAY=
DNS1=
DNS2=

Then, as su on the frontend, run

chmod +x frontend-1of3.txt
./frontend-1of3.txt

frontend-1of3.txt will do the necessary preparatory configuration for the frontend. This includes (i) setting SELINUX to permissive in /etc/sysconfig/selinux, (ii) activating sshd.service so that ssh can work, (iii) setting the IP of the frontend by making changes to its network configuration, such as assigning the IP address, setting up the DNS servers, etc. (the instructions are based on http://www.techkaki.com/2011/08/how-to-configure-static-ip-address-on-centos-6/), and (iv) creating /state/partition1, among others. The frontend will reboot at the end of frontend-1of3.txt.

(12) After rebooting the frontend, log in as root. Check the frontend to see whether the following have been configured successfully by frontend-1of3.txt.

(a) Check the mode of SELINUX using the following commands:

getenforce
sestatus

SELINUX should be set to permissive. This can be independently confirmed by checking that the file /etc/sysconfig/selinux contains the line 'SELINUX=permissive'.
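If getenforce does not report 'Permissive', the mode can be set by hand; the following is a minimal sketch of the equivalent of what frontend-1of3.txt is described to do (the actual script may differ):

sed -i 's/^SELINUX=.*/SELINUX=permissive/' /etc/sysconfig/selinux   # persistent setting, takes effect on reboot
setenforce 0                                                        # switch to permissive immediately
getenforce                                                          # should now report 'Permissive'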
(b) Check manually that both eth0 and eth1 are connected to the switch and the internet, respectively. This can be done by issuing the following commands one by one:

(i)   ping google.com
(ii)  ssh into the frontend from a third-party terminal
(iii) ssh into a third-party terminal from the frontend
(iv)  ifconfig
(v)   cat /etc/sysconfig/network-scripts/ifcfg-$nc0
(vi)  cat /etc/sysconfig/network-scripts/ifcfg-$nc1

where

nc0=$(lshw -class network | grep -E 'Ethernet interface|logical name' | grep 'logical name' | awk '{print $3}' | awk '!/vir/' | tail -n1)
nc1=$(lshw -class network | grep -E 'Ethernet interface|logical name' | grep 'logical name' | awk '{print $3}' | awk '!/vir/' | awk 'NR==1{print}')

(14) Check that the HWADDR for both $nc0 and $nc1 is explicitly specified. Ensure that

(a) DOMAIN=local, DNS1=127.0.0.1 for the network card $nc0 (=eth0);
(b) DOMAIN=xxx, DNS1=xxx, DNS2=xxx for the network card $nc1 (=eth1), where xxx are the values of DOMAIN, DNS1 and DNS2 you have set for $nc1 in frontend-1of3.txt.

The network configuration that the frontend-1of3.txt script attempts to establish may fail to work. In such a case, the network configuration can be set up manually by tweaking NetworkManager.

(15) Alternatively, the required network configuration (as stated in item (14)) can be set manually, if needed, by editing the files

/etc/sysconfig/network-scripts/ifcfg-$nc0
/etc/sysconfig/network-scripts/ifcfg-$nc1

However, this option may result in unexpected glitches. Avoid it if you do not know exactly how to do it.

(16) In any case, both network cards must be active and connected before proceeding to the next step. Reboot the frontend if you have done any manual configuration; often both cards will be connected after rebooting. Be reminded that, in our convention, the network card $nc0 (=eth0) is to be connected to the switch, while $nc1 (=eth1) is connected to the internet.

===========================================================
(18) Download and execute the following script in /root/
===========================================================

wget http://anicca.usm.my/configrepo/howto/customise_centos/Centos_cluster/frontend-2of3.txt
chmod +x frontend-2of3.txt
./frontend-2of3.txt

CentOS will reboot at the end of the script. All the steps described up to this stage must be completed before initiating the following steps.

===========================================================================
(20) Download and execute the following script in /share/apps/local/bin
===========================================================================

cd /share/apps/local/bin
wget http://anicca.usm.my/configrepo/howto/customise_centos/Centos_cluster/frontend-3of3.txt
chmod +x frontend-3of3.txt
./frontend-3of3.txt

This will download all required scripts into /share/apps/local/bin. This finishes the part on setting up the frontend of the cluster.
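Before moving on to the nodes, a few quick sanity checks on the frontend can be helpful. The following is a sketch only; the exportfs check assumes the NFS export has already been configured by frontend-2of3.txt:

df -h /export                 # the large data disk should be mounted at /export
ls /share/apps/local/bin      # should contain the scripts fetched by frontend-3of3.txt
exportfs -v                   # lists the NFS directories exported by the frontend, if any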
Now proceed to the setting up of the nodes.

===========================
Install CENTOS in each node
===========================

(22) Install CentOS on all the other compute nodes. All nodes should use the same installation options as those used in setting up the frontend.

(24) During the installation of a node, physically connect the network card of the node (which is referred to as 'eth0' here; by default we choose the built-in/internal network card as eth0) to the switch. In addition, DO NOT attempt to configure the network card settings during the installation, nor name the node.

(25) Note that if a node is equipped with two network cards, one will be used to connect to the switch (eth0) and the other (eth1) to the internet-accessing network. Having two network cards in a node is optional but preferred. If only one network card is present, it should be connected to the switch. This card will be identified as eth0, and there will be no eth1 card.

Execute the coc-nodes-v2 script
-------------------------------

(26) Note that at this point the node has not yet established an ssh connection to the frontend or the internet. Save the script http://anicca.usm.my/configrepo/howto/customise_centos/Centos_cluster/coc-nodes-v2 to a pendrive. Copy the file into the /root directory (or anywhere) of the freshly installed node.

(27) Run the script:

chmod +x coc-nodes-v2
./coc-nodes-v2

Among others, the functions of the coc-nodes-v2 script include:

a) It assigns one of the network cards as eth0 and the other as eth1. Manually check that, after the coc-nodes-v2 script has been executed, the node is connected to the frontend as well as to the internet. If the connections are not established, it may be because the script guessed the identities of eth0 and eth1 wrongly. If this is the case, manually swap the LAN cables of the two network cards to see if the expected connections can be established. You may have to issue the command 'service network restart' after swapping the LAN cables.

b) It mounts the node to the frontend as an NFS client, setting the hostname of the node to 'compute-0-$ipaddlastnumber', with alias c-$ipaddlastnumber. It creates a shared directory /share/ in the c-$ipaddlastnumber node, which is an NFS directory physically kept on the hard disk of the frontend. It sets the hostname in /etc/hostname of the new node, and generates the necessary content in the .bashrc file for root on the node (via gen_bashrc-root). In addition, a local folder /state/partition1 will also be created on the node.

c) It sets the following item to 'no' in /etc/ssh/sshd_config and /etc/ssh/ssh_config, namely,

GSSAPIAuthentication no

(28) At the end of the execution of coc-nodes-v2, you will be prompted to establish passwordless ssh for root from the current node to the frontend (if ssh has been established).

(28.5) After the coc-nodes-v2 script has completed, the node will reboot.

(29) If the coc-nodes-v2 script fails to establish the connection to the frontend (and to the internet, if there are two network cards), the connections have to be configured manually by following procedures (29.5) - (36.5) below. Skip procedures (29.5) - (36.5) if step (28) using the coc-nodes-v2 script is successful.

Set up the IP addresses using the NetworkManager GUI (optional; do it only if coc-nodes-v2 fails)
=================================================================================================

(29.5) Set the IP address of the eth0 network card of the node to 192.168.1.$ipaddlastnumber for that particular node. This is done manually using the NetworkManager GUI in CentOS; see item (9) above.

(30) While setting the IP of a network card in a node manually, you have the option to associate the MAC address with the name of the network card in the 'Identity' tab. For example, you may see two items in the 'Wired' panel, where one has the 'Identity' Name=enp8s0, MAC Address=70:85:C2:B6:1D:7E, while the other 'Wired' card has the 'Identity' Name=enp6s0, MAC Address=XX:XX:C2:B6:1D:XA. However, this association is optional and you can skip this step.
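If it helps, the interface names and their MAC addresses can also be read from a terminal; a minimal sketch (the names enp6s0/enp8s0 above are only examples):

ip -o link show                                      # interface names together with their MAC addresses
nmcli -f GENERAL.DEVICE,GENERAL.HWADDR device show   # the same information as seen by NetworkManager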
(31) In the 'Details' tab, you can see which IP address is associated with each card.

(32) One network card must be manually chosen to be the 'eth0' card, whose IP is set to the fixed address 192.168.1.$ipaddlastnumber. For example, choose the enp6s0 card as the 'eth0' port. Add the address 192.168.1.$ipaddlastnumber in the IPv4 tab for this card. Set Netmask=255.255.0.0, GATEWAY=192.168.1.10, DNS1=192.168.1.10 and Search='local'.

(34) The values of $ipaddlastnumber should be assigned systematically. Each node should have a unique $ipaddlastnumber value, beginning from $ipaddlastnumber=21. For example, when installing the first node, set ipaddlastnumber=21; when installing the second node, set ipaddlastnumber=22; etc.

(36) If there is another network card, such as enp8s0, you may manually fix its IP address in a similar manner as for the eth0 case, or choose DHCP to set the IP automatically. This will allow the node to have its own public IP address. This is an optional step.

(36.5) Be warned that getting both network cards (or eth0 alone, if only one card is available in a node) to successfully connect to the switch and the internet using the NetworkManager GUI may require some manual tweaking.

(### End of setting up IP addresses using the NetworkManager GUI (optional) ###)

(37) After rebooting a node, you will be forced to create a local user. For the sake of consistency, create a local user named 'user', though the naming of this user is immaterial.

(38) The IP setting (done either by the coc-nodes-v2 script or via the NetworkManager GUI) can be checked by issuing ifconfig to confirm that the connection to the switch has been established.

(39) If the node has successfully established a connection to the frontend via the eth0 card,

(a) the node should see a positive response when pinging the frontend from the node's terminal: ping 192.168.1.10;
(b) the node should be connected to the internet if there are two network cards;
(c) ifconfig should produce output similar to the following samples:

eno1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.205.19.29  netmask 255.255.254.0  broadcast 10.205.19.255
        inet6 fe80::c179:dee0:8b19:51b0  prefixlen 64  scopeid 0x20<link>
        ether 54:04:a6:28:c8:0c  txqueuelen 1000  (Ethernet)
        RX packets 34649849  bytes 3348276078 (3.1 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 411108  bytes 34124296 (32.5 MiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
        device interrupt 18  memory 0xfb600000-fb620000

enp8s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.1.21  netmask 255.255.0.0  broadcast 192.168.255.255
        inet6 fe80::a109:26ab:e943:65ae  prefixlen 64  scopeid 0x20<link>
        ether 1c:af:f7:ed:32:d3  txqueuelen 1000  (Ethernet)
        RX packets 217611  bytes 179611464 (171.2 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 171507  bytes 17822576 (16.9 MiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

(40) Check that:

(i)   the node indeed has the correct IP address, i.e., 192.168.1.$ipaddlastnumber;
(ii)  it can ssh into the frontend via ssh 192.168.1.10;
(iii) the frontend can ssh into the node via ssh c-$ipaddlastnumber or ssh compute-0-$ipaddlastnumber;
(iv)  the shared directory exists (ls -la /share) and contains at least the directory /share/apps/;
(v)   a large file (~200 MB) can be copied to and fro between the frontend and the node (see the sketch after this list). If the eth0 connection via the switch works properly, the transfer rate of the copying process should be of the order of > 20 MB/s (or at least larger than ~12 MB/s);
(vi)  the node can ssh into other existing nodes via, e.g., ssh c-21 or ssh 192.168.1.21 (assuming c-21 is not the node itself).
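A minimal sketch of check (v), run from the node (the file name and size are arbitrary; scp reports the transfer rate as it copies):

dd if=/dev/zero of=/tmp/testfile bs=1M count=200    # create a ~200 MB test file
scp /tmp/testfile 192.168.1.10:/tmp/                # copy it to the frontend and note the reported rate
scp 192.168.1.10:/tmp/testfile /tmp/testfile.back   # copy it back from the frontend
rm -f /tmp/testfile /tmp/testfile.back              # clean up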
(48) Repeat steps (22) - (47) to set up all the other nodes.

(50) If the NFS shared directory is visible and accessible on each node, and the nodes and the frontend can ssh among one another via the eth0-switch connections, the set-up of the CentOS cluster is considered done.

==================================================================
Configure passwordless ssh for root from the frontend to all nodes
==================================================================

(52) After all nodes have been installed, proceed to set up passwordless ssh for root from the frontend to all nodes. As su, execute the coc-pwlssh-root script in /share/apps/local/bin:

coc-pwlssh-root

The script will attempt to create passwordless ssh for root into each node, to and fro the frontend. After coc-pwlssh-root is done, check that passwordless ssh has indeed been achieved for root by performing some trial ssh, e.g., ssh c-21, to and fro the frontend.

=============================
Customization of the frontend
=============================

Once the frontend is up and running, install the following four categories of packages step-by-step.

(54) basic_packages, to be installed in the root directory. This is a brief, automated installation process. Issue the command:

basic_packages.txt

(56) CUDA packages, to be installed in the root directory. Manual responses are required when running this installation package. A reboot is required. Issue the command:

cuda-package.txt

(58) packages0, to be installed in the /share/apps directory. Manual responses are required when running this installation package. It could be time-consuming since it involves large files. Issue the command:

packages0.txt

(60) packages1, to be installed in the /share/apps directory. Despite being time-consuming, this is an automated installation process. Issue the command:

packages1.txt

==========================
Customization of the nodes
==========================

Once a node is up and running, install the following three categories of packages step-by-step.

(62) basic_packages, to be installed in the root directory. This is a brief, automated installation process. Issue the command:

basic_packages.txt

(64) CUDA packages, to be installed in the root directory. Manual responses are required when running this installation package. A reboot is required. Issue the command:

cuda-package.txt

(66) statepartition1_packages, to be installed in the /state/partition1 directory. Manual responses are required when running this installation package. It could be time-consuming since it involves large files. Issue the command:

statepartition1_packages.txt

===========
Maintenance
===========

Adding a user to the cluster
----------------------------

(72) To add a new user to the cluster, issue the following command in a terminal on the frontend as su:

coc-add_new_user

Root will be prompted for the username to be added. After the username is provided, a new file, newuser.dat, will be created as /share/apps/configrepo/users_data/newuser.dat. In the file newuser.dat, a one-line record about the new user will be created, in the format

$index $user $uid $passwd

For example,

19 mockuser1 1019 ds!Jw3QXZ

Note that the value of $index is immaterial. The password is generated automatically. The username suggested at the prompt will be automatically checked against the usernames that already exist in the cluster. If the suggested username has already been taken, the addition of the user will be rejected; a new username then needs to be suggested and /share/apps/local/bin/coc-add_new_user retried.
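For illustration, creating one such user by hand on a single machine would roughly amount to the following (a sketch only, using the example record above; coc-add_new_user automates this on the frontend and on every node):

useradd -m -u 1019 mockuser1             # the same uid must be used on every machine so that NFS file ownership stays consistent
echo 'mockuser1:ds!Jw3QXZ' | chpasswd    # set the (automatically generated) password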
(76) The script /share/apps/local/bin/coc-add_new_user will first create the new user on the frontend, and then ssh into each node in turn to create the new user there. The process is fully automatic. The password of the added user is generated randomly and archived in /share/apps/configrepo/users_data/userpass_xx.dat. The permission of the directory /share/apps/configrepo/users_data/ is 700, so it can only be accessed by root. A copy of the password for the new user is also kept in $HOME/passwd.dat. The permission of that file is set to 500, so that other group members or users cannot read it.

(78) Towards the end of the /share/apps/local/bin/coc-add_new_user script, the script coc-pwlssh2 will be invoked automatically. This creates passwordless ssh for $user into each node, to and fro the frontend. The script will also customize the .bashrc file for the user $user.

(80) After the coc-add_new_user script is done, check that passwordless ssh has indeed been achieved for the user $user by performing some trial ssh, e.g., ssh c-21, to and fro the frontend.

End of adding a user to the cluster
-----------------------------------

(90) To remove a named user globally from the cluster, issue the command

coc-remove-users

(92) From time to time, tidy up inconsistencies in the usernames and uids across the nodes in the cluster by issuing the command

coc-sync-users

The functions of coc-sync-users include:

(a) Users that exist on the frontend but not on a node will be added to that node. However, the users so added to a local node have no password; their passwords have to be set manually by root.
(b) The uids of the users on the nodes will be made to match those on the frontend if they do not.
(c) Users that exist on a local node but not on the frontend will not be added to the frontend. These local users are referred to as 'orphaned' users. If the uid of an 'orphaned' user clashes with any uid on the frontend, the uid of the 'orphaned' user on the local node will be modified to avoid the clash.

(95) If a new node is later added to the cluster (in which many users already exist), the existing users in the cluster must be synced to the new node. To do so, issue the following script as root:

coc-after-newnode

The script coc-after-newnode contains four main components:

(a) coc-pwlssh-root
(b) coc-sync-users
(c) coc-sync-passwd
(d) coc-sync-pwlssh

Yoon Tiem Leong
Universiti Sains Malaysia
11800 USM
Penang, Malaysia

Updated 3 Jan 2020