Instruction manual to configure a Rocks Clusters (x86_64) 5.3

Proven to work on 20 Acer Veriton i5 quad-core PCs in the School of Physics, USM.


Useful IP setting info: The following info will be used when first installing a Rocks cluster (Only relevant if using USMnet):

i)
Primary DNS: 10.202.1.27;
Secondary DNS: 10.205.19.207;
Gateway: 10.205.19.254;
Public Netmask 255.255.254.0 (alternatively 255.255.255.0).


ii) When installing frontend:
Private IP is 10.1.1.1.
Private netmask = 255.255.0.0.
Public netmask = 255.255.255.0 (alternatively 255.255.254.0).


iii) Some useful IP:

comsics.usm.my, ip: 10.205.19.208
www2.fizik.usm.my, ip: 10.205.19.205
comsics28, 10.205.19.28
anicca, 10.205.19.225


(iv) One must know the IP address and the DNS name of the cluster. This information has to be provided by the USM PPKT network administrator.
A request for a DNS name and IP address can be made via Server Registration at
http://infodesk.usm.my/infodesk/login.php.

(v) It is said that if PPKT, which is in charge of USMnet's IP addresses and runs the DNS servers for USMnet, does not register the Rocks cluster's hostname and IP address, the nodes may not find the frontend, because the nodes locate the frontend via the DNS servers. This implies that the Rocks cluster may fail to be configured if no IP address and hostname (also known as the "DNS name") is registered with PPKT. However, the technical detail of this statement has yet to be verified.
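A quick way to check whether the hostname and IP address have actually been registered is to query the USMnet DNS servers directly from any machine on USMnet. The hostname and IP address below are only examples taken from the list in (iii); substitute your own cluster's details:

****************************************************************************************************************

# Ask the primary USMnet DNS server (10.202.1.27) to resolve the cluster's DNS name
nslookup comsics.usm.my 10.202.1.27

# Reverse lookup: check that the IP address maps back to the registered hostname
nslookup 10.205.19.208 10.202.1.27

****************************************************************************************************************

If both queries return the expected name/address pair, the registration with PPKT should be in place.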

###################################################################################################

Installation procedure:


0. Ensure that the boot sequence of ALL PCs is set so that the CD/DVD-ROM drive is first in the boot order.


1. Networking requirement: A PC used as the frontend must have two network cards: one built-in and one plug-in. The plug-in LAN card must support 10/100/1000 Mb/s. Connect a LAN cable from the internet line to the built-in LAN port; this port is identified as eth1. The eth0 port (the plug-in LAN card) is connected to a LAN switch (12-port, 24-port, 36-port, ..., depending on how many compute nodes are in the cluster) that must support 10/100/1000 Mb/s. The other PCs, which are to be used as compute nodes, must have at least a built-in 10/100/1000 Mb/s LAN port; that port is identified as eth0 for the node. The compute node PCs do not need extra LAN cards. When installing a compute node, connect its eth0 port to the LAN switch with a LAN cable. No LAN cable should be connected from a compute node directly to the internet network.
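If it is unclear which physical socket corresponds to eth0 and which to eth1, or whether a card actually supports gigabit speed, the following diagnostic commands (run on an already-installed Linux system on that PC) can help. This is only a hardware-checking sketch, not part of the Rocks installer:

****************************************************************************************************************

# List all detected network interfaces with their MAC addresses
ifconfig -a

# Show the supported and currently negotiated link speed of each interface
# (requires the ethtool package)
ethtool eth0
ethtool eth1

****************************************************************************************************************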


2. Initially, ensure that a LAN cable connects the frontend to the internet network (that links to the outside world) at its eth1 port. Do not connect the frontend to the switch at this stage. Switch on only the power for the frontend PC and leave the compute nodes powered off. Insert the Rocks Cluster installation DVD and type 'build' when prompted on the screen. Warning: this must be done quickly, or the installer will automatically proceed in compute-node installation mode instead of frontend mode.


3. When prompted, fill in the cluster name, some miscellaneous information about the cluster, and the IP details listed in 'Useful IP setting info' above. Choose automatic partitioning if you do not wish to customise the partitions. If customised partitioning is required, the following allocation is suggested:


SWAP              :  3 GB

/var              :  12 GB

/boot             :  100 MB

/                 :  24 GB (or larger if wished)

/state/partition1 :  maximum allowed space
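After the frontend has finished installing (step 4 below), the resulting layout can be compared against this allocation with standard tools, for example:

****************************************************************************************************************

# Show the mounted filesystems and their sizes
df -h

# Show the size of the swap partition
swapon -s

****************************************************************************************************************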


4. The installation proceeds automatically once partitioning begins. Make sure to take out the DVD when it is ejected, after about 20 - 30 minutes, once Rocks has been successfully installed for the first time. Failure to remove the installation disk from the drive will cause the installation to repeat indefinitely.


5. Rocks will reboot when it finishes installing for the first time. The first screen may be black with some warning message because the PC may not have an NVIDIA GPU installed. Simply press 'Enter' when prompted so that the frontend fixes the problem automatically by installing a generic display driver. A GUI will be displayed after pressing 'Enter' a few times.


Step-by-step procedure to follow:

1. When a Rocks Linux frontend is already up, in the frontend,

****************************************************************************************************************

cd /root

wget http://www2.fizik.usm.my/configrepo/fpatch1.conf

chmod +x fpatch1.conf

./fpatch1.conf

****************************************************************************************************************

A folder /share/apps/configrepo will be created, and all the content in

http://www2.fizik.usm.my/home/tlyoon/repo/configrepo will be copied there.

From within fpatch1.conf, cpatchfe.conf will be called. As a result, a folder /root/configrepo will be created. All the installation activities will be launched from /root/configrepo.


Stand by while fpatch1.conf runs.


fpatch1.conf should complete within a short while.
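A rough way to confirm that fpatch1.conf has done its work is to check that the two directories it is supposed to create are present and populated:

****************************************************************************************************************

# Shared repository copied from www2.fizik.usm.my
ls /share/apps/configrepo

# Working copy from which the installation activities are launched
ls /root/configrepo

****************************************************************************************************************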


After (or while) fpatch1.conf is being executed, install Mathematica by running:

***********************************************

/share/apps/configrepo/mathematica1.conf

***********************************************



2. Right after fpatch1.conf is launched, do the following (while fpatch1.conf is still running):

Connect the frontend and all the compute nodes to the LAN switch via their eth0 ports. Of course, ensure that the power supply to the LAN switch is ON.

In the frontend's terminal,

***********************************************

insert-ethers

***********************************************

When prompted, choose 'Compute'. Manually insert a Rocks Cluster installation DVD into each individual PC. The PCs will be detected if their LAN cables are properly connected to the LAN switch via eth0. Warning: run insert-ethers ONLY after fpatch1.conf has been fired. It is not necessary to wait for fpatch1.conf to complete before beginning insert-ethers, but running insert-ethers before firing fpatch1.conf will result in the compute nodes not having any GUI.
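As the compute nodes are discovered and installed, their status can be monitored from another terminal on the frontend. For example:

****************************************************************************************************************

# List all hosts known to the cluster database
# (the frontend plus every compute node insert-ethers has picked up so far)
rocks list host

# Once a node has finished installing and rebooted, it should answer over ssh
ssh compute-0-0 hostname

****************************************************************************************************************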


3. The following is a repetitive manual procedure. After step 1, ssh -X -Y into each compute node as root to do the following (alternatively, log in to each compute node physically):


In the frontend, open a new terminal and press Ctrl + Shift + T repeatedly to open 20 tabs. In each tab, paste 'ssh -X -Y compute-0-'. After opening all 20 tabs, go through them one by one to ssh into compute-0-0, compute-0-1, ..., compute-0-19. Once in a compute node, paste the following commands into that node's terminal:

****************************************************************************************************************

sh /share/apps/configrepo/cpatchcn.conf nohup

sh /share/apps/configrepo/mathematica1.conf

****************************************************************************************************************

For details see cpatchcn.conf. The Mathematica installation in mathematica1.conf is a manual process, so you need to stand by at the screen.


Alternatively, the following method can be tried instead of the one described above:

****************************************************************************************************************

rocks run host 'sh /share/apps/configrepo/cpatchcn.conf'

rocks run host 'sh /share/apps/configrepo/mathematica1.conf'

****************************************************************************************************************


4. Step 4 can be run concurrently with step 3. It is not necessary to wait until step 3 completes to carry out step 4. For step 4, do the following in the frontend:

****************************************************************************************************************

su -

cd /root/configrepo

sh /share/apps/configrepo/movevbfe.conf

****************************************************************************************************************

This will copy the virtualwindows into the appropriate location in the frontend. For details see movevbfe.conf.

Stand by while running sh /share/apps/configrepo/movevbfe.conf.


5. Do this in the frontend ONLY after step 3 has been completed:

****************************************************************************************************************

su -

rocks run host 'sh /share/apps/configrepo/movevbcn.conf'

****************************************************************************************************************

This will copy the virtualwindows from the frontend into the appropriate location in the compute nodes. For details see movevbcn.conf. This will take a long time to complete.
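Since movevbcn.conf copies large virtual machine images to every compute node, it may be useful to check that the nodes have enough free space for them. A minimal check, assuming the large local partition is mounted at /state/partition1 as in the partitioning scheme above:

****************************************************************************************************************

# Show free space on the large local partition of every compute node
rocks run host 'df -h /state/partition1'

****************************************************************************************************************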


6. This is an optional step, depending on whether the ATI display driver is to be installed on the compute nodes. It is necessary for the Acer computers in the computer lab, School of Physics, USM; otherwise, skip it. For this procedure, you must physically log in to each compute node to do the following:


Log in to each compute node physically (not via ssh):

****************************************************************************************************************

su -

cd /root/configrepo

sh /share/apps/configrepo/ati.conf

****************************************************************************************************************

This will install the ATI display driver and Mathematica on each compute node.


Reboot all compute nodes after all configuration is complete, using the following command (in the frontend):

****************************************************************************************************************

rocks run host reboot

****************************************************************************************************************
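After the reboot, a simple way to confirm that all compute nodes have come back up is to ask each of them to report its uptime:

****************************************************************************************************************

# Every responding node prints its load and uptime; nodes that are still
# rebooting will not answer yet
rocks run host uptime

****************************************************************************************************************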