Remote Display on FEP / Worker Nodes
Many debugging and profiling applications require a remote display. This becomes an issue on large clusters, where there is usually a single entry point, in this case fep.grid.pub.ro (currently fep-53-3.grid.pub.ro). To achieve this on the NCIT Cluster we use a two-stage process:
- get a remote desktop session to fep working
- forward the display from the worker node to fep
Step 1:
- install the NX client from NoMachine (http://www.nomachine.com/download-client-windows.php). You should also install all font extensions; many programs need them.
- FEP supports at most the IceWM desktop manager, but we require users to open only an xterm session.
Step-by-step configuration of NX Client
Start -> NX Connection Wizard -> Next
Session: [NCIT] Fep
Host: fep-53-3.grid.pub.ro
Next
Select Unix - Custom - Settings ...
Run the console
Floating window
Ok -> Next
Show Advanced Configuration dialog -> Finish
Now, select Key ... -> Click Default -> Save -> Save -> Ok
You can now log in: the username / password is the one from curs.cs.pub.ro
Step 2:
Now you must forward the display from the cluster machine to fep. Suppose the machine is named quad-wn16. On fep, record the DISPLAY variable and run xhost + to allow incoming X connections:
[alexandru.herisanu@fep-53-3 ~]$ echo $DISPLAY
:1000.0
[alexandru.herisanu@fep-53-3 ~]$ xhost +
access control disabled, clients can connect from any host
[alexandru.herisanu@fep-53-3 ~]$
Now, from the remote machine (the script will be run through qsub), you only have to set the DISPLAY variable to fep and your screen number, like this, and then run your program:
[alexandru.herisanu@quad-wn16 ~]$ export DISPLAY=fep-53-3.grid.pub.ro:1000.0
[alexandru.herisanu@quad-wn16 ~]$ xclock
Done.
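Putting the two steps together, a minimal job script might look like the sketch below (run_gui.sh and xclock are just placeholders; use whatever value echo $DISPLAY reported on fep):
#!/bin/bash
# run_gui.sh - hypothetical job script, submitted e.g. with: qsub -q ibm-quad.q -cwd run_gui.sh
# Point the X display back to fep (the value reported by echo $DISPLAY there)
export DISPLAY=fep-53-3.grid.pub.ro:1000.0
# Run the graphical program; its window appears in the NX session on fep
xclock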
SSH keys and passwordless authentication between worker nodes
This might be handy: how to set up passwordless authentication between worker nodes.
Why? Say you want to copy files from an MPI head node to the other nodes, or you have a server (for example, a debugger) that needs to connect to all MPI nodes without using a password (and you do not have the rights to configure your queue).
The commands are pretty basic: ssh-keygen. If you wish, you can restrict the key's use to this cluster by adding from="172.16.*.*" at the beginning of the authorized_keys line, like this:
[heri@fep-53-2 ~]$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/export/home/heri/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /export/home/heri/.ssh/id_rsa.
Your public key has been saved in /export/home/heri/.ssh/id_rsa.pub.
The key fingerprint is:
ba:a8:bd:28:05:28:3a:0b:44:27:8a:d4:0b:c3:df:35
[heri@fep-53-2 ~]$ echo -ne "from=\"172.16.*.*\" " >> ~/.ssh/authorized_keys2
[heri@fep-53-2 ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys2
[heri@fep-53-2 ~]$ chmod 600 ~/.ssh/authorized_keys2
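After these three commands, the new entry in ~/.ssh/authorized_keys2 should look roughly like this (the key material is shortened and made up here for illustration):
from="172.16.*.*" ssh-rsa AAAAB3NzaC1yc2EAAAABIwAA...<rest of the public key>... heri@fep-53-2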
So ... let's try it out. Remember, it will be the first authentication, so you do not have a known_hosts entry.
ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ~/.ssh/id_rsa MY_HOST
Does the trick. You may get your list of nodes from the $PE_HOSTFILE variable. In the example below, the job script is typed directly on qsub's standard input (hence the job name "STDIN"):
[heri@fep-53-2 ~]$ qsub -q ibm-quad.q -pe openmpi*1 2 -S /bin/bash
echo "File that contains all nodes and slots: [$PE_HOSTFILE]"
export MYNODES=`cat $PE_HOSTFILE | cut -f 1 -d ' '`
for x in $MYNODES;
do
ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ~/.ssh/id_rsa $x "echo Hi my name is `hostname`!"
done
Your job 94760 ("STDIN") has been submitted
[heri@quad-wn07 ~]$ cat STDIN.*94760
Warning: Permanently added 'quad-wn26.grid.pub.ro,172.16.4.26' (RSA) to the list of known hosts.
Warning: Permanently added 'quad-wn12.grid.pub.ro,172.16.4.12' (RSA) to the list of known hosts.
File that contains all nodes and slots: [/opt/n1sge6/sge-6.2u3/NCitCluster/spool/quad-wn26/active_jobs/94760.1/pe_hostfile]
Hi my name is quad-wn26.grid.pub.ro!
Hi my name is quad-wn26.grid.pub.ro!
Quickstart
If a job occupies a single node, it is run with the command:
qsub -q queue_name -cwd script_name
This command runs the script script_name on a single machine from the queue_name group, and the script's working directory is the current directory.
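For example, assuming a hypothetical script called my_job.sh in the current directory, a single-node submission to the ibm-quad.q queue would be:
[heri@fep-53-2 ~]$ qsub -q ibm-quad.q -cwd my_job.sh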
Available queues
No. | Queue name    | Nodes | Processor   | Frequency | Chips | Cores | Memory
1   | fs-p4.q *     | 56    | Pentium 4   | 3 GHz     | 1     | 1     | 2 GB
2   | fs-dual.q *   | 32    | Intel Xeon  | 3 GHz     | 2     | 1     | 2 GB
3   | ibm-quad.q    | 28    | Intel Xeon  | 2 GHz     | 2     | 4     | 16 GB
4   | ibm-opteron.q | 2     | AMD Opteron | 2.55 GHz  | 2     | 6     | 16 GB
* this queue is not available to students through SGE
If we want to run on more cores or on more nodes, we must inform the batch system about this:
qsub -q queue_name -pe environment_name [number_of_slots] -cwd script_name
Specifying a parallel environment is the way to tell the Sun Grid Engine batch system that it must grant us more than a single execution node. A parallel environment is defined by a name and a requested number of nodes.
The names of these MPI environments are defined at the cluster level as follows (a submission example is given after the table):
No. | Name       | Max slots | Daemons per host | Scheduling policy | MPI version
1   | openmpi    | 224       | 8                | pe_slots          | OpenMPI 1.2.3
2   | openmpi*1  | 28        | 1                |                   | OpenMPI 1.2.3
3   | intelmpi   | 224       | 8                | pe_slots          | Intel MPI
4   | intelmpi*1 | 28        | 1                |                   | Intel MPI
5   | hpmpi      | 224       | 8                | fill_up           | HP MPI
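As a sketch, a parallel submission using the openmpi environment from the table above could look like this (my_mpi_job.sh, the program name and the slot count are placeholders):
[heri@fep-53-2 ~]$ qsub -q ibm-quad.q -pe openmpi 8 -cwd my_mpi_job.sh
where my_mpi_job.sh contains something like:
#!/bin/bash
# $NSLOTS is set by SGE to the number of slots granted by the parallel environment
mpirun -np $NSLOTS ./my_mpi_program
Whether mpirun picks up the allocated hosts automatically depends on how the MPI library was built, so treat this only as a starting point.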
C/C++ compilation on Windows
There are multiple possibilities for developing C/C++ applications on Windows and then running them on the cluster. The preferred method is to use an IDE (Eclipse http://www.eclipse.org or NetBeans http://www.netbeans.org) locally and transfer your project to the cluster.
This is hard to do if you cannot compile your program with GCC locally, so you should install Cygwin or MinGW for that.
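For instance, once Cygwin or MinGW is installed, you can check that a plain GCC build of your code works locally before transferring the project to the cluster (hello.c is just a placeholder source file):
gcc -Wall -o hello hello.c
./hello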