|
I am doing research on cluster computing, and my search for an ideal cluster environment ended when I found PelicanHPC.
I booted the disk image, which made that machine the server node. Then I booted the other systems (2 PCs), and the client nodes were detected. The client nodes also booted PelicanHPC (that was a great experience). Now I would like to run MPI programs in the cluster environment, but I don't know how to proceed. I would be glad if you could help me out. |
|
After writing an MPI program, you can compile it with one of the following commands:
For C: mpicc hello.c -o hello
For C++: mpiCC hello.cc -o hello
For Fortran: mpif77 hello.f -o hello
To run an MPI program:
mpirun C hello
The C option means "launch one copy of hello on every CPU that was listed in the boot schema."
mpirun N hello
N means "launch one copy of hello on every node in the LAM universe." Hence, N disregards the CPU count. This can be useful for multithreaded MPI programs.
mpirun -np 4 hello
This runs 4 copies of hello. LAM will "schedule" how many copies of hello run on each node in a round-robin fashion, according to how many CPUs were listed in the boot schema file.
I have written some rough documentation on building a cluster and running MPI programs. You can download it from http://sudeepk.info/projects/clustering.pdf |
|
Administrator
|
In reply to this post by rahulr
Another option would be to try ParallelKnoppix, which is similar to PelicanHPC, but it also has a number of examples. The last version of PK is available at the PelicanHPC download page.
|
|
Hi Michael,
After setting up my Pelican cluster of 5 nodes, I ran the htop command at the frontend and noticed that it shows 1 running beside the tasks display. Should it be like this, or should it show the number of compute nodes plus the frontend node, and therefore more than 1?
I have written some MPI programs in the GNU nano editor, but when I tried to compile them with the following command: < mpicc clustering.c -o clustering>, gcc returned a message saying No such file/directory. Then when I tried to run the code with the following command: < mpirun-np 5--hostfile/home/user/tmp/bhosts clustering>, bash returned: command not found. I request your assistance to get my code running on all 5 processors.
clustering.c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);                 /* start MPI */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* rank of this process */
    printf("Hello World from process %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
|
|
Administrator
|
Hi Joseph,
About the missing file problem, are you working from the same directory where you have the source code? It seems to me that you are not, and this explains why you get a compile error. About running the code, this will be impossible until you get it compiled. Once you have it compiled, you need to insert a space between mpirun and -np, and between the 5 and the --, in the following: mpirun-np 5--hostfile/home/user/tmp/bhosts... |
|
Hi Michael,
Thanks a lot. I managed to trace the problem and got my code compiled successfully, but when I tried to run it, it returned the error below, which I am still researching how to solve. Please advise.
mpirun was unable to launch the specified application as it could not access or execute an executable:
Executable: --hostfile/home/user/tmp/bhosts
Node: pel1
while attempting to start process rank 0.
I ran it using this command: mpirun -np 5 --hostfile/home/user/tmp/bhosts clusteringtwo |
|
You're almost there: just insert a space after "--hostfile"
|
|
Hi Michael,
I ran my program but got the following 3 errors, each an ORTE_ERROR_LOG entry reporting "Not found in file", at these locations:
1. base/ras_base_allocate.c at line 236.
2. base/plm_base_launch_support.c at line 272.
3. plm_rsh_module.c at line 990.
It also reported that a daemon (pid unknown) died unexpectedly while attempting to launch because it was unable to find all the needed shared libraries on the remote node, so mpirun aborted the job. From my findings, one possibility is that I have to set the location of the shared libraries so that it is automatically forwarded to the remote nodes. How do I get this done successfully? And could there be other possible solutions? |
|
Administrator
|
Hmm, I don't see any more obvious problems. It seems to me like it should run. When you run
"cd /home/user; mpirun -np 3 --hostfile /home/user/tmp/bhosts flops" what do you see? |
|
I see the following:
Open RTE was unable to open the hostfile: home/user/tmp/bhosts
Check to make sure the path and filename are correct.
This is followed by the same ORTE_ERROR_LOG entries as before:
1. base/ras_base_allocate.c at line 236.
2. base/plm_base_launch_support.c at line 72.
3. plm_rsh_module.c at line 990.
It again reported that a daemon (pid unknown) died unexpectedly while attempting to launch because it was unable to find all the needed shared libraries on the remote node, so mpirun aborted the job. From my findings, one possibility is that I have to set the location of the shared libraries so that it is automatically forwarded to the remote nodes. How do I get this done successfully? And could there be other possible solutions? |
|
Administrator
|
Have you run pelican_setup? I guess so, if the nodes have booted. How about trying to run pelican_restarthpc? That may help. What you report is a strange error that seems to indicate that something is messed up. You're using ver. 2.6, unmodified?
|
|
When I run pelican_restarthpc at the command prompt, it returns bash: command not found. I am using PelicanHPC version 2.2.
I am test running on my cluster of virtual machines first, before I go to my cluster of physical machines. I wonder what is messed up. Could it be the OS or something else? |
|
Administrator
|
I think that your virtual frontend must not have enough memory available to it. Can you give it 1GB? Then restart, I think that will solve the problems. Also, you might want to work with ver 2.6. Ver 2.2 is getting a little old now.
|
|
Hi Michael,
I ran the test on my 5 physical machines (2 GB RAM each) and it gave me the following HPC Test results:
HPC Test
Quantity of processors = 5
Calculation time = 0.64 seconds
Cluster speed = 2821 MFLOPS
Cluster node N00 speed = 564 MFLOPS
Cluster node N01 speed = 564 MFLOPS
Cluster node N02 speed = 565 MFLOPS
Cluster node N03 speed = 564 MFLOPS
Cluster node N04 speed = 565 MFLOPS
--------------------------------------------------------------------------------------------------------
I think the problem was the available RAM on my virtual frontend. However, I am a bit lost on how to get to the next stage and see output from my compiled and run code. The MPI code I wrote was intended for the client/compute processes to send a message to the server/frontend process, which then prints them out along with a message of its own. The program uses a common paradigm where one process acts as a "server process" and the rest as "client processes", so I expected to see some other output on my screen as well, but I only saw the above. |
|
In reply to this post by Michael Creel
This is the code I have been writing, compiled and run.
// An MPI program for client processes to send a message to the server process, which then prints them out along with a message of its own.
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int size, rank;
    int length;
    char name[80];
    int i;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(name, &length);

    if (rank == 0) // Server node commands
    {
        printf("Hello MPI from the server process!\n");
        for (i = 1; i < size; i++)
        {
            MPI_Recv(name, 80, MPI_CHAR, i, 999, MPI_COMM_WORLD, &status);
            printf("Hello MPI!\n");
            printf(" mesg from %d of %d on %s\n", i, size, name);
        }
    }
    else // client or compute node commands
    {
        MPI_Send(name, 80, MPI_CHAR, 0, 999, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}
|
|
Administrator
|
What happens when you run this? It looks fine to me, I don't see any problems.
|
|
I think it is OK and not confusing anymore. I get the HPC Test output as I showed before, and when I scale the processors down to 4, 3, or 2, the calculation time increases and the cluster speed decreases, as seen in the output on my frontend screen. I will try to run the other programs I have been writing and see how they respond. At least now I know something about how to handle MPI. THANKS A LOT.
I also need to use Octave, but I am unsure how to get started. It looks like it is already installed on the OS, or do I have to install something first? I request your guidance on this. I also need to take screenshots, but I cannot find the menu for this, unlike in Ubuntu, where it is at Applications > Accessories > Take Screenshot. Where can it be found? |
|
Administrator
|
For screenshots, I use ksnapshot. It's not installed on the images, but you can add it with
sudo apt-get install ksnapshot
as long as your frontend has an internet connection. Regarding using Octave, that's a big topic. Octave is more or less like Matlab. Check the documentation online; there are some tutorials available. On PHPC, you can do
mpirun -np 3 --hostfile /home/user/tmp/bhosts octave -q --eval "kernel_example(2000, true, false)"
to see MPI and Octave in action. Learning what's going on behind that would take some time, though. |
|
Hi Michael,
Thanks for the suggestion. I understand Octave has MPI libraries. Unfortunately, after several repeated searches online today, I found that the tutorials for the MPI toolbox for Octave are inaccessible; most links give me errors continuously. I need these tutorials to learn how to use the libraries, after which it will be easier to do benchmark tests and collect results. Have any changes been made regarding the availability of this information? I request your advice and assistance on this. |
|
Administrator
|
Hi Joseph,
MPITB is no longer included with PelicanHPC, as the authors stopped developing it, and it does not work with recent versions of Octave. The tutorial is available on ParallelKnoppix, I think. ParallelKnoppix is no longer maintained, but the last image is available on the PelicanHPC web site. The MPI mechanism for Octave included on PelicanHPC is the openmpi_ext package, which is part of Octave Forge, with a page at SourceForge. There are no tutorials for this package, as far as I know, though the functions do give you help through the Octave help system. The code examples on PelicanHPC are a possibility. Sorry, but that's the state of things at the moment. |
