starting out with pdfs, MPI and kernel_example.m!

by calcpage

Thanks for all the input!  Wow, I must say, you have been the most helpful Linux distro author I've ever encountered!  I really appreciate your help, and my students will too!

OK, so I found your presentations page.  I was looking at http://www.pelicanhpc.org, but you meant http://idea.uab.es/mcreel, right?  I always used to go to http://idea.uab.es/mcreel/ParallelKnoppix/ but not lately, so I forgot that address.  I'm good now...

Now that I've booted n0-n3, I see what you mean by a single-user system.  This is the first time I've ever gotten MPI working at school, so I don't know: is this the usual setup for MPI?  I'm used to other clustering solutions, such as openMosix, where any node can be used to start a job.  I see the compute nodes have a login screen.  Can anyone log in there, using the password I created at bootup, to submit jobs?  I suppose they would actually be controlling the master node remotely and perhaps lag a bit?

I had a problem with kernel_example.  I got my 4 nodes up and ran kernel_example with no problem on the master node and saw all 3 graphs.  My tic/toc times were 0.499, 0.070 and 3.335 seconds respectively.  Then I edited kernel_example to increase the number of compute nodes.  No matter what I enter other than 0 (1, 2 or 3), Octave hangs right after the first tic/toc and before the first plot.  So I'm not getting the compute nodes to work properly, although they seem to have booted correctly as per your tutorial and ~/etc/bhosts lists them all...

Also, I'm wondering how Octave knows where kernel_example.m is, as it does not seem to matter what directory I'm in before I invoke kernel_example from Octave; it still works fine (on n0 only)...

TIA,
AJG
A. Jorge Garcia
Teacher and Professor
Math, Physics and CompSci
Baldwin SHS and Nassau CC
mailto:calcpage@aol.com
http://calcpage.tripod.com/shadowfax

Re: starting out with pdfs, MPI and kernel_example.m!

by calcpage

OK, I just got 12 compute nodes up and running fine!  I found that I had to execute "lamboot" on the master node before running kernel_example.m if I set more than 0 compute nodes in Octave.  Is the master node used for computing too, so that all 26 cores are being used, or only 24?

I see you have 4 nodes with 32 cores?  Wow, are these 64-bit cores?  How fast are they?  How does LINPACK rate your new cluster?  What were your old cluster's specs?

TIA,
AJG

Re: starting out with pdfs, MPI and kernel_example.m!

by Michael Creel

If you had to manually lamboot, then something went wrong. The pelican_setup script lamboots for you, and kernel_example should run without problems with any number of MPI ranks. You can run "pelican_restarthpc" at any time to determine how many compute nodes are ready, and to lamboot them.

You can log into the compute nodes with the username "user" and the password that you set. There is really no reason to do so, though. Since this is a single-user tool, access through the frontend node is sufficient.

How MPI ranks are distributed across the nodes of the cluster depends on how things are set up, and on the application. There is a lot of flexibility, and much of the fun of parallel programming is figuring out how to make things run efficiently. There is a lot to learn here.
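
As a rough illustration of why the answer depends on the setup, here is a toy sketch in Python (not LAM's actual scheduler) of two common placement policies: filling every CPU slot on a host before moving to the next, versus handing out one rank per host in turn. The host names follow the n0, n1, ... naming used in this thread; the CPU counts and the function place_ranks are assumptions made up for the example.

```python
from collections import Counter


def place_ranks(hosts, n_ranks, by_slot=True):
    """Return the host name assigned to each rank, in rank order.

    hosts   -- list of (name, cpus) pairs; hypothetical values, not read from bhosts
    by_slot -- True: fill each host's CPUs before moving on;
               False: one rank per host, cycling through the hosts
    """
    if by_slot:
        slots = [name for name, cpus in hosts for _ in range(cpus)]
    else:
        slots = [name for name, _ in hosts]
    # Wrap around if there are more ranks than slots.
    return [slots[rank % len(slots)] for rank in range(n_ranks)]


# Assumed layout: a frontend (n0) and two compute nodes, 2 CPUs each.
hosts = [("n0", 2), ("n1", 2), ("n2", 2)]

by_slot = place_ranks(hosts, 4, by_slot=True)
by_node = place_ranks(hosts, 4, by_slot=False)

print("by slot:", by_slot, dict(Counter(by_slot)))
print("by node:", by_node, dict(Counter(by_node)))
```

With 4 ranks on three 2-CPU hosts, the by-slot policy uses only n0 and n1, while the by-node policy touches all three hosts and puts a second rank on n0. Whether the frontend does any of the computing therefore depends on which policy is in effect and how many ranks are requested.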

My research cluster is 64-bit. These days, it doesn't make sense to use anything else for number crunching if you're buying new machines. I don't know its LINPACK numbers; I haven't tried running it.

Re: starting out with pdfs, MPI and kernel_example.m!

by calcpage

BTW, not only did I have to lamboot, but I could not get the cluster to work on eth1 for the life of me!  You can specify which Ethernet card to use on the master, but not on the compute nodes.

When the master is on eth1, the compute nodes can't find it if they are on eth0.  I tried pulling the cables out of the compute nodes to force them to use eth1, but then they got stuck during bootup at the line setting up ipconfig on eth0!

So, don't be surprised when the tech guys come to my door with pitchforks...

Thanks,
AJG

Re: starting out with pdfs, MPI and kernel_example.m!

by calcpage

BTW, I must have major lag on this LAN.  As I add more and more nodes to run kernel_example, I see a little improvement, but then I lose that improvement when I add many nodes...  So much for gigabit switched Ethernet!

Regards,
AJG

Re: starting out with pdfs, MPI and kernel_example.m!

by Michael Creel

Note that you need to edit kernel_example to make it use more nodes. Also, for a fixed number of data points, adding nodes beyond a certain point gives no further speedup. With a larger number of data points, you can get a good speedup with a larger cluster. See my paper in Computational Economics for more information on that: http://ideas.repec.org/a/kap/compec/v26y2005i2p107-128.html
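
The effect can be sketched with a simple timing model. This is only a minimal sketch in Python, assuming that a run costs a fixed serial part, a parallel part that divides evenly across the nodes, and a communication cost that grows with each extra node. The function names and all of the constants (work_per_point, serial, overhead_per_node) are invented for illustration; they are not measurements from kernel_example or from the paper.

```python
def run_time(n_points, n_nodes, work_per_point=1e-4,
             serial=0.05, overhead_per_node=0.02):
    """Modelled wall-clock seconds for one run on n_nodes >= 1 nodes.

    All constants are made-up illustration values, not measurements.
    """
    parallel_part = n_points * work_per_point / n_nodes  # work split evenly
    communication = overhead_per_node * (n_nodes - 1)    # grows with cluster size
    return serial + parallel_part + communication


def speedup(n_points, n_nodes):
    """Speedup relative to running the same problem on a single node."""
    return run_time(n_points, 1) / run_time(n_points, n_nodes)


node_counts = range(1, 13)

# Node count (1..12) that gives the largest modelled speedup for each problem size.
best_small = max(node_counts, key=lambda p: speedup(5_000, p))
best_large = max(node_counts, key=lambda p: speedup(500_000, p))

for n_points in (5_000, 500_000):
    row = ", ".join(f"{p}: {speedup(n_points, p):.2f}" for p in (1, 2, 4, 8, 12))
    print(f"{n_points} points -> speedup by node count: {row}")

print("best node count:", best_small, "(small problem),",
      best_large, "(large problem)")
```

Under these assumed constants, the small problem speeds up for the first few nodes and then slows down again as communication dominates, while the large problem keeps improving all the way to 12 nodes. Real numbers will depend on the network, the hardware, and the problem.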