password-less public-key authenticated ssh

Hi All!

I just saw the write-up PelicanHPC got in Linux Pro Magazine. Congrats! The article reminded me that I've always wanted to get Pelican to work on my cluster but have had no luck! Here's my situation:

(1) My students and I use Octave a lot, so being able to set up a cluster with MPITB would be great!
(2) We have password-less public-key authenticated ssh set up for a very simple cluster where we use bash scripts to scatter and gather fractal and povray renderings.
(3) Whenever we try PelicanHPC (or prior incarnations such as parallelKNOPPIX) we always have problems PXE booting the worker nodes after booting the liveCD on the master node. There must be a conflicting PXE server somewhere on our LAN, but the tech guy at my school swears up and down that he's isolated my classroom/lab. I'm not convinced of this, however, as I've had other conflicts when using other liveCDs (BCCD?) that use, say, DHCP to set up MPI.

Questions:

(1) Can I get PelicanHPC to work with passwordless ssh, as I already have this set up on 25 dual-core 64-bit AMD Athlons?
(2) Can I set up a PelicanHPC cluster without using PXE, say by booting all nodes with the CD? As I understand it, MPI can be set to use ssh....

Thanx in advance,
A. Jorge Garcia
Teacher and Professor Math, Physics and CompSci Baldwin SHS and Nassau CC mailto:calcpage@aol.com http://calcpage.tripod.com/shadowfax
|
|
Re: password-less public-key authenticated ssh

I haven't seen that write-up. I hope it's accurate.

PelicanHPC sets up a cluster with passwordless ssh, so the short answer is yes. I guess you want the software installed on PelicanHPC to work on your cluster. To get the Octave stuff working on your cluster, you could copy the contents of PelicanHPC's /home/user onto your cluster, then install LAM/MPI there, then run "sh setup_econometrics" in the Econometrics subdir. That will compile MPITB against your versions of LAM/MPI and Octave. As long as the Econometrics subdir is on an NFS share, things should work. There are of course many bumps in the road upon which one might trip...

As for booting every node from the CD instead of using PXE: no, this won't work. That worked with some versions of ParallelKnoppix, but it will not work with PelicanHPC. The nodes must netboot. You might try things out by just using a crossover cable to connect 2 machines, and pull out all the other cables. That way you know for sure that they're isolated, and this will let you at least see how PelicanHPC works.

Cheers, Michael
|
|
Re: password-less public-key authenticated ssh

I think I used parallelKNOPPIX in the past by simply booting several CDs. OK, maybe I'll try the crossover cable. Do I just connect 2 ethernet cards from 2 distinct PCs with an ethernet cable, or do I need a special cable or port? I'd like to get all 50 cores running, but this should get me 4.

I have the KNOPPIX DVD installed to an hdd partition, so my installation is very simple, but it includes Octave. We simply added public key authentication to every node and some batch files to split up jobs. We do not have NFS running, nor do I have MPI installed. Do you mean I can simply copy code from your CD and it should work on my setup? I've also had the Quantian DVD installed in the past, which included stuff from Octave-Forge, CRAN and openMosix.

The article is similar to your online tutorial, so it seems accurate as far as I can tell (having never gotten PelicanHPC to boot, it's hard for me to judge)....

Thanx, A. Jorge Garcia
|
|
Re: password-less public-key authenticated ssh

Yes, ParallelKnoppix would let you do that, but PelicanHPC doesn't. To do what I suggest, you need a crossover cable, which is not an ordinary ethernet cable. They are available in most computer stores.

The Octave code will work on your setup, but to compile MPITB and use Octave in parallel, you need to install MPI. Without MPI, the setup_econometrics script will crash when it tries to compile MPITB.

Ahh, ok. By the way, if you just start PelicanHPC on a single machine and then type "lamboot", you will be able to use Octave on both cores. With a crossover cable you could get a second machine into the mix. For more than that, I recommend trying to figure out what's up with netboot. When I need to set up a cluster where the normal network has a DHCP server that can't be turned off, I use my own cables and switch. Nothing like physical isolation to guarantee that you won't get interference.

Cheers, M.
|
|
Re: password-less public-key authenticated ssh

All I have to add to my system to use your examples and scripts is MPI and MPITB? Where do I download these packages, and how do I compile them? Do I need NFS too?
Sorry for being such a NOOB, but I've never been able to get MPI to work at my school.... I'd love to get this working for my students next year! TIA, A. Jorge Garcia
|
|
Re: password-less public-key authenticated ssh

If you can't figure out where to get those on your own, there's no way you'll be able to make them work! www.google.com is your friend.

You need NFS in order to use more than one machine, but MPITB will also let you use both cores of a single machine. The portability of MPI from multi-core machines to clusters to supercomputers is one of its great advantages.
|
|
Re: password-less public-key authenticated ssh

OOPS, sorry about that, I got a bit frustrated.... I'll have to go to my friendly neighborhood Radio Shack to get the crossover cables (a radio/computer/general technology hobbyist store we have here near New York City). This cable attaches to the Ethernet ports, right?

Anyway, I had a copy of your 64-bit PelicanHPC 1.7 CD in my desk from when I last tried out your distro several months ago. So, I tried it out again. It seems to boot OK on one of my nodes (a PC with a 64-bit dual-core AMD Athlon running at 2GHz per core). So, I tried to PXE boot again. I don't see the conflicting servers on my LAN anymore, so that's progress! Finally, the tech guy listened to me and got it right, I suppose!

I'm confused about one thing. Node n0 sets up a static IP 10.11.12.1, and the pelican_start script then seems to look for nodes at 10.11.12.2, 10.11.12.3, etc. How do I get the worker nodes to have these IPs before the cluster is set up? If the worker nodes don't have these IPs, how will pelican_start find them with PXE?

Confused, A. Jorge Garcia
|
|
Re: password-less public-key authenticated ssh

No problem. Attempting to get an MPI cluster working from scratch is going to be complicated, so you'll need to collect a lot of information to make it work.

The crossover cable looks just like a regular ethernet cable, but lets you directly connect 2 computers without putting a switch or router in between. Radio Shack will certainly have them.

The frontend node acts as a DHCP server, and passes out IP addresses to any computer that asks for one. That's the reason why it is important to isolate the cluster from other networks. Otherwise, the PelicanHPC DHCP server will pass out 10.11.12.x addresses to computers that don't want them, and your compute nodes may get their IPs from the other server instead of from the PelicanHPC frontend node.

Don't worry about the worker nodes, just netboot them as described in the tutorial. As long as they are set to netboot (you might need to go into BIOS setup to enable that) things should be ok. Good luck!
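For anyone following along, here is a minimal Python sketch (not part of PelicanHPC) of one way to sanity-check where a machine's lease came from. It assumes the cluster network is 10.11.12.0/24, based on the 10.11.12.1, 10.11.12.2, ... addresses that pelican_start reports in this thread; the netmask and the function name `in_cluster_subnet` are assumptions for illustration.

```python
import ipaddress

# Assumption: a /24 around the frontend's 10.11.12.1. The thread shows the
# addresses but never states the netmask.
CLUSTER_NET = ipaddress.ip_network("10.11.12.0/24")


def in_cluster_subnet(addr: str) -> bool:
    """Return True if addr lies inside the assumed PelicanHPC subnet."""
    return ipaddress.ip_address(addr) in CLUSTER_NET


# A node holding an address outside the subnet most likely got its lease
# from some other DHCP server on the LAN.
for addr in ("10.11.12.2", "192.168.1.57"):
    verdict = "inside" if in_cluster_subnet(addr) else "outside"
    print(f"{addr} is {verdict} {CLUSTER_NET}")
```

If a compute node comes up with an address outside that range, another DHCP server answered first, which is exactly the conflict described above.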
|
|
Re: password-less public-key authenticated ssh

OK, I wasn't sure if you'd know Radio Shack as you are in Barcelona, right?

Anyway, I did set the worker up to netboot on my gigabit switched ethernet LAN (no crossover cable), and node 0 seems to be assigning a ton of IPs in a row, but my worker node is never found.... I guess I'll try to netboot with a crossover cable next. However, I really would like to get all 50 cores into a PelicanHPC cluster!

Thanx for your help, A. Jorge Garcia
|
|
Re: password-less public-key authenticated ssh

I am originally from the U.S., and I used to shop at Radio Shack for parts to build guitar pre-amps, etc.

Seeing the frontend scan over a number of IPs is normal: it's looking to see how many compute nodes have connected. Have you checked that the compute node is set so that network boot is enabled and has precedence over boot from hard disk? You might need to enter BIOS setup to set that, or hitting the F12 key while booting the compute node also works on many machines.
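To make the idea of "scanning over a number of IPs" concrete, here is a rough Python sketch. It is not how pelican_start is implemented (the thread does not show that); it only illustrates probing the addresses that follow the frontend's 10.11.12.1 and counting the ones that answer. The function names are hypothetical, and `ping_once` assumes the Linux `ping` flags `-c` (count) and `-W` (timeout in seconds).

```python
import ipaddress
import subprocess


def candidate_nodes(frontend="10.11.12.1", count=5):
    """Return the `count` addresses that follow the frontend's address."""
    base = ipaddress.ip_address(frontend)
    return [str(base + offset) for offset in range(1, count + 1)]


def ping_once(addr):
    """Send one ICMP echo request; True if the host answered (Linux ping)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", addr],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def reachable_nodes(addrs, probe=ping_once):
    """Return the addresses for which the probe reports success."""
    return [addr for addr in addrs if probe(addr)]
```

Calling `reachable_nodes(candidate_nodes())` on the frontend would list the compute nodes that have booted and answer pings. An empty list matches the "0 compute nodes" symptom and points at netboot or the network rather than at the scan itself.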
|
|
Re: password-less public-key authenticated ssh

I did everything you said! OK, this is exactly what I did to try to netboot over the LAN (no crossover cable as of yet):

(1) boot master node with a 64-bit PelicanHPC 1.7 CD
(2) run the pelican_start script and get up to the screen where you hit "yes" if all compute nodes are found or "no" to find more compute nodes (0 compute nodes at this point)
(3) set up BIOS for net boot on a compute node
(4) initiate PXE boot on compute node
(5) hit "no" several times on master node but still get 0 compute nodes!

Sometimes several IPs are listed on the master node, something like: master 10.11.12.1 cannot reach 10.11.12.2, 10.11.12.3, etc.

That's it. I was trying to connect only 2 PCs over the LAN (one master node and one compute node)! However, eventually the PXE boot times out on the compute node and it ends up booting the CD. To add insult to injury, not that it matters really, both PCs are right next to each other on the LAN and still cannot connect as a PelicanHPC cluster???

BTW, should I try the latest version of your CD instead? I think that is version 1.8, right? Also, just curious, do you find 64-bit performance significantly improved over 32-bit? I know that a 64-bit architecture will have larger maxint and memory address spaces, but is there a gain in throughput as well?

TIA, A. Jorge Garcia
|
|
Re: password-less public-key authenticated ssh

It's a little hard to say what the problem is. Do your machines have multiple ethernet ports? If the compute node never even gets an IP address, then this suggests that there is some problem with the network connections. I recommend trying a crossover cable. That way, you will be sure that the network is not the problem. After that, the node should get an IP address, at least. If it doesn't, then I can't say what the problem is; perhaps your network card is exotic and is not recognized correctly.

It wouldn't hurt to get the latest version of PelicanHPC, which is 1.9. There is some additional support for network cards that only have nonfree drivers, compared to v1.7. This is mostly untested, though.
|
|
|
Re: password-less public-key authenticated ssh

Thanx for the input and your patience! I don't mean to be such a burden, but Octave + MPI + 50 cores would be great for my students!
Yes, we have 2 ethernet cards and I tried both. OK, I'll get the crossover cable and version 1.9 of pelicanHPC and try again next week in the lab. Thanx, A. Jorge Garcia
|
|
Re: password-less public-key authenticated ssh

Hi,

Note that when you have 2 ports on each machine, there are 4 ways to connect the cable to try out. The frontend node lets you choose the net device to use, but you need to know which port is eth0 and which is eth1. Determining that may most easily be done by experimentation. For the compute nodes, setting up netboot for both net devices should ensure that it will work on one of them.

Good luck, I hope it works. I think you will find Octave with MPITB pretty neat if you can make it work.
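The 4 combinations are just every pairing of a frontend port with a compute-node port. A small Python sketch that lists them as a checklist to work through (the interface names eth0 and eth1 are the ones used in this thread; the function name is made up for illustration):

```python
from itertools import product

PORTS = ("eth0", "eth1")  # interface names as used in the thread


def cable_combinations(frontend_ports=PORTS, node_ports=PORTS):
    """Return every (frontend port, compute-node port) pairing to try."""
    return list(product(frontend_ports, node_ports))


for attempt, (front, node) in enumerate(cable_combinations(), start=1):
    print(f"attempt {attempt}: frontend {front} <-> compute node {node}")
```

Working through the list in order and noting which attempt lets the compute node netboot also tells you which physical port is which.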
|
|
Re: password-less public-key authenticated ssh

I give you joy sir! A thing of beauty it is, this PelicanHPC CD! Success! I made a crossover cable and got 2 nodes running properly, finally!

There were 2 problems. Obviously there's a problem on the LAN if I can't get this working without a crossover cable. Also, my isolated LAN is on eth1 and the DHCP service seems to be on eth0?

I got kernel_example.m to run. Now I have to read up on more of your documentation to see what all your examples do and how to monitor activity on the nodes. For example, I don't know if it used one or two nodes, or if one or two cores were used per node.... Well, that's progress anyway! Wow, lamboot, NFS and Octave+MPITB by default! I've died and gone to heaven!

Thanx for your help, AJG A. Jorge Garcia
|
|
Re: password-less public-key authenticated ssh

Great! That means that your nodes do network boot and the hardware is supported by PelicanHPC. If a single node boots with a crossover cable, then the other nodes should boot over the LAN, as long as the LAN actually connects the frontend to the nodes and there is no interfering DHCP/TFTP server on the LAN.
In principle, you seem to have selected the proper net device on the frontend, because the node booted. You should not need to worry about ethernet ports, because the setup you are using is working. Just repeat that with the other nodes, and make sure the LAN is isolated. Are you perhaps using a router that assigns IPs? DHCP running on anything other than the PelicanHPC frontend (router, other server, etc) can mess things up. |
|
|
Re: password-less public-key authenticated ssh

Now, the problem is that when I hook my LAN back up, eth0 is not isolated from the rest of the school. There's all kinds of conflicting ftp, dhcp, dns, you_name_it servers on there! Is there a way for Pelican to default to eth1?

Yes, at first I did not have a crossover cable, so I tried to disconnect 2 or 3 nodes from the LAN and use my own router. However, the router itself had some sort of DHCP server running, so that didn't work either! I'm burning Pelican 1.9 even as I write this. Let me try that.

Thanx, AJG A. Jorge Garcia
|
|
Re: password-less public-key authenticated ssh

If the nodes of the cluster get IPs from other sources, there is no way you can make a cluster. In your first message you indicated that the LAN was isolated from other servers. If in fact it is not, then your problems are simply due to conflicts, and are entirely to be expected. You should also NOT use PelicanHPC on an open network, because the PelicanHPC DHCP server will mess up everyone else's ability to properly connect to their servers, and you will gain the ire and wrath of your network administrator.

So, I'm pretty sure your troubles with setting up a large number of nodes are due to an open network. The fact that it worked with a crossover cable means that the machines play nice with PelicanHPC. If your lab is connected to the rest of the world with a single cable, just pull that cable and you should be in business. Good luck!
|
|
Re: password-less public-key authenticated ssh

OK, the thing is this. My classroom has 25 PCs, each with 2 ethernet cards. I set these PCs up to dual boot WIMxP and Linux. When a teacher comes in the class and has students log in to WIMxP, they use eth0 and have access to school servers and the internet. When my kids come in, they boot Linux and then the PCs use eth1, which connects to the isolated network. So, can PelicanHPC be set up to use eth1 only?
As far as connecting to the rest of the world with eth0 over a single cable, I wish I knew where it was.... Thanx, AJG A. Jorge Garcia
|
|
Re: password-less public-key authenticated ssh

It's not possible to just pull out the cables on each PC that connect to eth0 and then re-connect them when you're done? If not, set your nodes so that they netboot only on the card that connects to the isolated LAN, and give that precedence over other boot methods. On the frontend node, you can select the net device to use for the cluster, when you see

[screenshot: net device selection prompt on the frontend node]

As long as that net device on the frontend node is on an isolated network, you will not interfere with the rest of the school's network.