Ray Van Dolson-3 wrote:
I'm the proud new owner of an IBM N3600 A20 (rebranded FAS2050C) with
20x30GB SAS disks.
I'm trying to determine the best way to get this thing set up, and
realized I have only a bit of a fuzzy understanding as to how the
clustering or failover filer head should work.
We have the exact same unit, except NetApp branded and with 147GB disks. No extra shelves at this stage either.
It does take a bit of time to get your head around how cluster configurations work with the NetApp filers if you've never dealt with them before. It sounds like quite a few of your questions have been answered already by others in this thread, but I thought it may be useful to add our experience and decisions on how we set things up.
Firstly, regarding clustering - the best way to think of the 2050C is as two separate filer heads that each have their own hardware, own disks, own spares, own OS, own config, etc, but are capable of "taking over" the disks, volumes, exports, IP addresses, etc, from one another in the event that one of them dies.
So, as you've discovered, each filer does need to have dedicated disks assigned to it, you can't share network cards across the filers, etc, but you CAN expect them to seamlessly and automatically fail over to each other in the event of a problem (and this process works well - we have simulated it several times). The same fail-over functionality lets you non-disruptively upgrade to new Data ONTAP versions as well (we recently upgraded from 7.2.4L1 to 7.3.1 without any disruption to the data being served up to our VM hosts).
In our case, we decided to mimic an "active/passive" setup as closely as possible by allocating 17 disks to the first head (16 disk RAID-DP with 1 spare), and 3 disks (2 disk RAID-4 with 1 spare) to the second head. In this setup, all of our data is stored on and served from the first head - the second head does nothing except sit there watching for a failure of the first head, and takes over if this were to occur.
The logic behind this choice was:
- to maintain a reasonably good-practice setup (RAID-DP + hotspares) we would "lose" more disks to parity and spares by splitting them evenly across both heads than by putting as many disks as possible on one head - a 9-disk RAID-DP group + 1 hotspare on each head is 6 disks "lost", whereas a 16-disk RAID-DP group + 1 hotspare & a 2-disk RAID-4 group + 1 hotspare is 5 disks "lost",
- we could reduce this further to 4 by removing the hotspare from the second head,
- it gave us more active spindles on the head serving the data (and hence higher performance),
- it would be easier to expand from - e.g. if we bought a shelf of SATA disks it would be much easier to implement them efficiently.
There are some cons - namely less usable cache available and no distribution of CPU load across the two heads - but for our workload we felt the pros outweighed these cons.
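For anyone who wants to check the disk counts in the list above, here is a rough Python sketch of the arithmetic. It only assumes that a RAID-DP group uses two parity disks and a RAID-4 group uses one; the layout names and the `overhead` helper are made up for illustration and have nothing to do with any NetApp tool.

```python
# Parity disks per RAID group: RAID-DP uses two, RAID-4 uses one.
PARITY_DISKS = {"raid_dp": 2, "raid4": 1}


def overhead(layout):
    """Count disks that hold no data: parity disks plus hot spares.

    layout is a list of (raid_type, group_size, spares) tuples,
    one tuple per filer head.
    """
    return sum(PARITY_DISKS[raid] + spares for raid, _size, spares in layout)


# Hypothetical names for the three layouts discussed above.
even_split = [("raid_dp", 9, 1), ("raid_dp", 9, 1)]
active_passive = [("raid_dp", 16, 1), ("raid4", 2, 1)]
active_passive_no_spare = [("raid_dp", 16, 1), ("raid4", 2, 0)]

for name, layout in [
    ("even split", even_split),
    ("active/passive", active_passive),
    ("active/passive, no spare on head 2", active_passive_no_spare),
]:
    print(f"{name}: {overhead(layout)} disks lost to parity and spares")
```

This counts overhead only; it says nothing about where the root volumes live or how much usable space each head ends up with.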
Also, to touch on your network cards - do you know that you need 2 Gbps of bandwidth, or are you just thinking the more the merrier? If it's the latter, you may want to measure your network usage before going to great lengths to obtain that extra bandwidth.
We have each of the two NICs on each head in our 2050 connected to a different switch and then joined into a single-mode VIF, so we get switch fail-over with total bandwidth of 1Gbps per head. Average bandwidth usage on the main head is < 10MB/s while serving up VMDKs for 30-odd ESX VMs over NFS. Of course it does spike higher, but pretty much the only times I've seen it max out the network connection are when doing NDMP back-ups to a separate backup proxy, and once or twice while doing mass-patching of our Virtual Machines during a maintenance window (so we had 30-odd Windows VMs all trying to install, say, .NET 3.5 at the same time).
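To put those numbers in perspective, here is a back-of-the-envelope Python sketch of link utilisation. It assumes "MB/s" means megabytes per second, uses decimal units, and ignores protocol overhead; the function names are purely illustrative.

```python
def link_capacity_mb_per_s(gbps):
    """Convert a link speed in Gbps to MB/s (decimal units, 8 bits per byte)."""
    return gbps * 1000 / 8


def utilization(avg_mb_per_s, gbps):
    """Fraction of the link used at a given average throughput."""
    return avg_mb_per_s / link_capacity_mb_per_s(gbps)


# 10 MB/s average on a single 1 Gbps link (the figures quoted above).
print(f"link capacity: {link_capacity_mb_per_s(1):.0f} MB/s")
print(f"average utilisation: {utilization(10, 1):.0%}")
```

On those assumptions the average load is well under a tenth of a single gigabit link, which is why measuring first is worth the effort.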
Hope that helps! If you have any questions feel free to post them - I went from knowing nothing about SANs (at all) to selecting and purchasing one to having it running in production in the space of a couple of months - so I know what the learning curve feels like!! :-)
Cheers,
Matt