aggregating disks across multiple machines


by Michael James-6

I've been asked to recommend a setup for a group of high end workstations.
They each have dual 4 core processors, 32 Gig of ram and 2 x 1.5 TB disks.
Nice.

At present each machine has a separate 1.5TB /home and /data partition.
Bad:
        no redundancy, data will be lost from a single disk failure.
        files will be copied around as jobs dictate => confusion and waste.
        when partitions start to fill up, files will get put where they fit
                not where they should go => even greater confusion.
       
A much better solution would be to aggregate disks
  into a single, large, duplicated, data warehouse
  and have it accessible to all machines.

In the past I'd have moved all the disks to 2 machines,*
  raided them into 2 disk packs,  a master and backup,
  NFS mounted the master on all machines,
  and set up a nightly rsync to refresh the backup.

Nowadays would it be better to use Lustre?

Or is there an updated distributed NFS?
One that can maintain multiple copies of an NFS data repository
  and cache a file locally when needed,
  reflecting changes back to the master when necessary.

Or should I look at a global file system?

Which one?

TIA,
michaelj

--
Well theme my emoticons disgusted. What has Linux come to?
Michael James

* Yes, I'd find or buy some small disks for the OS on the stripped machines.
--
linux mailing list
linux@...
https://lists.samba.org/mailman/listinfo/linux

Re: aggregating disks across multiple machines

by Daniel Pittman

Michael James <michael@...> writes:

> I've been asked to recommend a setup for a group of high end workstations.
> They each have dual 4 core processors, 32 Gig of ram and 2 x 1.5 TB disks.
> Nice.
>
> At present each machine has a separate 1.5TB /home and /data partition.
> Bad:
> no redundancy, data will be lost from a single disk failure.
> files will be copied around as jobs dictate => confusion and waste.
> when partitions start to fill up, files will get put where they fit
> not where they should go => even greater confusion.

Mmmm.  Some of that sounds like a user training and control issue, not a disk
layout issue, to me.  Assuming y'all do provide a central server:

1. Staff should keep only working copies locally, the master should be
   elsewhere, ideally in a VCS style "edit, then commit" arrangement.

   Failing that, provide a central repository and have them work there.

2. Staff shouldn't have (easy) access to store stuff in the "wrong" place on
   their machine.


> A much better solution would be to aggregate disks into a single, large,
> duplicated, data warehouse and have it accessible to all machines.

...maybe.  It depends a lot on how your data is used; given the size of the
workstation I am *guessing* that your workload involves a lot of data pounding
on the workstation.

Pushing that to a central machine implies that you need to deliver N * <the
size of your group> random IOPS on that machine, instead of just N random IOPS
per machine.

That probably isn't cost-effective for you.
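
As a rough illustration of that arithmetic, here is a minimal Python sketch. The figures (8 workstations, 150 random IOPS each) are made-up assumptions, not numbers from this thread; substitute your own measurements.

```python
def central_iops_needed(per_workstation_iops: int, workstations: int) -> int:
    """Random IOPS one shared server must sustain if every
    workstation's disk load moves onto it unchanged."""
    return per_workstation_iops * workstations


# Hypothetical figures, for illustration only.
PER_WORKSTATION_IOPS = 150
WORKSTATIONS = 8

if __name__ == "__main__":
    total = central_iops_needed(PER_WORKSTATION_IOPS, WORKSTATIONS)
    print(f"{WORKSTATIONS} workstations x {PER_WORKSTATION_IOPS} IOPS "
          f"= {total} random IOPS on the central server")
```

The load on the central box grows linearly with the size of the group, while the local-disk layout keeps each machine at its own N.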

> In the past I'd have moved all the disks to 2 machines,* raided them into 2
> disk packs, a master and backup, NFS mounted the master on all machines, and
> set up a nightly rsync to refresh the backup.
>
> Nowadays would it be better to use Lustre?

Probably not.

> Or is there an updated distributed NFS?

GlusterFS is probably the best bet, but I suspect it will not do exactly what
you want.

> One that can maintain multiple copies of an NFS data repository and cache a
> file locally when needed, reflecting changes back to the master when necessary.
> Or should I look at a global file system?

Almost certainly not, IMO.  The cost, in performance and complexity, is likely
to outweigh *any* benefit you get from a shared namespace.


I strongly suspect, given the nature of the machines, that your best bet is
some sort of VCS working process:

The user grabs a local, scratch copy of the data they need to work with.
They mutate it, or whatever.
They push the results back to the central store.

Ideally, use a real VCS, but I /bet/ none of them scale to your needs.  That
means a standard, relaxed, "talk to each other" locking protocol. ;)
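
The three steps above could be sketched in Python roughly as follows. This is only an illustration under assumptions: the function names `checkout` and `publish`, and the idea of publishing results under a new name, are hypothetical and not something described in this thread. It does no real locking, so the "talk to each other" protocol still applies.

```python
import shutil
from pathlib import Path


def checkout(central: Path, scratch: Path, dataset: str) -> Path:
    """Grab a local scratch copy of one dataset from the central store."""
    work = scratch / dataset
    shutil.copytree(central / dataset, work)
    return work


def publish(work: Path, central: Path, result_name: str) -> Path:
    """Push results back to the central store under a new name.

    Refuses to overwrite an existing entry, so two users do not
    silently clobber each other's results.
    """
    dest = central / result_name
    if dest.exists():
        raise FileExistsError(f"{dest} already exists in the central store")
    shutil.copytree(work, dest)
    return dest
```

The original dataset in the central store is never modified in place; only the scratch copy is mutated.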


I would probably use RAID-1 for the OS partition, from which the users are
locked out, and RAID-0 the scratch partition they work on.

Then, ensure you have a backup solution to slow, boring, cheap disk that sucks
up those scratch disks at least once a day, deduplicates them, and keeps them
accessible for when a disk fails, or when a user fails, and local data is
lost.

(Plus, obviously, good backups for your central repository ;)
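
One common way to approximate a daily, deduplicating backup is rsync with `--link-dest`, which hard-links files unchanged since the previous snapshot instead of copying them again. The sketch below only builds the command line. The host name, the `/scratch` and `/backup` paths, and the dated directory layout are assumptions for illustration, and this is not necessarily the tool the advice above had in mind.

```python
from datetime import date
from pathlib import PurePosixPath
from typing import List, Optional


def snapshot_command(host: str, source: str, backup_root: str,
                     today: date, previous: Optional[date] = None) -> List[str]:
    """Build an rsync command that pulls host:source into a dated
    snapshot directory under backup_root/host/."""
    dest = PurePosixPath(backup_root) / host / today.isoformat()
    cmd = ["rsync", "-a"]
    if previous is not None:
        # Unchanged files become hard links into the previous snapshot.
        prev = PurePosixPath(backup_root) / host / previous.isoformat()
        cmd.append(f"--link-dest={prev}")
    cmd += [f"{host}:{source.rstrip('/')}/", f"{dest}/"]
    return cmd


if __name__ == "__main__":
    print(" ".join(snapshot_command("ben", "/scratch", "/backup",
                                    date(2009, 10, 27), date(2009, 10, 26))))
```

Run from cron on the backup machine, each day's directory looks like a full copy but only changed files consume new space.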

        Daniel

--
✣ Daniel Pittman            ✉ daniel@...            ☎ +61 401 155 707
               ♽ made with 100 percent post-consumer electrons
   Looking for work?  Love Perl?  In Melbourne, Australia?  We are hiring.

Re: aggregating disks across multiple machines

by Andrew Janke

> 1. Staff should keep only working copies locally, the master should be
>   elsewhere, ideally in a VCS style "edit, then commit" arrangement.

Except in the case where the data is enormous. In what we do for
example I can generate 1TB in 24 hours without any sweat.

>> A much better solution would be to aggregate disks into a single, large,
>> duplicated, data warehouse and have it accessible to all machines.
>
> ...maybe.  It depends a lot on how your data is used; given the size of the
> workstation I am *guessing* that your workload involves a lot of data pounding
> on the workstation.

+1

> Pushing that to a central machine implies that you need to deliver N * <the
> size of your group> random IOPS on that machine, instead of just N random IOPS
> per machine.
>
> That probably isn't cost-effective for you.

+1

> Almost certainly not, IMO.  The cost, in performance and complexity, is likely
> to outweigh *any* benefit you get from a shared namespace.

In these cases, given that a single user is usually on one machine, I
set up a bunch of AutoFS cross mounts and put each user's data on their
own machine (backed up to another via rsync at night). Then if you
are running some sort of clustering/batch processing thing you get the
best of both worlds. I would be VERY cautious about using a
distributed file system, as you only need one user to crash X when
npviewer bombs on some webpage, or a kernel panic by some other means,
and you have a hung filesystem across all the machines. Or it takes
only one machine's RAM/HDD dying for everything to go down.

So I define a mountpoint called /giglo and set up the extra space
(locally mounted as /export{01,02,03}) as such:

murdoch:~$ cat /etc/exports
/export01   10.10.100.0/255.255.255.0(rw,sync,subtree_check)

murdoch:~$ cat /etc/auto.master
# /etc/auto.master

/giglo            /etc/AutoFS/auto.giglo     -fstype=nfs,hard,intr,nodev

murdoch:~$ cat /etc/AutoFS/auto.giglo
#!/usr/bin/perl
#
# /etc/AutoFS/auto.giglo
#
# Andrew Janke - a.janke@...
#
# Executable automount map: AutoFS runs it with the lookup key
# (e.g. "ben-02") as its only argument.
#
# put something like this in auto.master
# /giglo    /etc/AutoFS/auto.giglo     -fstype=nfs,hard,intr,nodev

use strict;
use warnings;
use Socket;

# static vars
my $mount_options = "-fstype=nfs,hard,intr,nodev";
# not referenced by the inet_aton lookup below as written
my $giglo_net = "10.10.100";
my $domain = "lclust";

my $input = shift(@ARGV);
exit 1 unless defined $input && length $input;

# split the server and number
my ($server, $drive_num) = split(/-/, $input, 2);
exit 1 unless defined $server && length $server;

# add default drive (01) if missing
$drive_num = '01' if !defined($drive_num) || length($drive_num) == 0;

# reality check MKI: the drive number must be exactly two digits
exit 1 if $drive_num !~ /^\d\d\z/;

# get and clean up the IP address of the requested server
# chomp($server_ip = `dig +short $server.$domain`);
my $packed_ip = inet_aton($server);

# reality check MKII: inet_aton returns undef if the name does not resolve
exit 1 unless defined $packed_ip;
my $server_ip = inet_ntoa($packed_ip);

# output the resulting mount line
print "$mount_options\t$server_ip:/export$drive_num\n";

Note that you will have to modify the "giglo_net" to your own IP
address space. I use giglo as I tend to run a second gigabit local
network among the machines to segregate NFS and other traffic.

So then if you have two machines called bill and ben, on all machines
you can access their /export01 drive as such:

murdoch:~$ ls /giglo/ben-01

or

murdoch:~$ ls /giglo/ben-02

for /export02

So users can still write their scripts with consistent mountpoints and
AutoFS will sort things out on the local machine.
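
For anyone who wants to check the key-to-mount mapping without a running AutoFS, here is a rough Python rendering of the same logic. It is a sketch, not a replacement for the Perl map: the `map_entry` name and the injectable `resolve` parameter are additions for testing, and any IP addresses you feed it are your own.

```python
import re
import socket
from typing import Callable, Optional

MOUNT_OPTIONS = "-fstype=nfs,hard,intr,nodev"


def map_entry(key: str,
              resolve: Callable[[str], str] = socket.gethostbyname
              ) -> Optional[str]:
    """Turn a key such as "ben-02" into an automount entry,
    or return None if the key is malformed or the host is unknown."""
    server, _, drive = key.partition("-")
    drive = drive or "01"          # default drive if none was given
    if not server or not re.fullmatch(r"[0-9]{2}", drive):
        return None
    try:
        ip = resolve(server)       # socket errors are subclasses of OSError
    except OSError:
        return None
    return f"{MOUNT_OPTIONS}\t{ip}:/export{drive}"
```

Given a resolver that maps ben to some address on the NFS network, `map_entry("ben-02")` yields the options, a tab, and `<that address>:/export02`, which is what AutoFS expects on stdout from the map.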


--
Andrew Janke
(a.janke@... || http://a.janke.googlepages.com/)
Canberra->Australia    +61 (402) 700 883

Re: aggregating disks across multiple machines

by Daniel Pittman

Andrew Janke <a.janke@...> writes:

>> 1. Staff should keep only working copies locally, the master should be
>>   elsewhere, ideally in a VCS style "edit, then commit" arrangement.
>
> Except in the case where the data is enormous. In what we do for
> example I can generate 1TB in 24 hours without any sweat.

I should perhaps have said "mutate, then save" or something, since I imagine
that you either produce smaller results after processing that data, or you
have an entirely different scale of problem to the OP. :)

        Daniel


Re: aggregating disks across multiple machines

by Michael James-6


On 26/10/2009, at 9:36 PM, Andrew Janke wrote:

> In these cases given that a single user is usually on one machine
> I set up a bunch of AutoFS cross mounts
> and put each user's data on their own machine
> (and backed up to another via rsync at night).

You're dead right, and this solution is achievable.
I'll study those scripts and configurations carefully.

Thanks,
michaelj


Re: aggregating disks across multiple machines

by Paul Wayper

On 26/10/09 17:49, Daniel Pittman wrote:

> Michael James<michael@...>  writes:
>
>> I've been asked to recommend a setup for a group of high end workstations.
>> They each have dual 4 core processors, 32 Gig of ram and 2 x 1.5 TB disks.
>> Nice.
>>
>> At present each machine has a separate 1.5TB /home and /data partition.
>> Bad:
>> no redundancy, data will be lost from a single disk failure.
>> files will be copied around as jobs dictate =>  confusion and waste.
>> when partitions start to fill up, files will get put where they fit
>> not where they should go =>  even greater confusion.
>
> Mmmm.  Some of that sounds like a user training and control issue, not a disk
> layout issue, to me.  Assuming y'all do provide a central server:

I think Michael was talking about using both machines as some kind of
distributed storage, rather than a 'central server'.  I want to find out about
this too.  The key problem is that a lot of the cluster storage that works
like this assumes that each machine is accessing the same backend store.  This
is convenient for those that have InfiniBand or Fibre Channel cards lying
around and SAN units sitting in their cupboards, but for those of us with just
standard machines I haven't found any obvious candidates.

Anyone seen something that makes a bunch of disks spread across multiple
machines act like a big communal block device?

Have fun,

Paul

Re: aggregating disks across multiple machines

by Daniel Pittman

Paul Wayper <paulway@...> writes:

> On 26/10/09 17:49, Daniel Pittman wrote:
>> Michael James<michael@...>  writes:
>>
>>> I've been asked to recommend a setup for a group of high end workstations.
>>> They each have dual 4 core processors, 32 Gig of ram and 2 x 1.5 TB disks.
>>> Nice.
>>>
>>> At present each machine has a separate 1.5TB /home and /data partition.
>>> Bad:
>>> no redundancy, data will be lost from a single disk failure.
>>> files will be copied around as jobs dictate =>  confusion and waste.
>>> when partitions start to fill up, files will get put where they fit
>>> not where they should go =>  even greater confusion.
>>
>> Mmmm.  Some of that sounds like a user training and control issue, not a disk
>> layout issue, to me.  Assuming y'all do provide a central server:
>
> I think Michael was talking about using both machines as some kind of
> distributed storage, rather than a 'central server'.  I want to find out about
> this too.  The key problem is that a lot of the cluster storage that works
> like this assumes that each machine is accessing the same backend store.  This
> is convenient for those that have InfiniBand or Fibre Channel cards lying
> around and SAN units sitting in their cupboards, but for those of us with just
> standard machines I haven't found any obvious candidates.
>
> Anyone seen something that makes a bunch of disks spread across multiple
> machines act like a big communal block device?

Yeah: GlusterFS.  It does exactly this, and is almost certainly what you
want.  The alternatives tend to look like Hadoop or so: a dedicated storage
solution for a data processing system, not a filesystem.

You would probably want unify[1], and perhaps the BDB-backed store[2], for
this; perhaps AFR[3] if you really felt enthused, but it is still not quite
where I would like a replicated storage device to be.

Apparently, though, the latest release hides all that behind a sane
interface.  Go, GlusterFS developers.

        Daniel

Footnotes:
[1]  Single namespace over multiple machines.

[2]  Stores small files in a BDB spool, excellent for many small files, still
     looks like a POSIX filesystem to the client.

[3]  Mirroring, basically.
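
To put rough numbers on the choice between [1] and [3], here is a small Python sketch of usable capacity. The 2 x 1.5 TB per machine comes from the original post; the machine count of 4 is a made-up assumption, and treating a single namespace as the plain sum of all disks, with mirroring dividing that by the number of copies, is a simplification that ignores filesystem overhead.

```python
def usable_tb(machines: int, disks_per_machine: int = 2,
              disk_tb: float = 1.5, copies: int = 1) -> float:
    """Approximate usable space when all disks are pooled and every
    file is stored `copies` times (1 = no mirroring)."""
    raw_tb = machines * disks_per_machine * disk_tb
    return raw_tb / copies


if __name__ == "__main__":
    machines = 4  # hypothetical group size
    print("single namespace only:", usable_tb(machines), "TB")
    print("with 2-way mirroring: ", usable_tb(machines, copies=2), "TB")
```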


Re: aggregating disks across multiple machines

by Carlo Hamalainen

On Wed, Oct 28, 2009 at 9:15 AM, Paul Wayper <paulway@...> wrote:
> Anyone seen something that makes a bunch of disks spread across multiple
> machines act like a big communal block device?

So basically RAID + NFS?
http://motomastyle.com/network-raid-storage-proof-of-concept/

--
Carlo Hamalainen
http://carlo-hamalainen.net

Re: aggregating disks across multiple machines

by Matt Hope

On Mon, Oct 26, 2009 at 16:51, Michael James <michael@...> wrote:
> A much better solution would be to aggregate disks
>  into a single, large, duplicated, data warehouse
>  and have it accessible to all machines.

In the past I've used PVFS [http://www.pvfs.org/] to do exactly this.