Some thoughts and questions on A-sis


Some thoughts and questions on A-sis

by Adam McDougall

I started testing sis on one of our 3040s last weekend.  Sorry if
there's more extensive information on NOW, but the things I've found so
far were fairly rudimentary.  Some impressions and questions:

I found out quickly that my 5T test volume was too big.  Looked up the
limits because I had no idea there were volume size limits.  3T for 3040
hmm.  I have an existing volume that holds 2.5T, so that is pushing it,
and while I would like to split it up, it won't happen overnight.  I'm
not desperate to use sis on it, although I am almost done copying it and
so far have realized 26% savings from dedupe for testing.  I wonder why
the max size scales with the system model, I haven't thought of a good
reason for this yet since you could easily have lots of 3T volumes.  I
was wondering if you can get sis to shrink say 2.8T to 2.0, is the 2.0
what counts for the sis volume limit size?  Even so, would I run into
trouble if trying to do a full volume restore and the full data set is
over 3.0 but would have shrunk?  Once I write over 3T into a volume, it
sounds like I cannot run sis on it.  I'm not too worried about having to
do a full restore of a volume from tape, but I don't want to limit my
future options if the short term payoff isn't worth it.  I'm pretty sure
I'll use it on all or most of all of my other volumes since they are
much smaller.

I noticed during my data copy, the snapshot reserve was blowing out big
time.  It seemed like almost any savings from sis went into snapshots
for some reason.  100, 200, 400% full, and it started trimming itself
back after a while.  Since this was a non-production copy, I didn't
care, and deleted those snapshots this morning, but I thought that was
rather odd, and couldn't think what the snapshots would have to offer
since if the data was unmodified at the filesystem level, the snapshot
should contain the same data.  While not a problem for an initial copy,
I would expect the same thing to happen when sis runs during production,
and although it wouldn't be as large and would eventually flush out, why
am I expending space to store duplicate copies of data that I asked it
to deduplicate? :) Maybe it's just because WAFL identifies them as
"changed blocks" and insists on storing them in the snapshot.

A 1 gig file of zeros still takes several tens of megabytes after sis.  
hmm :) And roughly twice as much for a second copy of it.

RE: Some thoughts and questions on A-sis

by Kevin M. Parker

>>3T for 3040 hmm.  I have an existing volume that holds 2.5T, so that is
>>pushing it
Why's that pushing it? sis should decrease the amount used in that volume, as
you've already seen, by 26%, maybe more after a few more passes.
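
As a rough check on those numbers, the arithmetic can be sketched as below. This is only an illustration: it assumes the 26% seen on the test copy holds for the whole 2.5T data set, and the function name is made up for the example.

```python
def used_after_dedupe(logical_tb: float, savings_fraction: float) -> float:
    """Return the space still in use after dedupe, given fractional savings."""
    if not 0.0 <= savings_fraction <= 1.0:
        raise ValueError("savings_fraction must be between 0 and 1")
    return logical_tb * (1.0 - savings_fraction)


# Figures taken from this thread: 2.5T of data, 26% savings, 3T limit on a 3040.
logical_tb = 2.5
limit_tb = 3.0
used_tb = used_after_dedupe(logical_tb, 0.26)
print(f"used after dedupe: {used_tb:.2f}T")
print(f"headroom under the limit: {limit_tb - used_tb:.2f}T")
```

If the savings rate drops as more data is written, the headroom shrinks accordingly, so the figure is a snapshot in time rather than a guarantee.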

>>I wonder why the max size scales with the system model, I haven't thought
>>of a good reason for this yet since you could easily have lots of 3T
>>volumes.
It scales with the model because a higher model means more RAM and faster
processors, so sis can address more storage. Yes, you could have lots of 3T
volumes, but the number of 3T volumes also scales with the platform: a higher
model has more RAM and processor, hence can address more storage (a larger
number of disks).

>>I was wondering if you can get sis to shrink say 2.8T to 2.0, is the 2.0
>>what counts for the sis volume limit size?
sis uses the size of the volume at time of turning sis on. If it's larger
than your model will permit, it doesn't turn on and you'd have to shrink it.
You can't grow the volume above the model limit if sis is turned on, you'd
have to turn it off to grow.
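
The rule as described here (the check is against the volume size, not the deduplicated usage) can be sketched as follows. This is a minimal illustration, not ONTAP's actual logic: the function names are hypothetical, and the only limit filled in is the 3T figure for the 3040 quoted in this thread.

```python
# Per-model sis volume size limits in TB. Only the 3040 figure quoted in this
# thread is filled in; other models would come from the NOW documentation.
SIS_LIMIT_TB = {"3040": 3.0}


def can_enable_sis(model: str, volume_size_tb: float) -> bool:
    """sis turns on only if the volume itself is within the model limit."""
    return volume_size_tb <= SIS_LIMIT_TB[model]


def can_grow(model: str, new_size_tb: float, sis_enabled: bool) -> bool:
    """With sis on, the volume cannot be grown past the model limit."""
    if not sis_enabled:
        return True
    return new_size_tb <= SIS_LIMIT_TB[model]


print(can_enable_sis("3040", 2.5))   # 2.5T volume on a 3040
print(can_enable_sis("3040", 5.0))   # 5T test volume on a 3040
print(can_grow("3040", 4.0, sis_enabled=True))
```

Under this reading, shrinking 2.8T of data to 2.0T does not change the answer; what matters is how large the volume is sized.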

>>Even so, would I run into trouble if trying to do a full volume restore
>>and the full data set is over 3.0 but would have shrunk?
If you had to do a restore, it would be coming from either a sis'd volume
elsewhere (say it's Snapmirrored), or a sis aware tape backup environment
(could be mistaken there).

>>I noticed during my data copy, the snapshot reserve was blowing out big
>>time.
This is normal. As you copy data in fresh, the snapshots (your backups) will
put pointers to the blocks that have changed (the new data flowing in); all
the data in essence has changed. If you're copying fresh data to the
storage, you may as well turn snapshots off until all the data is copied in.
It won't be sis and snapshots contending here for the same blocks...sis and
snapshots won't paper-rock-scissors for the same blocks, as they both, in
essence, treat blocks the same way (sis is heavily based on the snapshot
foundation when you break it down to the 4k block level).

hth,

Best regards,
~~~~~~~~~~~~~~~~~~~~~
Kevin Parker
http://theparkerz.com
~~~~~~~~~~~~~~~~~~~~~




RE: Some thoughts and questions on A-sis

by Glenn Walker

FWIW, I'm pretty sure that tape backups are not SIS aware (none that
I'm aware of, at least).  When you write the data to tape via NDMP as a
dump, you're going to get _all_ the data, and it will be re-duped as
each of the flexvol data blocks is accessed (remember, de-dupe
points multiple flexvol blocks to a single aggregate block).  I would
think that snapmirror to tape would be an exception to this, but SMT
isn't really a tape backup per se compared to a regular DUMP (in some ways
it's better, in some ways it's worse).
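
A toy model of that point, for illustration only: several logical blocks point at one stored block, and anything that reads the logical blocks in order gets the full, un-deduplicated stream back. SHA-256 is just a stand-in for a fingerprint here, not a claim about how A-SIS identifies duplicates, and the function names are invented.

```python
import hashlib

BLOCK_SIZE = 4096  # 4k blocks, as mentioned earlier in the thread


def dedupe(data: bytes):
    """Split data into 4k blocks; return (block_pointers, physical_store)."""
    store = {}      # fingerprint -> block contents (one physical copy each)
    pointers = []   # one fingerprint per logical block, in file order
    for off in range(0, len(data), BLOCK_SIZE):
        block = data[off:off + BLOCK_SIZE]
        fp = hashlib.sha256(block).hexdigest()
        store.setdefault(fp, block)
        pointers.append(fp)
    return pointers, store


def dump(pointers, store) -> bytes:
    """Read every logical block in order, as a file-level backup would."""
    return b"".join(store[fp] for fp in pointers)


data = b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE
pointers, store = dedupe(data)
stream = dump(pointers, store)
print(len(pointers), "logical blocks,", len(store), "physical blocks")
print("bytes stored:", sum(len(b) for b in store.values()))
print("bytes in dump stream:", len(stream))
```

The dump stream is the size of the logical data, which is why a restore from such a backup has to fit the un-deduplicated data set.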

Glenn




RE: Some thoughts and questions on A-sis

by Daniel Keisling

I've been creating my volumes <= 2TB _before_ I run SIS, to ensure that
SIS will never complain about working on a volume larger than it can
handle, including tape-restored volumes.  For Windows file servers, I'm
seeing ~30% space savings.  For VMware servers running 40ish similar
Windows images, I'm getting ~90% space savings.  Rough testing has
indicated that I'm taking a 7-9% performance hit.

Your snapshots will definitely grow during the SIS runs, so I would look
at turning snapshots off during that initial 'sis start -s /vol/' process.
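
One way to picture why the snapshots grow: a toy model, assuming a snapshot taken before the sis pass keeps holding the duplicate blocks that the pass releases from the active file system. It is a simplification for illustration, not a description of WAFL internals, and the function name is made up.

```python
def space_used(active_blocks, snapshot_blocks):
    """Return (active, snapshot_only) block counts.

    A block counts against the snapshot only when the snapshot still
    references it and the active file system no longer does.
    """
    active = set(active_blocks)
    snap_only = set(snapshot_blocks) - active
    return len(active), len(snap_only)


# Ten physical blocks; blocks 5-9 hold the same contents as blocks 0-4.
before = list(range(10))
snapshot = list(before)          # snapshot taken before dedupe runs

# Dedupe repoints the active file system at blocks 0-4 and releases 5-9.
after = list(range(5))

print(space_used(before, snapshot))  # prints (10, 0)
print(space_used(after, snapshot))   # prints (5, 5)
```

In this model the total stays at ten blocks until the old snapshot is deleted; the savings only show up in the active file system.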

FWIW, I'm doing this with several LUNs per volume over FCP on a 3040.

Daniel




RE: Some thoughts and questions on A-sis

by John Stoffel-2


So, if A-SIS is limited to volumes of under 3.5TB, what do those of us
with 10TB volumes do?

John

RE: Some thoughts and questions on A-sis

by John Clear-2

The size limit depends on which filer you are on.

The limit is 10TB on the 6030 and 14TB on the 6070.

The table of the various limits is here:
http://now.netapp.com/NOW/knowledge/docs/ontap/rel723/html/ontap/onlinebk/6asis.htm

John





Re: Some thoughts and questions on A-sis

by John Stoffel-2


Scott> 6T on 6070's

Heh.  I'm running on 960s and 980s now, so this won't help me much,
even with an upgrade planned for later this year.  Maybe.  

Scott> in a few cases, I split big volumes into smaller chunks
Scott> and stitched them back together with automount maps and DFS
Scott> links (my filesystem namespace is already built that way).

We've done this too, but it's a total hassle when one volume has to
grow, and I don't want to go down symlink hell again if I can help
it.

Scott> this is worth doing if the data set is A-sis friendly.

It probably is actually, but hard to know.

Scott> Future OnTap versions are going to make A-sis an aggregate
Scott> behavior, not a volume one, which removes the volume size
Scott> limit.

Scott> and reinforces the silly 16T aggregate limit ;-)

Yeah, that's another silly limit, esp with raid sets and RaidDP
stuff.  They should just let it scale and scale and scale.  

I personally *like* one big volume, with bunches of qtrees.  What I'd
really like is qtrees on multiple levels, or the raising of the number
of volumes that are supported in an aggregate.  That would help.

And speeding up SnapVault.  And a pony... :]

John

Re: Some thoughts and questions on A-sis

by John Stoffel-2


Scott> no symlinks, automounter sub-mount maps.  I have 600+ qtrees in
Scott> 3 sites with NFS caches in between, in a single name space
Scott> using a gory mesh of automount maps and automounter variables
Scott> defined on clients.

We've done that, but when a single project outgrows a volume, then we
start needing to shuffle data... it's a pain.
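
For readers who haven't built a namespace this way, here is a minimal sketch of generating indirect automount map entries from a table of where each qtree lives. The filer, volume and project names are hypothetical, the mount options are just common NFS defaults, and the output uses the usual "key options location" map layout.

```python
def automount_map(placements, options="-rw,hard,intr"):
    """Build indirect automount map lines: key, mount options, location.

    placements maps a qtree name to the (filer, volume) that holds it.
    """
    lines = []
    for key in sorted(placements):
        filer, volume = placements[key]
        lines.append(f"{key}\t{options}\t{filer}:/vol/{volume}/{key}")
    return "\n".join(lines)


# Hypothetical layout: two projects on different filers and volumes.
placements = {
    "projA": ("filer1", "vol1"),
    "projB": ("filer2", "vol7"),
}
print(automount_map(placements))
```

Clients see both projects under one mount point, so relocating a project means copying the data and changing one entry. The data shuffle itself is the painful part, which the map does nothing to avoid.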

If I could have more volumes, then I'd use them over qtrees, but then
when an Aggr fills, I need to move volumes.  It's a pain.

That's why I like the idea of the Acopia product.  I wish NetApp would
realize that and come out with their own storage virtualization
product to put in front of backend NetApps.  Would be really nice.

Course I'm a mostly NFS only shop.

Scott> this is worth doing if the data set is A-sis friendly.
>>
>> It probably is actually, but hard to know.
>>
Scott> Future OnTap versions are going to make A-sis an aggregate
Scott> behavior, not a volume one, which removes the volume size
Scott> limit.
>>
Scott> and reinforces the silly 16T aggregate limit ;-)
>>
>> Yeah, that's another silly limit, esp with raid sets and RaidDP
>> stuff.  They should just let it scale and scale and scale.  

Scott> really a problem with 1 TB disks; 16 drives, one RAID-DP group
Scott> per aggregate.  stinky performance.
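
Scott's numbers work out as below. The sketch assumes the 16T limit is counted against every disk at its nominal size, parity included, as "16 drives" implies; it ignores right-sizing and spares, and the function name is invented for the example.

```python
def raid_dp_layout(aggregate_limit_tb: int, disk_size_tb: int,
                   parity_disks: int = 2):
    """Return (total_disks, data_disks) for one RAID group under the limit.

    RAID-DP keeps two parity disks per RAID group, hence the default.
    """
    total = aggregate_limit_tb // disk_size_tb
    return total, total - parity_disks


total, data = raid_dp_layout(16, 1)
print(f"{total} disks in the aggregate, {data} of them holding data")
```

With so few spindles behind the whole aggregate, the performance complaint follows directly.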



RE: Some thoughts and questions on A-sis

by Leeds, Daniel

I really wish a product like Acopia could work with live Oracle data over NFS as well.  We would purchase this in a heartbeat.

--
Daniel Leeds
Manager, Storage Operations
Edmunds, Inc.






RE: Some thoughts and questions on A-sis

by Darren Sykes

>We've done that, but when a single project outgrows a volume, then we
>start needing to shuffle data... it's a pain.
>
>If I could have more volumes, then I'd use them over qtrees, but then
>when an Aggr fills, I need to move volumes.  It's a pain.
>
>That's why I like the idea of the Acopia product.  I wish NetApp would
>realize that and come out with their own storage virtualization
>product to put in front of backend NetApps.  Would be really nice.
>
>Course I'm a mostly NFS only shop.

Aren't you basically describing ONTAP GX (on the fly volume moves, 1000
volumes per pair of filers, up to 24 filers using a single namespace
which can be used to stitch volumes together etc)?

 


RE: Some thoughts and questions on A-sis

by John Stoffel-2

>>>>> "Darren" == Darren Sykes <Darren.Sykes@...> writes:

Darren> Aren't you basically describing ONTAP GX (on the fly volume
Darren> moves, 1000 volumes per pair of filers, up to 24 filers using
Darren> a single namespace which can be used to stitch volumes
Darren> together etc)?

But GX doesn't support SnapVault at all, and we need (maybe?) it for
our DR work across our WAN.  Maybe if we consolidated down to just a
couple of sites and mirrored them across the country GX would work.  I
doubt it though.

John

 

RE: Some thoughts and questions on A-sis

by Darren Sykes

>But GX doesn't support SnapVault at all, and we need (maybe?) it for
>our DR work across our WAN.  Maybe if we consolidated down to just a
>couple of sites and mirrored them across the country GX would work.  I
>doubt it though.
>
>John


That's pretty much what we do. However, the GX training material does
suggest that a new replication engine is being developed which will
allow interoperability with SnapVault and 7G systems in the future....