I've been creating my volumes <= 2TB _before_ I run SIS, to ensure that
SIS will never complain about working on a volume larger than it can
handle, including tape-restored volumes. For Windows file servers, I'm
seeing ~30% space savings. For VMware servers running 40ish similar
Windows images, I'm getting ~90% space savings. Rough testing has
indicated that I'm taking a 7-9% performance hit.
Your snapshots will definitely grow during the SIS runs, so I would look
at turning snapshots off during that initial 'sis -s /vol/' process.
FWIW, I'm doing this with several LUNs per volume over FCP on a 3040.
Daniel
-----Original Message-----
From: owner-toasters@... [mailto:owner-toasters@...] On Behalf Of Adam McDougall
Sent: Wednesday, March 26, 2008 9:08 PM
To: toasters@...
Subject: Some thoughts and questions on A-sis
I started testing sis on one of our 3040s last weekend. Sorry if
there's more extensive information on NOW, but the things I've found so
far were fairly rudimentary. Some impressions and questions:
I found out quickly that my 5T test volume was too big. I looked up the
limits because I had no idea there were volume size limits: 3T for the
3040, hmm. I have an existing volume that holds 2.5T, so that is pushing
it, and while I would like to split it up, it won't happen overnight. I'm
not desperate to use sis on it, although I am almost done copying it for
testing and so far have realized 26% savings from dedupe. I wonder why
the max size scales with the system model; I haven't thought of a good
reason for this yet, since you could easily have lots of 3T volumes. I
was wondering: if sis shrinks, say, 2.8T to 2.0T, is the 2.0T what
counts toward the sis volume size limit? Even so, would I run into
trouble doing a full volume restore if the full data set is over 3.0T
but would have shrunk below it? Once I write over 3T into a volume, it
sounds like I cannot run sis on it. I'm not too worried about having to
do a full restore of a volume from tape, but I don't want to limit my
future options if the short-term payoff isn't worth it. I'm pretty sure
I'll use it on all or most of my other volumes since they are
much smaller.
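
To make the two readings of the size limit concrete, here is a rough sketch in Python. It does not say which number the filer actually checks; it only shows how the 2.8T -> 2.0T case and an over-3T restore come out under each reading. The function names are made up for illustration, and the 3.0T default is just the 3040 figure quoted above.

```python
# Hypothetical helpers for illustration only; sizes are in terabytes.

def savings_fraction(logical_tb, physical_tb):
    """Fraction of space saved by dedupe: 1 - physical/logical."""
    if logical_tb <= 0:
        raise ValueError("logical size must be positive")
    return 1.0 - physical_tb / logical_tb

def compare_interpretations(logical_tb, physical_tb, limit_tb=3.0):
    """Show whether the data fits under each reading of the limit.

    Which reading the filer really uses is the open question in the
    message; this only lays the two cases side by side.
    """
    return {
        "savings": round(savings_fraction(logical_tb, physical_tb), 3),
        "ok_if_limit_counts_logical": logical_tb <= limit_tb,
        "ok_if_limit_counts_physical": physical_tb <= limit_tb,
    }

# 2.8T of data that dedupes to 2.0T fits under either reading.
shrunk = compare_interpretations(2.8, 2.0)

# A restore that writes 3.5T which would dedupe to 2.6T only fits
# if the limit counts post-dedupe space.
restore = compare_interpretations(3.5, 2.6)
```

The second case is the worrying one: if the limit counts what was written rather than what remains after dedupe, a full restore could land over the limit even though the deduplicated data would fit.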
I noticed during my data copy, the snapshot reserve was blowing out big
time. It seemed like almost any savings from sis went into snapshots
for some reason. 100, 200, 400% full, and it started trimming itself
back after a while. Since this was a non-production copy, I didn't
care, and deleted those snapshots this morning, but I thought that was
rather odd, and couldn't think what the snapshots would have to offer
since if the data was unmodified at the filesystem level, the snapshot
should contain the same data. While not a problem for an initial copy,
I would expect the same thing to happen when sis runs during production,
and although it wouldn't be as large and would eventually flush out, why
am I expending space to store duplicate copies of data that I asked it
to deduplicate? :) Maybe it's just because WAFL identifies them as
"changed blocks" and insists on storing them in the snapshot.
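
The "changed blocks" guess above can be illustrated with a toy model in Python. This is only a sketch under assumptions, not how WAFL is actually implemented: a snapshot is modeled as a frozen copy of the offset-to-block map, dedupe repoints duplicate offsets at one shared block, and the 4 KB block size and all names are invented for the example.

```python
BLOCK_KB = 4  # assumed block size for this toy model

def dedupe(active):
    """Point every offset with identical content at one shared block.

    `active` maps offset -> (block_id, content). The first block seen
    for each distinct content is kept; later duplicates are repointed.
    """
    first_seen = {}
    result = {}
    for offset in sorted(active):
        block_id, content = active[offset]
        keep_id = first_seen.setdefault(content, block_id)
        result[offset] = (keep_id, content)
    return result

def block_ids(mapping):
    """Set of physical block ids referenced by a mapping."""
    return {block_id for block_id, _ in mapping.values()}

def snapshot_only_kb(snapshot, active):
    """Space pinned by the snapshot alone (blocks the active view dropped)."""
    return len(block_ids(snapshot) - block_ids(active)) * BLOCK_KB

# Eight blocks on disk, but only two distinct contents.
active = {i: (i, "A" if i % 2 == 0 else "B") for i in range(8)}
snapshot = dict(active)                      # snapshot taken before dedupe
before = snapshot_only_kb(snapshot, active)  # nothing unique to the snapshot yet
active = dedupe(active)
after = snapshot_only_kb(snapshot, active)   # six released blocks, still pinned
```

In this model the active view drops from eight blocks to two, but the six released blocks are still referenced by the snapshot, so the savings show up as snapshot usage until that snapshot is deleted or expires.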
A 1 gig file of zeros still takes several tens of megabytes after sis.
hmm :) And roughly twice as much for a second copy of it.