Another point that occurred to me is how GENI should deal with both
planned and unplanned outages with respect to the experiments they
affect.
Take the case of an unplanned outage: say a slice is using a
particular fiber that suffers a "backhoe fade" event or some other
outage. Does the researcher want:
- to just get notified?
- to use it as a test for the failure resilience of their design?
- to have some automatic reroute happen?
- or is there something else?
Which of these options can/should be offered? Is there a clear
default, or should this be something the researcher has to think about
on an experiment-by-experiment basis?
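To make the question a bit more concrete, here is a rough Python sketch of one possible shape for this: a facility-wide default with per-slice overrides. Everything in it (OutageAction, OutagePolicy, the slice names) is hypothetical and made up for illustration; none of it is an existing GENI interface, and choosing "notify" as the default is only an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class OutageAction(Enum):
    """Hypothetical reactions to an unplanned outage (not a GENI API)."""
    NOTIFY = "notify"    # just tell the researcher
    OBSERVE = "observe"  # leave the failure in place as a resilience test
    REROUTE = "reroute"  # attempt an automatic reroute


@dataclass
class OutagePolicy:
    # Assumed default; whether a clear default exists is the open question.
    default: OutageAction = OutageAction.NOTIFY
    # Per-slice overrides, keyed by a hypothetical slice name.
    overrides: Dict[str, OutageAction] = field(default_factory=dict)

    def action_for(self, slice_name: str) -> OutageAction:
        """Return the override for this slice, else the default."""
        return self.overrides.get(slice_name, self.default)


policy = OutagePolicy(overrides={"failover-exp": OutageAction.OBSERVE})
print(policy.action_for("failover-exp").value)
print(policy.action_for("some-other-exp").value)
```

With something like this, a researcher who never thinks about outages gets the default, while one studying failover can opt in to seeing the failure untouched.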
For planned outages, there can be even more options. Say the owner of
some compute cluster needs to take it down for maintenance or for some
other reason. If they post a notification of that, the researcher
could just let it happen and treat it like an unplanned outage, or
they could plan ahead and migrate services to other components outside
the group that will be down. Of course, if their experiment is about
automatic migration and failover, they might welcome a planned outage,
since they could make a point of being there to observe the failover
in real time.
As I was writing this message, another closely related question
occurred to me. I haven't thought about this much, but I wonder about
techniques to inject failures into experiments (having a sliver or two
fail, without actually taking out the whole component). That's
certainly something that GENI should offer the researchers, but I
haven't seen any discussion of how that would be done.
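As a strawman only, the idea of failing a sliver or two while the component stays up could be modeled along these lines in Python. The Sliver and Component classes and the inject_sliver_failures function are hypothetical names invented for this sketch; how GENI would actually do this is exactly what I have not seen discussed.

```python
import random
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sliver:
    name: str
    healthy: bool = True


@dataclass
class Component:
    name: str
    slivers: List[Sliver] = field(default_factory=list)
    up: bool = True  # the component as a whole; untouched by injection


def inject_sliver_failures(component: Component, count: int,
                           rng: random.Random) -> List[str]:
    """Mark up to `count` randomly chosen healthy slivers as failed.

    The component itself is left up; only the chosen slivers change.
    Returns the names of the slivers that were failed.
    """
    healthy = [s for s in component.slivers if s.healthy]
    victims = rng.sample(healthy, min(count, len(healthy)))
    for sliver in victims:
        sliver.healthy = False
    return [s.name for s in victims]


cluster = Component("cluster-a", [Sliver(f"sliver-{i}") for i in range(4)])
failed = inject_sliver_failures(cluster, 2, random.Random(0))
print("failed slivers:", failed)
print("component still up:", cluster.up)
```

Passing in a seeded random.Random makes a given injection repeatable, which seems useful if a researcher wants to rerun the same failure scenario.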
-MAP
_______________________________________________
omis-wg mailing list
omis-wg@...
http://lists.geni.net/mailman/listinfo/omis-wg