This is a note motivated by topics raised at the GEC6 Control Framework
WG meeting. Comments, criticism, feedback, and corrections are
welcome.
The issue of slice stitching has come up periodically and, in the
interest of making some progress on it, I wanted to propose a mechanism
for stitching together aggregates using VLANs. What is slice
stitching? Slice stitching refers to the process of interconnecting
slivers between different GENI aggregates. In the near term, GENI needs
to be able to create Ethernet VLANs that connect aggregates (although
over the longer term more diverse interconnections will be desired).
Jeff Chase, in his slides at the GEC6 CFWG meeting
[1], catalogs a very good list of questions on stitching:
- How to join slivers/slices across different aggregates
end-to-end?
- Do we require common labels at junction points?
- How to connect slivers?
- Do aggregates negotiate with each other (peer-to-peer), or does a
clearinghouse or a service such as a slice manager coordinate
(top-down)?
- What about isolation for performance or security?
The mechanism I propose below addresses these questions for the narrow
application of establishing end-to-end Ethernet VLANs. Rather than try
to solve the general problem, my goal is to establish a straightforward
way to do stitching so that GENI aggregates and tools under development
will understand what functions are required.
Let me illustrate by an example.
               +--------------------------------+
   +---------->| Stitching Manager Service (S)  |<----------+
   |           +--------------------------------+           |
   |                            |                           |
   V                            V                           V
+--------------+        +--------------+        +--------------+
||AM|          |        |     |AM|     |        |          |AM||
|+--+       +--|        |--+  +--+  +--|        |--+       +--+|
|           |  |........|  |        |  |........|  |           |
|           |SW|........|SW|        |SW|........|SW|           |
|           |  |........|  |        |  |........|  |           |
|           +--| usable |--+        +--| usable |--+           |
|ProtoGENI (PG)| VLANs  |Internet2 (I2)| VLANs  |OpenFlow (OF) |
+--------------+        +--------------+        +--------------+
- A ProtoGENI cluster (PG), Internet2 (I2), and a campus OpenFlow
network containing several hosts (OF) are configured to function as
GENI aggregates.
- The ProtoGENI cluster administrator provisions connectivity to an
Internet2 PoP through a regional network. I2, PG, and the regional
network administrators engineer the network, provisioning a set of
VLANs for GENI use between the PG site and the I2 PoP. The regional
network can be thought of as a 'wire'. The engineering of the network is
worked out
between the participants and is a 'local' matter. The result is a set
of VLANs known to the PG and I2 admins and pre-allocated for GENI use.
- A similar process occurs between Internet2 and the campus
OpenFlow network.
- A researcher now wishes to create a slice containing resources
from PG and OF using I2 to provide network connectivity between them.
The researcher has acquired slice credentials allocated by a slice
authority recognized by all three aggregates.
- The researcher (via the GENI Aggregate Manager API) requests a
sliver containing hosts connected by a topology on the ProtoGENI
cluster. The AM allocates the topology and hosts but does not yet
connect them to the outside world.
- The same step is repeated for the campus OpenFlow network.
- The researcher now requests an I2 sliver providing Ethernet
connectivity between the ProtoGENI cluster and the OpenFlow network.
The I2 AM allocates the topology but does not yet connect it to the
outside world. At this point, three disconnected slivers have been
established.
- The researcher now provides his slice credentials to a stitching
manager service, S, with two requests: stitch his PG and I2 slivers and
stitch his I2 and OF slivers. S, using a pre-established rule,
determines that the sort order for stitching is PG, OF, I2, meaning
that for the PG-I2 VLAN, PG is contacted first and for the I2-OF VLAN,
OF is contacted first.
- S contacts the ProtoGENI AM, forwarding the slice credentials and
the request to connect the sliver to Internet2. The PG AM, using
local policy determined by the ProtoGENI administrator, assigns a VLAN
connecting the ProtoGENI cluster to Internet2 to this slice. The PG-I2
VLAN identifying information is provided to S. Even though the mapping
has been determined, the PG switch is configured to drop traffic on the
allocated VLANs until there is confirmation that all the stitching
required by the slice is complete. This is to avoid the possibility of
traffic being injected into a partially configured network.
- S now contacts the I2 AM, providing the slice credentials and the
PG-I2 VLAN identifying information. The I2 AM prepares the mapping
between the I2 internal network and the PG-I2 VLAN. However, as within
PG, the I2 switch is configured to drop traffic on the allocated VLANs
until there is confirmation that all the stitching required by the slice
is complete.
- The previous two steps are repeated with OF and I2, starting with
OF (per the sort order S determined earlier). At this point, S knows the
identifying information for all the stitching VLANs assigned to this
slice. This information is stored for operations and forensic use. S
also has confirmation that the stitching has been completed.
- S sends an indication to PG, OF, and I2 that the end-to-end
network is configured. Now the rules to drop traffic on the assigned
VLANs are removed and each switch is configured to translate VLAN
traffic between the assigned stitching VLAN and the internal network.
Each network sends a confirmation back to S.
- S tells the researcher the end-to-end network is in place.
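The steps above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not the GENI Aggregate Manager API: the class AggregateManager, its methods assign_vlan, accept_vlan, and commit, the function stitch, and the VLAN IDs are all hypothetical names and values chosen for illustration. It shows the two ideas in the walk-through: S uses a fixed sort order to decide which aggregate assigns each VLAN, and every switch keeps dropping traffic until S signals that all stitching is in place.

```python
from dataclasses import dataclass, field

# Assumed pre-established rule used by S: earlier in the list assigns the VLAN.
SORT_ORDER = ["PG", "OF", "I2"]


@dataclass
class AggregateManager:
    """Toy stand-in for an aggregate manager (hypothetical interface)."""
    name: str
    free_vlans: dict  # neighbor name -> list of pre-provisioned VLAN IDs
    state: dict = field(default_factory=dict)  # (slice, neighbor) -> [vlan, mode]

    def assign_vlan(self, slice_id, neighbor):
        # Local policy is up to the aggregate; here: lowest free VLAN.
        vlan = self.free_vlans[neighbor].pop(0)
        self.state[(slice_id, neighbor)] = [vlan, "drop"]
        return vlan

    def accept_vlan(self, slice_id, neighbor, vlan):
        # The second aggregate prepares its mapping for the VLAN already chosen.
        self.free_vlans[neighbor].remove(vlan)
        self.state[(slice_id, neighbor)] = [vlan, "drop"]

    def commit(self, slice_id):
        # Remove the drop rule for every stitching VLAN of this slice.
        for (sid, _neighbor), entry in self.state.items():
            if sid == slice_id:
                entry[1] = "forward"
        return True


def stitch(slice_id, links, ams):
    """Stitch each (aggregate, aggregate) link, then enable forwarding."""
    assigned = {}
    for a, b in links:
        first, second = sorted((a, b), key=SORT_ORDER.index)
        vlan = ams[first].assign_vlan(slice_id, second)
        ams[second].accept_vlan(slice_id, first, vlan)
        assigned[(first, second)] = vlan  # kept by S for operations/forensics
    involved = sorted({name for link in links for name in link})
    confirmed = all(ams[name].commit(slice_id) for name in involved)
    return assigned, confirmed


ams = {
    "PG": AggregateManager("PG", {"I2": [3001, 3002]}),
    "I2": AggregateManager("I2", {"PG": [3001, 3002], "OF": [3101, 3102]}),
    "OF": AggregateManager("OF", {"I2": [3101, 3102]}),
}
assigned, confirmed = stitch("slice-1", [("PG", "I2"), ("I2", "OF")], ams)
print(assigned, confirmed)
```

Credential checking, failure handling, and rollback of a partially stitched slice are deliberately left out of this sketch; a real prototype would need all three.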
Some of the assumptions here:
- We assume both switches can do VLAN translation. There are other
techniques for stitching together VLANs, but requiring translation at
every switch adds minimal constraints on how stitching might be
established. In other words, VLANs available for inter-aggregate
connectivity won't be constrained by VLAN IDs in use within either
aggregate. VLAN translation won't be uniformly available throughout
GENI for several years, and things will be more complex in the
intervening period.
- VLANs are assumed to be pre-established before the stitching
process is started. Inter-aggregate VLANs will have some isolation and
performance characteristics assigned to them when created, i.e., there
may be some performance guarantees that can be made for some VLANs, but
this model isn't intended to support on-demand per-slice QoS
negotiation on the stitching VLANs.
- The policy by which an internal VLAN is mapped to a stitching
VLAN is entirely local and under the control of the aggregate.
- We assume all networks can be represented as either an aggregate
or a static VLAN (a 'wire'). I believe the implication here is that
the backbones behave like aggregates.
- We assume that S picks the first aggregate in an unambiguous and
repeatable way (e.g., in some sort order) to avoid race conditions
where two adjacent aggregates, A and B, both give out the same VLAN to
different users.
- We assume that the aggregate manager can configure the switch to
connect the assigned VLAN to the network resources allocated to the
slice.
- Once a VLAN has been assigned to a slice for stitching, it has to
be reported and recorded by the GENI clearinghouse for operations and
forensic use. Therefore, the VLAN identifying information needs to be
in a standard form.
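To make two of these assumptions concrete, here is a small Python sketch of a per-switch translation table and a record that could be reported to the clearinghouse. Nothing here is an agreed GENI format: StitchRecord, its field names, TranslationTable, and the JSON encoding are all assumptions made for illustration. The only external fact relied on is that 802.1Q VLAN IDs are 12 bits wide, with 0 and 4095 reserved.

```python
import json
from dataclasses import dataclass, asdict


def check_vlan_id(vid):
    # 802.1Q VLAN IDs are 12 bits; 0 and 4095 are reserved values.
    if not 1 <= vid <= 4094:
        raise ValueError(f"invalid VLAN ID: {vid}")
    return vid


@dataclass(frozen=True)
class StitchRecord:
    """Hypothetical 'standard form' for a stitching VLAN assignment."""
    slice_id: str
    first_aggregate: str
    second_aggregate: str
    vlan_id: int

    def to_json(self):
        check_vlan_id(self.vlan_id)
        return json.dumps(asdict(self), sort_keys=True)


class TranslationTable:
    """Mapping between internal VLANs and stitching VLANs on one switch."""

    def __init__(self):
        self._to_stitch = {}
        self._to_internal = {}

    def bind(self, internal_vid, stitch_vid):
        check_vlan_id(internal_vid)
        check_vlan_id(stitch_vid)
        if internal_vid in self._to_stitch or stitch_vid in self._to_internal:
            raise ValueError("VLAN already bound")
        self._to_stitch[internal_vid] = stitch_vid
        self._to_internal[stitch_vid] = internal_vid

    def outbound(self, internal_vid):
        return self._to_stitch[internal_vid]

    def inbound(self, stitch_vid):
        return self._to_internal[stitch_vid]


# Each aggregate picks its internal VLAN independently of its neighbor.
pg_switch, i2_switch = TranslationTable(), TranslationTable()
pg_switch.bind(internal_vid=12, stitch_vid=3001)
i2_switch.bind(internal_vid=700, stitch_vid=3001)
print(StitchRecord("slice-1", "PG", "I2", 3001).to_json())
```

Because each side translates, the internal IDs (12 and 700 in this made-up example) never have to match; only the stitching VLAN ID is shared between the two aggregates.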
Questions I have:
- The AM is required since the binding is between resources
allocated to a slice and the VLAN. Will an extension to the aggregate
API be required to support the protocol above? Perhaps not: this sort
of looks like a 'revise an existing slice' operation.
- It seems like the 'usable VLANs' could be multiplexed over
802.1ad QinQ, GRE, OpenVPN, or (G)MPLS tunnels. E.g., if QinQ or even
an IP tunnel is used, the tunnel should be established but VLAN IDs
should still be passed to permit demultiplexing at the edges. What are
the implications?
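For the QinQ case, the demultiplexing mentioned above might look roughly like the following Python sketch. It assumes the outer tag uses the IEEE 802.1ad TPID 0x88A8 and the inner tag uses the 802.1Q TPID 0x8100; the function names and the VLAN IDs are hypothetical, and only the tag stack is modeled (in a real frame it follows the destination and source MAC addresses).

```python
import struct

TPID_S_TAG = 0x88A8  # IEEE 802.1ad service (outer) tag
TPID_C_TAG = 0x8100  # IEEE 802.1Q customer (inner) tag


def vlan_tag(tpid, vid, pcp=0):
    """Build one 4-byte VLAN tag: 16-bit TPID, then PCP(3) | DEI(1) | VID(12)."""
    if not 1 <= vid <= 4094:
        raise ValueError(f"invalid VLAN ID: {vid}")
    if not 0 <= pcp <= 7:
        raise ValueError(f"invalid priority: {pcp}")
    tci = (pcp << 13) | vid  # DEI bit left at 0
    return struct.pack("!HH", tpid, tci)


def qinq_tags(tunnel_vid, stitch_vid):
    """Outer tag identifies the tunnel; inner tag carries the stitching VLAN."""
    return vlan_tag(TPID_S_TAG, tunnel_vid) + vlan_tag(TPID_C_TAG, stitch_vid)


def demux(tags):
    """Recover (tunnel VLAN, stitching VLAN) at the far edge of the tunnel."""
    outer_tpid, outer_tci, inner_tpid, inner_tci = struct.unpack("!HHHH", tags[:8])
    if outer_tpid != TPID_S_TAG or inner_tpid != TPID_C_TAG:
        raise ValueError("not a QinQ tag stack")
    return outer_tci & 0x0FFF, inner_tci & 0x0FFF


tags = qinq_tags(tunnel_vid=100, stitch_vid=3001)
print(tags.hex(), demux(tags))
```

One implication this makes visible: the inner VLAN ID space is scoped to the tunnel, so the identifying information S records would need to include the tunnel as well as the VLAN ID.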
Let me conclude by saying I'm sending this to the Control Framework and
Experimenter Workflow and Services working groups because I believe
there are implications for both groups embedded in this proposal. This
is an important experimenter service that is not part of the control
plane (and thus in scope for the services-wg). Additionally, aggregate
managers and RSpecs would need to support this protocol, making it
relevant to the control-wg. If we can get rough agreement that something
like this would work, I'd like to hear what would be needed to prototype
it.
regards,
--aaron
[1] http://groups.geni.net/geni/attachment/wiki/GEC6CFWGAgenda/gec6-cf-chase.ppt