Hi Daniel,
> Few questions/comments:
>
> - I understand/agree with the general assertion that BGP is inherently
> more scalable than IGPs, also in PE-CE application. But I don't
> understand the statement that "(IGPs) invoke additional processes on the
> router when compared to simply using BGP (which is already going be
> running on a router using MP-BGP for VPNs"). This is because even if BGP
> is already running on the router as PE-PE protocol, using BGP as PE-CE
> protocol will also typically required "additional processes" in the
> router as each BGP instance will typically run on a separate BGP
> process. Barring specification implementations, I don't see how BGP and
> IGPs as PE-CE differ in this aspect.
BGP does not need a new process, thread, or data structure per VPN.
All VPN routes are stored in a single vpnv4 trie, and the only real
overhead you add is the extra cycles for update generation towards the
new neighbor(s). To keep routes unique across VPNs, the RD is part of
the key in that trie. This is the same on the PE as it is on the RR.
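As a rough illustration of the point above, here is a minimal sketch of
one shared table in which the RD is part of the key. It is not any
vendor's implementation: a plain dict stands in for the trie, and the
class name, RD values, and addresses are made up for the example.

```python
from ipaddress import ip_network


class Vpnv4Table:
    """One table shared by all VPNs (a dict stands in for the real trie)."""

    def __init__(self):
        # Key is (RD, prefix), so the same prefix can exist once per RD.
        self.routes = {}

    def add(self, rd, prefix, next_hop):
        self.routes[(rd, ip_network(prefix))] = next_hop

    def lookup(self, rd, prefix):
        return self.routes.get((rd, ip_network(prefix)))


# Two VPNs using the same customer prefix; the RDs are hypothetical.
table = Vpnv4Table()
table.add("65000:1", "10.0.0.0/24", "192.0.2.1")
table.add("65000:2", "10.0.0.0/24", "192.0.2.2")
```

Adding another VPN here only adds entries to the same structure; no new
table, process, or thread is created per VPN.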
By contrast, when you use an IGP on the PE-CE link, for example OSPF,
you need to keep a separate instance of the LSDB per VPN, as I am not
aware of any efforts to make the IGP LSDB VPN-aware. The same goes for
the topology, SPF, etc.
RIP could be made VPN-aware much more easily; however, as its use on
the PE-CE link is marginal, I am not sure this would be high on the
vendors' roadmaps.
> - In the following paragraph:
>
> "Where it may be possible to assign a single RD per L3VPN instance, and
> hence achieve some level of route aggregation on BGP speakers within the
> solution, this has some consequences for both convergence in the VPN
> (due to BGP convergence being relied upon) and in its potential to
> exacerbate geographic distance between PE and Route-reflector and is
> therefore undesirable in some circumstances"
>
> Are you referring to multiple-homed CE scenario, where the same NLRI
> will be advertised with different BGP next hops if the same RD is used?
Yes, you are correct. For a multi-homed site advertising the same
prefix via different PEs, using the same RD in the corresponding VRFs
on both PEs results in the vpnv4 RR running best path selection across
both paths and advertising only the single best path. Remote PEs then
receive only one path and cannot apply any fast connectivity
restoration technique (PIC, for example).
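The effect can be sketched as follows. This is a toy model, not a real
BGP decision process: the best path rule is reduced to "highest local
preference, then lowest next hop", and the function name, RDs, and
addresses are assumptions made for the example.

```python
from collections import defaultdict


def rr_advertise(paths):
    """Return the paths an RR would reflect: one best path per NLRI.

    Each path is (rd, prefix, next_hop, local_pref). The NLRI is the
    (rd, prefix) pair, so paths compete only if they share the same RD.
    """
    by_nlri = defaultdict(list)
    for path in paths:
        by_nlri[(path[0], path[1])].append(path)
    # Simplified tie-break: highest local_pref, then lowest next hop.
    return [min(c, key=lambda p: (-p[3], p[2])) for c in by_nlri.values()]


# Same RD on both PEs: the two paths collapse into one NLRI.
same_rd = [
    ("65000:1", "10.1.0.0/16", "192.0.2.1", 100),
    ("65000:1", "10.1.0.0/16", "192.0.2.2", 100),
]
# A different RD per PE: two distinct NLRIs, both reflected.
unique_rd = [
    ("65000:101", "10.1.0.0/16", "192.0.2.1", 100),
    ("65000:102", "10.1.0.0/16", "192.0.2.2", 100),
]

one_path = rr_advertise(same_rd)
two_paths = rr_advertise(unique_rd)
```

With the same RD the remote PE sees a single next hop; with distinct
RDs it sees both and has a backup path available locally.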
> - Regarding the maximum number of routes per VRF limit, I think there
> should be some discussion on the PE behavior when the limit is reached,
> e.g.: what do you expect PE-CE and PE-PE protocols should do with routes
> they receive after the limit has been reached? Should these routes be
> stored and installed at the VRF when the route count goes below the
> limit (plus some hysteresis threshold of course)? Or should they also be
> discarded by the PE-CE/PE-PE routing protocols? If discarded, how will
> you resync with the neighbors/peers once the route count goes below the
> limit?
A VRF limit is typically seen by the control plane protocols as a RIB
install failure. Those routes are therefore not advertised to CEs/RRs,
but are still kept in the BGP table.
In those cases a syslog message raises a NOC alarm. I think it is
implementation dependent whether installation into the VRF is retried
automatically once the limit allows.
No matter how you put it, this is a mess for the affected VPN; the VRF
limit was really designed as a red light that protects the other VPNs
on a PE. Perhaps alongside the VRF limit the authors should also
discuss the BGP prefix limit, as proper correlation of the two seems
helpful.
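To make the behavior described above concrete, here is a small sketch
of a VRF with a route limit. It is only a model under stated
assumptions: the class, the log text, and the `retry` flag are all
hypothetical, and the retry path stands in for the
implementation-dependent part.

```python
class Vrf:
    """Toy VRF: the BGP table keeps everything, the RIB is capped."""

    def __init__(self, name, max_routes):
        self.name = name
        self.max_routes = max_routes
        self.bgp_table = {}  # every received route is retained here
        self.rib = {}        # only routes that were actually installed

    def receive(self, prefix, next_hop):
        self.bgp_table[prefix] = next_hop
        if prefix in self.rib or len(self.rib) < self.max_routes:
            self.rib[prefix] = next_hop
            return True
        # Hypothetical log line standing in for the syslog/NOC alarm.
        print(f"VRF {self.name}: RIB install failure for {prefix}")
        return False

    def advertisable(self):
        # Only installed routes are advertised onwards to CEs/RRs.
        return sorted(p for p in self.bgp_table if p in self.rib)

    def withdraw(self, prefix, retry=False):
        self.bgp_table.pop(prefix, None)
        self.rib.pop(prefix, None)
        if retry:  # implementation dependent; assumed optional here
            for p, nh in self.bgp_table.items():
                if len(self.rib) >= self.max_routes:
                    break
                self.rib.setdefault(p, nh)


vrf = Vrf("cust-a", max_routes=2)
for p in ("10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"):
    vrf.receive(p, "192.0.2.1")
```

The third route stays in the BGP table but is not advertised. Calling
`vrf.withdraw("10.0.1.0/24", retry=True)` frees a slot and installs it;
without `retry` it would stay uninstalled until something else
triggers a resync.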
Best,
R.