EVPN-VXLAN Data Centers: Architecture

A friend of mine, Phil Gervasi, posted a great meme on LinkedIn not long ago, and it got me thinking about bridging in the data center. A comatose patient wakes up when the doctors suggest Layer-2 traffic should be extended between data centers. I had to chuckle, because back in my early days as a network engineer, extending L2 over a metro-area network was just begging for trouble. The concern wasn’t so much the traffic pattern as the blast radius if something went sideways.

In the modern data center, we have EVPN-VXLAN technology and can control the exposure to adverse events on the LAN. If you’ve not investigated the options, it can be quite a learning curve. You’ll see discussions of topologies like Bridged Overlay (BO), Centrally Routed Bridging (CRB), and Edge Routed Bridging (ERB). It’s these three designs that I’ll address here. To be fully transparent, I’m much more partial to the ERB design than to CRB, but there’s a time and a place for each. Let’s look at these topologies and cover some of the benefits, the limitations, and why one may be preferred over another.

Bridged Overlay Design

[Figure: Bridged Overlay topology]

A bridged overlay extends L2 broadcast domains across the routed links of the data center fabric to other network devices. In most networks, you want to route your host traffic to some other location, which means your default gateways need to live somewhere in your network. With a BO topology, there is no option to route traffic between VLANs within the data center fabric; a firewall or other upstream router handles those needs.

If you’re deploying a BO topology, you’re extending L2 frames from individual hosts between disparate racks, across L3 links. This is where the data center has changed from the older core-distribution-access design built on L2 trunks. The modern data center links every leaf to the spine and ideally uses an IP fabric rather than multi-chassis LAG (MLAG) deployments.
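To make the “L2 over L3” idea concrete, here’s a minimal Python sketch (plain dataclasses, not a real packet library) of what a leaf VTEP does to a host frame before it crosses the routed fabric. The addresses and the VLAN-to-VNI mapping are hypothetical.

    from dataclasses import dataclass

    # Rough sketch of a bridged overlay: the host's original Ethernet frame is
    # wrapped in a VXLAN header plus an outer IP/UDP header sourced from the
    # ingress leaf's VTEP loopback, so it can ride ordinary routed leaf-spine links.

    @dataclass
    class EthernetFrame:
        src_mac: str
        dst_mac: str
        vlan: int
        payload: bytes

    @dataclass
    class VxlanPacket:
        outer_src_ip: str   # VTEP loopback on the ingress leaf
        outer_dst_ip: str   # VTEP loopback on the egress leaf
        udp_dst_port: int   # 4789 is the IANA-assigned VXLAN port
        vni: int            # 24-bit VXLAN Network Identifier mapped from the VLAN
        inner: EthernetFrame

    VLAN_TO_VNI = {100: 10100}   # hypothetical mapping configured on every leaf

    def encapsulate(frame: EthernetFrame, ingress_vtep: str, egress_vtep: str) -> VxlanPacket:
        """Wrap a host's L2 frame so it can cross the routed fabric."""
        return VxlanPacket(
            outer_src_ip=ingress_vtep,
            outer_dst_ip=egress_vtep,
            udp_dst_port=4789,
            vni=VLAN_TO_VNI[frame.vlan],
            inner=frame,
        )

    pkt = encapsulate(
        EthernetFrame("aa:aa:aa:00:00:01", "bb:bb:bb:00:00:02", vlan=100, payload=b"..."),
        ingress_vtep="10.0.0.11",   # leaf-1 loopback (hypothetical)
        egress_vtep="10.0.0.12",    # leaf-2 loopback (hypothetical)
    )
    print(pkt.vni, pkt.outer_dst_ip)

The spines only ever see routed IP/UDP packets between VTEP loopbacks; the host’s VLAN exists only inside the tunnel.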

Bridged overlay makes sense when you have something like host sprawl in your DC or some hesitancy to jump all in on EVPN-VXLAN. For example, you’ve run out of rack space for certain functions (servers handling RADIUS, SCP, and DNS) but they all live on the same VLAN. With a bridged overlay, you can extend that VLAN across your data center hardware even if those devices are separated by Layer-3 links. What is critically important is that there is no default gateway on any of the switches in your DC fabric. It’s strictly L2 segmentation.

Centrally Routed Bridging Design

[Figure: Centrally Routed Bridging topology]

In CRB and ERB topologies, the default gateways for your hosts are brought into the data center fabric rather than punting inter-VLAN routing to an external device as you would in BO. CRB places your Integrated Routing and Bridging (IRB) or Switched Virtual Interface (SVI) configurations on the spine. All VXLAN Tunnel Endpoints (VTEPs) are sourced and terminated on the spine switches, and connections to the leaves are handled through an 802.1Q trunk.
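Here’s a minimal Python sketch, with hypothetical device names and addresses, of where the pieces land in a CRB fabric as described above: the gateway IRBs and the VTEPs sit on the spine, and the leaf is just an L2 trunk toward it.

    # Sketch only -- not a vendor configuration format.
    crb_fabric = {
        "spine1": {
            "role": "spine",
            "vtep_source": "lo0 10.0.0.1",
            "irb_interfaces": {           # default gateways live here
                "irb.100": {"vlan": 100, "gateway": "192.168.100.1/24"},
                "irb.200": {"vlan": 200, "gateway": "192.168.200.1/24"},
            },
        },
        "leaf1": {
            "role": "leaf",
            "vtep_source": None,          # no VTEP and no gateway on the leaf
            "uplink_to_spine": {"type": "802.1Q trunk", "vlans": [100, 200]},
            "host_ports": {"port10": {"vlan": 100}},
        },
    }

    # Inter-VLAN traffic from a host behind leaf1 has to hairpin through a spine
    # IRB, which is why the spine becomes both the choke point and the blast
    # radius in CRB.
    def gateway_for(vlan: int) -> str:
        for device in crb_fabric.values():
            for irb in device.get("irb_interfaces", {}).values():
                if irb["vlan"] == vlan:
                    return irb["gateway"]
        raise LookupError(f"no gateway defined in the fabric for VLAN {vlan}")

    print(gateway_for(100))   # 192.168.100.1/24 -> lives on the spine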

With this topology, BLUE hosts are routed off the L2 domain defined on the leaves via the spine so they can reach the YELLOW and ORANGE hosts. Data center managers and network engineers will need to manage VLAN placement on the appropriate leaves and trunks.

CRB topology has the benefit of simplicity and is better suited for data center traffic patterns with north-south flows, or for cases where leaf switches can’t support EVPN-VXLAN services. In some instances CRB can bring cost savings, since EVPN-VXLAN functionality is normally a licensed feature; any switch terminating a VTEP will need an upgraded license from the manufacturer. If your organization is modernizing the data center network topology, cost may be a factor: it’s less expensive to license two switches at the spine than 50 switches at the leaf.

Be aware that this type of deployment increases the blast radius in your data center in the event of a control plane issue. The control plane for the entire fabric is concentrated on two devices, and a “resume-generating event” there will impact the entire data center. All leaves and hosts may have problems, depending on the type of failure at the spine. Network operations teams will need to take additional measures for monitoring and alerting.

Edge Routed Bridging Design

[Figure: Edge Routed Bridging topology]

Now to my favorite option: Edge Routed Bridging.

In this deployment the IRBs, overlay announcements, and VTEPs are pushed deeper into the data center fabric, out to the leaves. The leaf is where bridging and routing logic is most relevant for host connectivity, so this is an ideal option for east-west traffic flow. DCs that use pods, have larger server rack counts, have multi-tenancy requirements, or have inter-data-center connections can see great benefits in traffic flow when using the ERB topology.
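One way to picture ERB is the anycast gateway model. This is my own sketch with hypothetical VNIs, addresses, and MACs: every leaf presents the same gateway IP and virtual MAC for a given VNI, so a host is routed at its own top-of-rack switch and routed traffic enters the fabric at the edge.

    # Sketch of per-leaf ERB state; identical anycast gateways on every leaf.
    ANYCAST_GATEWAYS = {
        # vni: (gateway IP, shared virtual MAC)
        10100: ("192.168.100.1/24", "00:00:5e:00:01:64"),
        10200: ("192.168.200.1/24", "00:00:5e:00:01:c8"),
    }

    def leaf_config(leaf_name: str, local_vnis: list[int]) -> dict:
        """Build the per-leaf view: a VTEP plus an IRB for only the VNIs that rack needs."""
        return {
            "name": leaf_name,
            "vtep_source": "loopback (unique per leaf)",
            "irb_interfaces": {vni: ANYCAST_GATEWAYS[vni] for vni in local_vnis},
        }

    # Two racks, overlapping tenants: both leaves answer locally for
    # 192.168.100.1, so east-west traffic is routed at the first hop.
    print(leaf_config("leaf1", [10100, 10200]))
    print(leaf_config("leaf2", [10100]))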

Risk is reduced by putting the EVPN-VXLAN logic at the top of rack rather than at the very center of the data center. Consider the control plane in ERB compared to CRB. Control plane issues at the leaf are normally limited to a rack of servers rather than the entire data center. If there are issues at the leaf, the convergence of MAC addresses is handled in the routing protocol rather than by flooding through the data center, which isolates the computation to the affected part of the network infrastructure.

I’d also argue you’ll recognize this if you come from the old core-distribution-access design days. ERB frees up your spines to perform IP-only functions rather than dealing with L2 BUM traffic, bridging, and the additional route tables for EVPN-family signaling. This improves convergence on the VXLAN side and lets you make full use of the multiple connections between leaf and spine.

In CRB you have to use an L2 trunk to get multiple VLANs from the leaf to the spine. That means you’re dealing with spanning tree or vendor-specific MC-LAG deployments. Why would you do that? If you can extend the IP fabric to the leaf and leverage Equal-Cost Multi-Path (ECMP), your traffic can use all links rather than an active-passive pair.
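As a rough illustration of the ECMP point (a sketch, not any vendor’s actual hashing algorithm), each flow’s 5-tuple picks one of the equal-cost uplinks, so every leaf-to-spine link carries traffic while packets within a single flow stay in order.

    import hashlib

    UPLINKS = ["to-spine1", "to-spine2", "to-spine3", "to-spine4"]   # hypothetical

    def pick_uplink(src_ip: str, dst_ip: str, proto: str, src_port: int, dst_port: int) -> str:
        """Hash the flow's 5-tuple and map it onto one of the ECMP next hops."""
        key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
        digest = int.from_bytes(hashlib.sha256(key).digest()[:4], "big")
        return UPLINKS[digest % len(UPLINKS)]

    # Different flows land on different uplinks; the same flow always takes the same one.
    print(pick_uplink("192.168.100.10", "192.168.200.20", "tcp", 49152, 443))
    print(pick_uplink("192.168.100.10", "192.168.200.21", "tcp", 49153, 443))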

EVPN Type-2 routes carry a MAC address and, optionally, an IP address. In a traditional L2 network, when a switch doesn’t know where a host lives (for example, after the host’s MAC entry has aged out), it floods the frame to find it. With CRB, that flood extends to the spine, but in ERB the flood is localized to the VLANs associated with the request. The host MAC address is learned at the leaf and then announced in an MP-BGP route exchange. BGP neighbors and VTEP peers receive the host information via a route update rather than a broadcast. This keeps your spines lean; they serve as IP transit for the underlay and overlay.
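To ground the Type-2 discussion, here’s a small Python sketch of the fields such a route carries (the values are hypothetical). The point is that remote leaves learn the host from a BGP update and install it directly, rather than from a flood crossing the fabric.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class EvpnType2Route:
        rd: str                  # route distinguisher of the advertising leaf
        esi: str                 # Ethernet segment identifier (all zeros if single-homed)
        mac: str                 # host MAC learned on the access port
        ip: Optional[str]        # optional host IP, useful for ARP suppression
        vni: int                 # L2 VNI the MAC belongs to
        next_hop_vtep: str       # VTEP address remote leaves should tunnel to

    route = EvpnType2Route(
        rd="10.0.0.11:10100",
        esi="00:00:00:00:00:00:00:00:00:00",
        mac="aa:aa:aa:00:00:01",
        ip="192.168.100.10",
        vni=10100,
        next_hop_vtep="10.0.0.11",
    )

    # A remote leaf installs the advertised binding straight into its MAC table --
    # no broadcast had to cross the fabric to learn this host.
    mac_table_entry = {route.mac: {"vni": route.vni, "remote_vtep": route.next_hop_vtep}}
    print(mac_table_entry)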

The benefits to the data center are simple.

  • Reduced flooding and data-plane learning by way of EVPN signaling. This is important as the data center expands. Your network learns updates in the control plane rather than the data plane.
  • Traffic convergence occurs in the control plane, making L2/L3 learning more efficient. Using technologies like Bidirectional Forwarding Detection (BFD) and/or Unidirectional Link Detection (UDLD) will speed up failure detection and convergence times (see the sketch after this list).
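For a sense of the numbers, here’s a tiny sketch of the BFD arithmetic (the timers are hypothetical): detection time is simply the negotiated transmit interval multiplied by the detect multiplier, so a 300 ms x 3 session notices a dead leaf-spine link in under a second, well before typical routing protocol hold timers would.

    def bfd_detection_time_ms(tx_interval_ms: int, detect_multiplier: int) -> int:
        """Worst-case time before BFD declares the session (and the link) down."""
        return tx_interval_ms * detect_multiplier

    print(bfd_detection_time_ms(300, 3))    # 900 ms  (hypothetical timers)
    print(bfd_detection_time_ms(1000, 3))   # 3000 ms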

Lastly, in the love-this-technology bucket, you’ll have a standard deployment across your campus. The same technology that handles edge-terminated VTEPs can also handle your campus fabric deployment. Support staff will not need to learn a new, one-off network design, and even your automation can be reused without new code to support a different network topology.

More companies are deploying EVPN-VXLAN to extend Layer-2 domains across the campus to support services like WAPs, authentication systems, and user roaming to an isolated VLAN or subnet. The ERB topology is easily extended to the furthest closet of the network.

Closing

While I favor ERB data center designs, it’s very important to understand that, as a network engineer, you must be flexible. CRB is a good entry point if you’ve never deployed EVPN-VXLAN or have some “political” issues to overcome. ERB is really good for expansion to support data center needs as well as services across the campus. BO is kind of niche, but it could fit the needs of your DMZ, for example.

These three designs give you options to improve scalability, performance, and supportability within your data center and campus environments. Choose wisely, and pick the design that fits your business and application needs.