Sebastian described a fascinating Cisco ACI quirk they had the privilege of chasing around:
We’ve encountered VM connectivity issues after VM moves from one vPC leaf pair to a different vPC leaf pair with ACI. The problem didn’t happen immediately (because of ACI’s bounce entries) and only sometimes, which made it very difficult to reproduce synthetically, but because of DRS and numerous VMs it happened frequently enough that it was a significant problem for us.
Here’s what they found:
The problem was that sometimes the COOP database entry (ACI’s separate control plane for MACs and host addresses) was not updated correctly to point to the new leaf pair.
That definitely sounds like a bug, and Erik mentioned in a later comment that it has probably been fixed in the meantime. Still, the fun part was that things kept working for almost 10 minutes after the VM migration:
After the bounce entry on the old leaf pair expired (630 seconds by default), traffic to the VM was effectively blackholed, since remote endpoint learning is disabled on border leaf switches and traffic is always forwarded to the spine underlay IP address for proxying.
A bounce entry seems to be something like MPLS/VPN PIC Edge – the original switch knows where the MAC address has moved and redirects the traffic to the new location. Just having that functionality makes me worried – contrary to MPLS/VPN networks, where you could have multiple paths to the same prefix (and thus know the backup path in advance), you need a bounce entry for a MAC address only when:
- The original edge device knows the new switch the moved MAC address is attached to;
- Other fabric members haven’t learned that yet;
- The interim state persists long enough to be worth the extra effort.
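The resulting failure mode can be sketched as a toy model – plain Python, not anything ACI ships; the only number taken from the story above is the 630-second default bounce-entry lifetime, and all names are made up:

```python
BOUNCE_TTL = 630  # seconds; default ACI bounce-entry lifetime quoted above

class Fabric:
    """Toy model of spine-proxy forwarding with bounce entries."""

    def __init__(self):
        self.coop = {}    # MAC -> leaf pair the spines *believe* hosts it
        self.actual = {}  # MAC -> leaf pair actually hosting the VM
        self.bounce = {}  # (old leaf pair, MAC) -> (new leaf pair, expiry)

    def learn(self, mac, leaf):
        self.coop[mac] = leaf
        self.actual[mac] = leaf

    def migrate(self, mac, new_leaf, now, coop_updated=True):
        old_leaf = self.actual[mac]
        self.actual[mac] = new_leaf
        # The old leaf pair installs a bounce entry pointing at the new one.
        self.bounce[(old_leaf, mac)] = (new_leaf, now + BOUNCE_TTL)
        if coop_updated:
            self.coop[mac] = new_leaf  # normal case; the bug skips this

    def deliver(self, mac, now):
        leaf = self.coop[mac]  # spine proxy forwards to the COOP location
        if leaf == self.actual[mac]:
            return "delivered"
        entry = self.bounce.get((leaf, mac))
        if entry and now < entry[1]:
            return "delivered (bounced)"  # stale COOP masked by the bounce
        # Stale COOP entry + expired bounce entry = nobody knows the VM.
        return "blackholed"

f = Fabric()
f.learn("vm1", "leaf-pair-A")
f.migrate("vm1", "leaf-pair-B", now=0, coop_updated=False)  # the bug
print(f.deliver("vm1", now=60))   # works for ~10 minutes
print(f.deliver("vm1", now=700))  # bounce entry expired
```

Running the buggy scenario prints `delivered (bounced)` at 60 seconds and `blackholed` at 700 seconds, which matches the observed behavior: everything looks fine until the bounce entry quietly expires.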
Anyway, the team facing that problem decided to “solve” it by limiting VM migration to a single vPC pair:
In the end we gave up and limited the VM migration domain to a single vPC leaf pair. VMware recommends a maximum of 64 hosts per cluster anyway.
Having high-availability vSphere clusters with more than two leaf switches, and then limiting the HA domain to a single pair of leaf switches, definitely degrades the resilience of the overall architecture – unless they decided to limit DRS (automatic VM migrations) to a subset of cluster nodes with VM affinity rules while keeping the high-availability cluster stretched across multiple leaf pairs. It’s sad that one has to go down such paths to avoid vendor bugs caused by too much unnecessary complexity.
Want to know more about Cisco ACI? The Cisco ACI Introduction and Cisco ACI Deep Dive
webinars are waiting for you 😉