
Extended Fast Software Upgrade (xFSU)

Written by IPTel Solutions | 3 September 2024 9:21:02 PM

Following on from our blog on Perpetual PoE on the Catalyst 9000 access layer switching platforms, here is another very valuable feature you may not be aware of: xFSU. It drastically reduces the disruption to service caused by a planned outage when performing a full image upgrade.

xFSU provides a mechanism to upgrade the software image in two parts by separating the control plane and data plane update on a device with a single control plane.

The win? Downtime for connected devices is minimised: the traffic impact during the complete upgrade is less than 30 seconds.

ASSOCIATED BLOGS:

What is Extended Fast Software Upgrade (xFSU)?

Usually, when you perform a full image upgrade on a Catalyst 9300, both the control plane and data plane portions of the switch reload at the same time in order to activate the new version of software. This results in a traffic outage of between 4 and 6 minutes (and even longer for a stack).

xFSU is a software enhancement that aims to reduce traffic downtime during an upgrade operation.

This feature keeps the data plane forwarding traffic while the control plane reloads as part of the software upgrade.

The role of xFSU is to reconcile the forwarding state in the ASIC with the new control plane and restore the forwarding state after a quick reset of the forwarding ASIC.

The image below visualises this concept:



xFSU Use Case

This recent experience comes from the same customer engagement I mentioned in my last blog on Perpetual PoE.

As part of their network modernisation program, they needed to understand and leverage all the High Availability, resiliency, and redundancy capabilities available to them in order to meet their business requirements centred around minimising any disruption to service as a result of either “Planned Events” or “Unplanned Events”.

The network solution was built around Cisco Catalyst 9000 switching platforms which, luckily for our customer, have many resiliency and redundancy features built into the hardware that they were not aware of - which is always good news!

Continuing on with this experience, let's take a look at another really cool high availability feature that’s built into the Cat 9300s - xFSU.


Enabling xFSU

It's worth noting that, even without performing an image upgrade, the Cisco C9Ks support a “fast reload” option that greatly reduces the reload time.

If you are unsure whether your specific platform supports xFSU or Fast-Reload, you can issue the following CLI command:

show xfsu eligibility

There are some limits to xFSU that you should be aware of, so always consult the latest software configuration guide for your platform and release.

One that I’ll mention here is that, even though the data plane resumes forwarding traffic within 30 seconds, routing protocol convergence might still be an issue in some scenarios before traffic can be forwarded toward its destination. For that reason, NSF (Non-Stop Forwarding) and GR (Graceful Restart) should be enabled on the routing protocol.
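As a rough illustration of that last point, this is what enabling NSF / Graceful Restart can look like for OSPF and BGP. Treat it as a sketch only: the OSPF process ID (1) and BGP AS number (65001) are placeholders I've made up for the example, and the exact keywords available depend on your platform, licence and release, so check the configuration guide before using it.

```text
! Sketch only - process ID 1 and AS 65001 are placeholder values
configure terminal
 router ospf 1
  ! Enable NSF for this OSPF process
  nsf
 exit
 router bgp 65001
  ! Enable BGP Graceful Restart
  bgp graceful-restart
 end
```

Keep in mind that the neighbouring devices also need to support the helper side of NSF/GR for this to have any effect, and that BGP only negotiates Graceful Restart when a session is established, so existing sessions need to be reset before it takes effect.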

To perform an xFSU software upgrade, use the following CLI command:

install add file bootflash:<image filename> activate xfsu commit
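Putting the commands from this section together, an upgrade session might look something like the sketch below. The `write memory` and `show install summary` steps are my own additions rather than part of the xFSU procedure itself, and `<image filename>` is left as a placeholder for your own image.

```text
! 1. Confirm the platform and current state are eligible for xFSU
show xfsu eligibility

! 2. Save the running configuration before starting (added step)
write memory

! 3. Add, activate and commit the new image using xFSU
install add file bootflash:<image filename> activate xfsu commit

! 4. Once the switch is back, confirm the new image is committed (added step)
show install summary
```

Running the eligibility check first means you find out about any blockers before the maintenance window starts, rather than part-way through the install.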

And if you ever need to reload the switch for any reason other than an image upgrade, you can issue the “reload fast” command instead of the normal “reload” command!


xFSU: Summary

The combination of xFSU and Fast Reload becomes really important for business continuity, as organisations are finding it increasingly difficult to schedule outage windows and get business approval for a disruption to service that may last up to 10 minutes in some cases.

Cisco has built this feature into the latest generation of switches, and the Catalyst 9k series supports it (check per model).

Coupled with the Perpetual PoE feature, xFSU meets the needs of businesses that require high uptime by really reducing the impact of outages. PoE devices will not reboot and the switch reboot is faster, so your risk is lower and your devices are back online sooner - a real win.
