This guide explains how to update your Nutanix cluster's firmware: each node is put into maintenance mode and rebooted in rescue mode, one at a time, after which our services apply the firmware updates and restart the node.
- A Nutanix cluster in your OVHcloud account
- Access to the OVHcloud Control Panel
- Having read the guide First Steps with the OVHcloud API (to familiarize yourself with the OVHcloud API)
Before any other action, log in to your Prism Element interface and perform the following tasks:
- Check that the cluster's "Data Resiliency Status" is "OK".
This can be verified on the main dashboard of your Prism Element interface:
- Run an NCC check.
In the Prism Element interface, click Health in the main menu, then click Actions on the right and select Run NCC Checks. Select All checks and click Run. A log file called /home/nutanix/data/logs/ncc-output-latest.log will be generated at the end of the checks.
Please analyze it carefully. If you find errors or failures in the cluster or service state, do not continue and contact OVHcloud support.
You can also run all the checks from a CVM terminal with the following command:
ncc health_checks run_all
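To spot problems quickly, you can scan the NCC log for failed or errored checks. The snippet below is a sketch that simulates a log excerpt (the PASS/FAIL line format is an assumption based on typical NCC output); on a real CVM, point it at /home/nutanix/data/logs/ncc-output-latest.log instead.

```shell
# Simulated NCC log excerpt (replace with the real log on your CVM):
cat > /tmp/ncc-sample.log <<'EOF'
Running : health_checks system_checks cluster_version_check  [ PASS ]
Running : health_checks hardware_checks disk_checks          [ PASS ]
EOF

# Any FAIL or ERR line means you should stop and contact OVHcloud support.
if grep -Eq 'FAIL|ERR' /tmp/ncc-sample.log; then
  echo "NCC reported problems - do not continue"
else
  echo "NCC checks clean"
fi
```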
Enabling maintenance mode
Nodes will be updated one by one; the Nutanix cluster will continue to work properly throughout the process.
To log in to CVM, you can launch IPMI from your OVHcloud Control Panel or use a terminal.
Connect to CVM
At the login prompt, log in with root credentials to access the host terminal.
Then open an SSH connection to any CVM with Nutanix credentials to access the CVM terminal.
Check nodes state
Once logged in, check that the Node state status is set to AcropolisNormal and the Schedulable column is set to True for all nodes.
Run the following command to check:
acli host.list
If all checks are OK, you need to check that the current host state can be changed to
Maintenance. To do so, use the following command:
acli host.enter_maintenance_mode_check <Hypervisor_IP>
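As a quick sanity check on the `acli host.list` output, you can verify that no host is unschedulable before continuing. The excerpt below is simulated (the column layout and the AcropolisNormal state name may differ slightly between AOS versions); on a real CVM you would pipe the actual command output instead.

```shell
# Simulated `acli host.list` output:
cat > /tmp/host-list-before.txt <<'EOF'
Hypervisor IP  Schedulable  Node state
192.168.0.1    True         AcropolisNormal
192.168.0.2    True         AcropolisNormal
192.168.0.3    True         AcropolisNormal
EOF

# Count data rows where the Schedulable column (2nd field) is not True.
bad=$(awk 'NR>1 && $2 != "True"' /tmp/host-list-before.txt | wc -l)
if [ "$bad" -eq 0 ]; then
  echo "all hosts schedulable"
else
  echo "$bad host(s) not schedulable - do not continue"
fi
```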
Put a node in maintenance mode
If all hosts are eligible to enter maintenance mode, put the first host into maintenance mode with the following command:
acli host.enter_maintenance_mode 192.168.0.1 wait=true
Shut down the CVM
Once the host is in maintenance mode, CVM can be shut down with the following command:
cvm_shutdown -P now
With root credentials, open a terminal on the node that hosts the CVM and confirm that the CVM is stopped:
virsh list --all
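You can also check the `virsh` output programmatically before moving on. The sample below simulates the `virsh list --all` output (the NTNX-*-CVM naming pattern is the usual convention but may differ on your nodes):

```shell
# Simulated `virsh list --all` output on the AHV host:
cat > /tmp/virsh-sample.txt <<'EOF'
 Id   Name               State
------------------------------------
 -    NTNX-node1-A-CVM   shut off
EOF

# The CVM domain must not be in the "running" state before rebooting.
if grep 'CVM' /tmp/virsh-sample.txt | grep -q 'running'; then
  echo "CVM still running - do not reboot the node yet"
else
  echo "CVM is stopped"
fi
```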
On the main dashboard, the "Data Resiliency Status" will become "Critical"; the cluster is now running with two nodes.
The CVM is now shut down.
Reboot to rescue mode
Log in to the OVHcloud Control Panel, go to the
Hosted Private Cloud, choose the
Nutanix solution, and select your cluster.
Identify the node to boot in rescue mode by using the following OVHcloud API call:
serviceName: Enter the cluster name.
You can then identify your node name:
Once you have retrieved the name of the node to reboot in rescue mode, select this node in your OVHcloud Control Panel.
In the Boot section, click the ... button. Change the netboot by choosing Boot in rescue mode, choose the rescue-customer version, and click Confirm your choice.
Once confirmed, a green message will confirm that the netboot has been updated.
Click the ... button again and restart the server. The server will reboot; optionally, you can open an IPMI session to follow the reboot process of your node.
When the node is booted into
rescue-customer, update your support ticket with this information to notify the OVHcloud support teams that they can proceed with the firmware update.
Our support teams will finish the necessary updates, meaning they will:
- Restart the node on the local disk, which will start the Nutanix system and the CVM automatically.
- Update the ticket to let you know when the node can exit from maintenance mode.
At this time, the node will be up and running. Follow the next step to exit maintenance mode.
Exit from maintenance mode
After updating the node, our services will reboot the node from its local disk. The Nutanix software will load AOS and the CVM will automatically start.
Once the system is up and running, log in to the CVM and run the following command:
acli host.list
As you can see in the output example below, the first node is still in maintenance mode.
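For illustration, here is a simulated `acli host.list` excerpt showing the first node still in maintenance (the EnteredMaintenanceMode state name and the column layout are assumptions; check your actual output):

```shell
# Simulated `acli host.list` output after the node reboots:
cat > /tmp/host-list-maint.txt <<'EOF'
Hypervisor IP  Schedulable  Node state
192.168.0.1    False        EnteredMaintenanceMode
192.168.0.2    True         AcropolisNormal
192.168.0.3    True         AcropolisNormal
EOF

# List the hosts that still need to exit maintenance mode.
awk 'NR>1 && $2 == "False" {print $1}' /tmp/host-list-maint.txt
```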
To remove the node from maintenance mode, run the following command:
acli host.exit_maintenance_mode 192.168.0.1
The host exits the Maintenance state and becomes schedulable again. The VMs that were migrated off this node automatically move back to it from the other nodes.
On the main dashboard, the "Data Resiliency Status" will revert to
OK. The cluster also returns to its normal state.
Proceed with the remaining nodes one at a time with the same steps.
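The per-node sequence this guide walks through can be summarized as a checklist. The IPs below are illustrative, and the rescue reboot and support hand-off happen in the Control Panel and the ticket, so treat this as a reminder rather than an automation script:

```shell
# Checklist of the per-node steps described in this guide:
for ip in 192.168.0.1 192.168.0.2 192.168.0.3; do
  echo "Node $ip:"
  echo "  1. acli host.enter_maintenance_mode_check $ip"
  echo "  2. acli host.enter_maintenance_mode $ip wait=true"
  echo "  3. cvm_shutdown -P now            # on that node's CVM"
  echo "  4. Reboot the node in rescue-customer via the Control Panel"
  echo "  5. Update the support ticket and wait for OVHcloud to finish"
  echo "  6. acli host.exit_maintenance_mode $ip"
done
```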
Please do not open a new ticket; add comments to the same ticket for each node, specifying the name of the server.