Proxmox VE Hyperconverged Cluster Configuration Update Without Service Interruption


A five-node Proxmox VE hyperconverged cluster has been set up with two Ceph pools: a high-speed NVMe storage pool and a large-capacity SATA storage pool. The existing SATA disks now need to be removed and replaced with high-speed NVMe disks, without interrupting the services running on the cluster.

First, destroy the “hdd_pool” Ceph pool backed by the SATA mechanical disks. In the web management interface, select the pool, then click the “Destroy” button.
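For reference, the same step can be done from a shell on any cluster node. A minimal sketch, assuming the pool is named “hdd_pool” as above (the --remove_storages flag also removes the matching Proxmox VE storage definition):

    # Destroy the Ceph pool and remove its Proxmox VE storage entry
    pveceph pool destroy hdd_pool --remove_storages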

Note: You must destroy the Ceph pool first, and only then destroy the OSD disks that back it. If the order is reversed, the remaining OSDs will keep rebalancing data while each OSD is being destroyed, and once the cluster can no longer maintain the minimum required number of replicas, Ceph will report errors that are troublesome to recover from.
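Before touching any OSD, it is worth confirming that the pool is really gone and the cluster is healthy. A quick check from any node:

    # List the remaining pools; hdd_pool should no longer appear
    ceph osd pool ls
    # Confirm the cluster reports HEALTH_OK before proceeding
    ceph health detail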

Next, destroy the Ceph OSDs. This step is also necessary; otherwise, stale OSD entries will linger after the drives are removed and the node is rebooted (unsettling for those of us with OCD). Destroying the OSD disks that backed the “hdd_pool” pool takes three smaller steps: OSD Out, OSD Stop, and OSD Destroy.

Step 1: Mark the OSD out. Select the OSD disk you want to remove and click the “Out” button in the upper right corner of the Proxmox VE cluster web management interface. This tells Ceph to stop placing data on the disk and to migrate its existing data elsewhere.
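The equivalent CLI call, assuming the OSD in question has ID 5 (a made-up example; run “ceph osd tree” to see the real IDs and which node owns them):

    # Mark the OSD out so Ceph stops placing data on it
    ceph osd out 5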

Step 2: Stop the OSD. Select the OSD disk that is now in the “Out” state and click the “Stop” button in the upper right corner of the Proxmox VE cluster management interface; the OSD will then show as “Down”.
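From a shell on the node that hosts the OSD, the “Stop” button corresponds to stopping the OSD’s systemd unit (again using the hypothetical OSD ID 5):

    # Stop the OSD daemon; the OSD will then show as "down"
    systemctl stop ceph-osd@5.service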

To make sure it is safe to proceed, confirm that the Ceph cluster has finished rebalancing the OSD data. You can check in the Proxmox VE cluster web management interface or run “ceph health detail” on any cluster node. In the web interface, a healthy cluster shows all green (HEALTH_OK).
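The same check from the command line, plus an extra guard: “safe-to-destroy” only succeeds once removing the OSD can no longer cause data loss.

    # Overall cluster state; wait until no recovery/rebalance is in progress
    ceph -s
    # Detailed health report (HEALTH_OK when everything is green)
    ceph health detail
    # Optional guard: verify OSD 5 can be removed without losing data
    ceph osd safe-to-destroy osd.5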

Step 3: Destroy the OSD. Select the OSD disk that is in both the “Down” and “Out” states, click the “More” button in the upper right corner, and then click “Destroy” in the submenu.
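The CLI equivalent runs on the node that owns the OSD; the --cleanup flag also wipes the disk’s partitions so it can be reused (OSD ID 5 is still the hypothetical example):

    # Destroy the down+out OSD and clean up its disk
    pveceph osd destroy 5 --cleanup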

Follow these three steps to take all of the mechanical OSD disks out of service and destroy them. In addition to the graphical interface, you can also use the command line, as shown in the sketch below.
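Putting the three steps together, here is a sketch of the whole removal as a shell loop, run on the node that owns the OSDs. The IDs 5, 6, and 7 are placeholders for whatever “ceph osd tree” reports for the node’s SATA disks:

    # Hypothetical OSD IDs of the SATA disks being retired
    for id in 5 6 7; do
        ceph osd out "$id"                       # step 1: mark out
        systemctl stop "ceph-osd@$id.service"    # step 2: stop (down)
        # Wait until Ceph confirms the OSD can be removed safely
        while ! ceph osd safe-to-destroy "osd.$id"; do
            sleep 10
        done
        pveceph osd destroy "$id" --cleanup      # step 3: destroy
    done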

Next, shut down one physical server in the cluster, remove all of its SATA hard drives, and insert the new high-speed NVMe disks. If HA is configured, the virtual machines running on that node will automatically be moved to other nodes when it goes down; otherwise, live-migrate them to another node before shutting it down.
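If HA is not configured, guests can be moved off by hand first. A sketch with made-up IDs (VM 101, target node pve2):

    # Live-migrate VM 101 to node pve2 before shutting this node down
    qm migrate 101 pve2 --online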

Once the server with the new disks is powered back on, Proxmox VE will recognize the newly inserted NVMe disks. Before creating OSDs on them, make sure each new disk is completely blank, with no leftover partition tables or filesystem signatures.

If this preparation step is skipped, the “Create OSD” dialog in the next step may report that no usable hard disk is available.
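One way to blank a disk from the shell, assuming the new device shows up as /dev/nvme2n1 (a hypothetical name; double-check with lsblk before wiping anything):

    # Remove any leftover filesystem or RAID signatures
    wipefs -a /dev/nvme2n1
    # Zap any old GPT/MBR partition tables
    sgdisk --zap-all /dev/nvme2n1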

Switch to the Proxmox VE hyperconverged cluster web management interface, select the node where the new disks were just installed, and click the “Create OSD” button in the upper left corner. In the small window that pops up, select the new blank device, choose “NVMe” from the device class drop-down list, and click the “Create” button.
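The CLI equivalent, again with the hypothetical device name and assuming a pveceph version that supports setting the device class at creation time:

    # Create an OSD on the new disk with the "nvme" device class
    pveceph osd create /dev/nvme2n1 --crush-device-class nvme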

Repeat this process for the remaining disks to create the rest of the OSDs, then compare the overall capacity of the Ceph pool with what it was before.
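The capacity change can also be read from the shell:

    # Per-pool and total capacity
    ceph df
    # OSD layout, device classes, and weights per node
    ceph osd tree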

Repeat all of the above steps on each of the five nodes in the cluster, one node at a time. Because everything was thought through in advance, the whole process went very smoothly.

