Purpose of Creating Multiple Ceph Pools
In a Proxmox VE cluster, the virtual machines themselves run on high-speed NVMe disks, while their bulk data is placed on slower, cheaper, higher-capacity disks. For high availability and full resource utilization, all data except the Proxmox VE host system itself should live on Ceph distributed storage.
Solution Approach
Create one Ceph pool backed by several high-speed NVMe (or other SSD) disks and another Ceph pool backed by several large-capacity, lower-speed disks.
Test Environment
Three physical servers, each with one or more NVMe disks (capacities vary, which does not affect usage or performance here) and several low-speed SATA hard drives.
The platform is Proxmox VE 7.0 with Ceph 16.2.6 installed.
The server cluster has been set up, and all disks to be used have been initialized.
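If a disk still carries old partition tables or LVM signatures from previous use, it can be wiped from the command line before OSD creation; this is only a sketch, and the device path is an example that must be replaced with the actual disk:

# Destroy any leftover LVM/partition data so Ceph can claim the disk
ceph-volume lvm zap /dev/sdb --destroy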
Steps of the Experiment
The process can be roughly divided into: creating Ceph OSDs of different device classes, creating Ceph CRUSH rules, creating Ceph pools, creating a virtual machine, assigning disk space to it, and functional verification.
Creating Different Types of Ceph OSDs
1. Creating SATA Disk Ceph OSDs: log into the Proxmox VE web management interface, select the physical node, create an OSD, and set its device class to HDD.
Repeat this operation for all remaining available SATA disks.
2. Creating NVMe Disk Ceph OSDs: proceed in the same way, but set the device class to NVMe.
Repeat this operation for all remaining NVMe disks (a command-line alternative is sketched after the OSD listing below).
3. Verifying the Created Ceph OSDs: Run the command `ceph osd tree` on the Proxmox VE host system Debian to view the generated Ceph OSDs, as shown below:
root@pve3:~# ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME      STATUS  REWEIGHT  PRI-AFF
-1         9.18233  root default
-7         3.68178      host pve1
 0  hdd    1.81879          osd.0      up   1.00000  1.00000
 5  nvme   1.86299          osd.5      up   1.00000  1.00000
-3         2.75027      host pve2
 1  hdd    1.81879          osd.1      up   1.00000  1.00000
 4  nvme   0.93149          osd.4      up   1.00000  1.00000
-5         2.75027      host pve3
 2  hdd    1.81879          osd.2      up   1.00000  1.00000
 3  nvme   0.93149          osd.3      up   1.00000  1.00000
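For reference, the same OSDs could also be created on the command line; this is only a sketch, the device paths are placeholders, and it assumes the `--crush-device-class` option of `pveceph osd create` available in this Proxmox VE release:

# Create an OSD on a SATA disk and tag it with the "hdd" device class
pveceph osd create /dev/sdb --crush-device-class hdd
# Create an OSD on an NVMe disk and tag it with the "nvme" device class
pveceph osd create /dev/nvme0n1 --crush-device-class nvme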
Creating Ceph OSD Crush Rules
Create two CRUSH rules: one that selects NVMe OSDs and one that selects HDD OSDs. CRUSH rules cannot be created via the Proxmox VE web management interface, so this must be done on the Debian command line of a cluster node. The arguments to `ceph osd crush rule create-replicated` are the rule name, the CRUSH root, the failure domain, and the device class:
ceph osd crush rule create-replicated rule-nvme default host nvme
ceph osd crush rule create-replicated rule-hdd default host hdd
After executing these commands, run the following command to verify:
root@pve3:~# ceph osd crush rule ls
replicated_rule
rule-nvme
rule-hdd
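The placement constraint of each rule can also be inspected in detail; for the NVMe rule, for example:

# Dump the rule definition; the "take" step should reference the device class (default~nvme)
ceph osd crush rule dump rule-nvme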
Creating Ceph Pools
Switch to the Proxmox VE web management interface, select a node, and create the Ceph Pools. Here, two Ceph Pools are created to categorize the different OSDs.
1. Creating the NVMe Disk Pool: name the pool “NVMe_pool,” select “rule-nvme” as the Crush Rule, and click the “Create” button.
2. Creating the SATA Disk Pool: name the pool “HDD_pool,” select “rule-hdd” as the Crush Rule, and click the “Create” button (a command-line equivalent is sketched after step 3).
3. Validating the Ceph Pools: perform a “Volume Migration” operation in the Proxmox VE web management interface, then switch to the Debian command line of the host and run `ceph osd pool stats` to check which pool shows the resulting I/O activity.
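For reference, here is a minimal command-line sketch of the same pool setup, assuming the `--crush_rule` and `--add_storages` options of `pveceph pool create` in this release; adding the pools as storage lets them be selected for VM disks later:

# Create the NVMe-backed pool, bound to its CRUSH rule, and register it as PVE storage
pveceph pool create NVMe_pool --crush_rule rule-nvme --add_storages
# Create the HDD-backed pool in the same way
pveceph pool create HDD_pool --crush_rule rule-hdd --add_storages
# While a volume migration is running, watch which pool receives the client I/O
ceph osd pool stats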
Functionality Testing
Create a virtual machine in the Proxmox VE cluster, specifying “NVMe_pool” as the disk storage.
After creating the virtual machine and installing the operating system (CentOS was used for testing), add a second disk to the virtual machine and specify “HDD_pool” as its storage.
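The same setup can be reproduced with the `qm` command-line tool; the VM ID, disk sizes, and hardware options below are illustrative placeholders, and the storage IDs assume the pools were added as Proxmox VE storage under their pool names:

# Create the VM with its system disk on the NVMe-backed pool (32 GiB)
qm create 101 --name centos-test --memory 2048 --cores 2 \
  --net0 virtio,bridge=vmbr0 --scsi0 NVMe_pool:32
# Attach a second 100 GiB disk from the HDD-backed pool
qm set 101 --scsi1 HDD_pool:100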
Enter the virtual machine running CentOS, create a file system on the allocated disk, mount it, and manually create some files or directories.
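Inside the CentOS guest, the added disk typically appears as a new block device; the device name below is an assumption and should be confirmed with `lsblk` first:

lsblk                              # identify the newly attached disk
mkfs.xfs /dev/sdb                  # create a file system on it (device name assumed)
mkdir -p /mnt/hdd_pool
mount /dev/sdb /mnt/hdd_pool       # mount it
mkdir /mnt/hdd_pool/testdir        # create some test directories and files
echo "test data on HDD_pool" > /mnt/hdd_pool/testfile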
Back in the Proxmox VE web management interface, add the virtual machine (or container) to the high-availability cluster.
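Equivalently, the guest can be put under HA management from the command line; the VM ID is the placeholder used above:

# Add the VM as an HA resource and request the "started" state
ha-manager add vm:101 --state started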
Shut down the physical server hosting the newly created and configured virtual machine, and watch the cluster status in the Proxmox VE web management interface: once the node is reported offline, check whether the virtual machine is migrated.
After a few minutes, the virtual machine (and container) successfully migrated to an operational physical node.
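The failover result can be cross-checked from any surviving node, for example:

ha-manager status      # shows each HA resource and the node it is currently running on
qm list                # run on the target node to confirm the VM is now running there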