Checklist for migrating Proxmox's main storage with ZFS
- hosting
- server
- servers
- infrastructure
- zfs
The problem

I have Proxmox running as the main OS on my HPE DL580 G9, and when I set it up I used a spare old 1TB M.2 that was in really good condition, changing the disk settings from ext4 to ZFS during the install. TRIM wasn't enabled for a long time, and I had 40+ simultaneous VMs all pounding on that drive. It was going to fail. No ifs, ands, or buts: wearout is a measurable metric, and the drive is now at 99%. When an SSD/M.2 hits 100% wearout the whole drive becomes read-only.

I obtained a replacement 1TB M.2 from a very generous friend, so now I had to move everything from the almost-dead drive to the new one. So much for my 78-day uptime streak. Another generous friend provided a second M.2 sled for the new drive, which made the migration possible. My initial thought was to just `dd if=/dev/driveA of=/dev/driveB`, but it turns out the new drive is actually smaller than the old one. They're both 1TB drives, but the old one was reporting 1.02TB, so a direct dd isn't going to do it.
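Before ruling out a block-level copy, it's easy to confirm the size mismatch from the shell. A quick check, with the understanding that the device names below (/dev/nvme0n1 and /dev/nvme1n1) are only placeholders for whatever names your drives actually get:

```
# Print exact byte counts for both drives (device names are examples; adjust to yours)
lsblk -b -o NAME,SIZE,MODEL /dev/nvme0n1 /dev/nvme1n1

# Or query each device directly
blockdev --getsize64 /dev/nvme0n1
blockdev --getsize64 /dev/nvme1n1
```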
What's actually involved?
The solution

1. Install the new drive in the server.
2. Create a VM with a Proxmox ISO (preferably the same version your install is already running, so the kernels match). Do not give it a storage drive.
   a. If the host uses EFI, make sure the VM uses EFI as well.
3. Pass the new drive to the VM:
   `qm set (vmID) -scsi0 /dev/(newdrive)`
4. Install Proxmox on the new drive from inside the VM, making sure the settings match the old Proxmox install. This walkthrough used ZFS.
5. Shut down the VM (and remove it).
6. Use filesystem utilities to duplicate the existing data.
   a. ZFS example (the full command sequence is also collected in a sketch after this list).
   b. Snapshot the existing drive. Proxmox uses `rpool` as the default pool name.
      i. `sudo zfs snapshot -r rpool@latest`
         This creates a recursive snapshot of rpool called `latest`; that name matters for a later step.
   c. Import the pool just created by the (now powered-off) Proxmox VM.
      i. `sudo zpool import -d /dev`
         This lists the pools available for import; note the (poolid) for the next command.
      ii. `sudo zpool import (poolid) rpool2`
         This imports the pool created by the VM onto the host machine.
   d. Copy the data.
      i. `sudo zfs send -R rpool@latest | sudo zfs recv -F rpool2`
         This sends the snapshot created in step 6.b.i and receives it into the new pool (created at step 4) that was imported at step 6.c.ii.
   e. Prepare for the old drive's removal.
      i. `sudo zpool export rpool2`
         This detaches the pool that is now named rpool2 (from step 6.c.ii). If you skip this, the system will drop into initramfs on every boot.
7. Power off the machine.
8. Remove the old drive.
9. Power on the machine.
10. If necessary, change the primary boot device in the BIOS.
11. The system will boot into initramfs.
12. At the (initramfs) prompt, run:
    `zpool import rpool2 rpool`
    This renames the pool from rpool2 to rpool. If it asks for a force, run step 6.e.i (without the sudo) to clear the error telling you to rerun with -f. Do not run it with -f; that will work for this boot, but the rename won't persist across reboots.
    ALTERNATIVE:
    `zpool import -d /dev`
    `zpool import (poolid) rpool`
    This is the same as step 6.c, except the pool is imported as rpool instead of rpool2.
13. At the (initramfs) prompt, run:
    `exit`
    This continues booting the system.
14. Fix `/etc/kernel/proxmox-boot-uuids` (see the second sketch after this list).
    a. Remove the old drive's UUID from the file.
    b. Find the new UUID:
       `ls -lha /dev/disk/by-uuid/*`
       This shows which UUID points to the raw device of the boot partition. It will be in the short 4-4 hex format (e.g. A34B-1234).
       You're not looking for the UUID of the ZFS partition; you want the EFI partition (by default Proxmox puts it on the second partition, with ZFS on the third).
    c. Put the new UUID into the file.
15. Update the bootloader.
    a. `sudo update-initramfs -u -k all`
       Regenerates all the initramfs images with the current settings.
    b. `pve-efiboot-tool refresh`
       Updates Proxmox's boot partition.
16. If the migration target is an SSD or M.2, make sure automatic TRIM is enabled:
    `zpool set autotrim=on rpool`
17. Final reboot; confirm everything is working as intended.
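For convenience, here is the ZFS portion of step 6 collected into one copy-pasteable sequence. This is a minimal sketch, assuming the default pool name `rpool` on the old drive, the temporary name `rpool2` for the new install's pool, and `<poolid>` standing in for whatever id `zpool import -d /dev` reports:

```
# Run on the host after the installer VM has been shut down.

# 6.b.i: recursive snapshot of the existing root pool
sudo zfs snapshot -r rpool@latest

# 6.c.i: list importable pools and note the id of the new install's pool
sudo zpool import -d /dev

# 6.c.ii: import the new install's pool under a temporary name
sudo zpool import <poolid> rpool2

# 6.d.i: replicate everything from the old pool into the new one
sudo zfs send -R rpool@latest | sudo zfs recv -F rpool2

# 6.e.i: cleanly detach the new pool before powering off and pulling the old drive
sudo zpool export rpool2
```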
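And a second sketch for the boot repair in steps 14 through 16, run once the system is back up on the new drive. The OLD_UUID and NEW_UUID values are placeholders you have to fill in from the `ls` output yourself, and the sed/tee edit of `/etc/kernel/proxmox-boot-uuids` is just one way to do it; editing the file by hand works just as well:

```
# Find the EFI partition's short hex UUID (e.g. A34B-1234) on the new drive
ls -lha /dev/disk/by-uuid/*

# Placeholder values; substitute the UUIDs from your own system
OLD_UUID="AAAA-1111"
NEW_UUID="BBBB-2222"

# Drop the old entry and append the new one
sudo sed -i "/$OLD_UUID/d" /etc/kernel/proxmox-boot-uuids
echo "$NEW_UUID" | sudo tee -a /etc/kernel/proxmox-boot-uuids

# Rebuild the initramfs images and refresh Proxmox's boot partition
sudo update-initramfs -u -k all
sudo pve-efiboot-tool refresh

# If the target drive is an SSD/M.2, enable automatic TRIM on the pool
sudo zpool set autotrim=on rpool
```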
Results

It worked. The system is back up and running, and after following all the steps it reboots normally and starts all the workload VMs like it always has.
Random thoughts

I hate rebooting this server. It can take 1-2 hours to restart all the VMs and get every service back up and running, so a reboot is a last resort. Before doing any of this I migrated all the VMs off the M.2 to Ceph on spinning-rust drives, so the only things left on the M.2 were Proxmox itself (plus Ceph, some monitoring tools, and so on). That made the drive last much longer than it would have otherwise: I was down to about two weeks of remaining drive life when it hit 92% wearout, and moving the VMs off stretched that to nearly three months.