Essinghigh.Dev - ZFS: Migration from Mirrors to RAIDZ

Background

My existing ZFS pool consists of six mirrored vdevs, each with two 14TB disks. This gives me six disks' worth of redundancy (one disk per vdev). If I were to lose both disks in the same vdev, I would lose the entire pool. However, this is a commonly recommended "safe" configuration, as it provides a good amount of redundancy and performs well in terms of IOPS.

I'm going to be moving to two RAIDZ vdevs, each with six 14TB disks. This will give me one disk's worth of redundancy per vdev. That is quite a bit less redundancy than the current setup, but I'm always physically close to the server and can have a number of cold-spare disks on hand. I'm expecting this to take me (extremely roughly) from six disks' worth of usable space to ten disks' worth, which is a significant increase.
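As a rough sanity check on those numbers, here is a minimal sketch of the arithmetic. The function name is my own, and it counts whole disks only, ignoring metadata, padding and reserved space, so treat the result as an upper bound rather than what the pool will actually report.

```shell
# usable_disks VDEVS DISKS_PER_VDEV REDUNDANT_PER_VDEV
# Prints the usable capacity in "disks' worth" (hypothetical helper, whole disks only).
usable_disks() {
    local vdevs=$1 width=$2 redundant=$3
    echo $(( vdevs * (width - redundant) ))
}

usable_disks 6 2 1   # six two-way mirrors: prints 6
usable_disks 2 6 1   # two six-wide RAIDZ1 vdevs: prints 10
```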

My chassis only supports 12 disks, so I do not expect to add any more disks, but I may replace the existing disks with larger ones in the future. Because of this I'd like to keep the vdevs narrower, as opposed to using something like a single RAIDZ2/3 vdev with all twelve disks, as I'd then need to replace all twelve disks with larger ones to increase the pool size. With two RAIDZ vdevs, I only need to replace the disks in one vdev (six disks) to increase the total pool size, which is much more manageable.

The Plan

As mirrors can't be converted directly to RAIDZ, the plan is to remove three of the mirror vdevs (totalling 6 disks) from the existing pool, then create a new pool with one RAIDZ vdev using those six disks. Once the new pool is created, I will replicate the data from the old pool to the new pool. After that has been done and I've tested the new pool, I will destroy the old pool and add the remaining six disks to the new pool, creating a second RAIDZ vdev.
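For reference, the CLI equivalent of the removal step might look roughly like the sketch below. I'm doing the real removals through the WebUI, so this helper only prints the commands instead of running them. The vdev names (mirror-3 and so on) are placeholders; the real names come from zpool status. As far as I understand it, each removal has to finish evacuating before the next one can start, so the commands are meant to be run one at a time.

```shell
# Print (do not run) the commands for removing the given top-level vdevs.
# plan_mirror_removal POOL VDEV...   (hypothetical helper; vdev names are placeholders)
plan_mirror_removal() {
    local pool=$1 vdev
    shift
    for vdev in "$@"; do
        echo "zpool remove $pool $vdev"   # starts evacuating this vdev's data
        echo "zpool status $pool"         # check removal progress before continuing
    done
}

plan_mirror_removal data mirror-3 mirror-4 mirror-5
```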

As I use TrueNAS, I'll remove the mirrors using the WebUI to make sure the middleware is properly aware of the changes (from my testing, I should be able to use the CLI without issue; however, I'd rather be safe than sorry when doing pool-wide operations like this). Then I'll create the new pool using the WebUI. Once the new pool is created, I'll use the CLI to recursively snapshot the datasets in the old pool, then replicate that snapshot to the new pool. This will look something like this:

root@truenas[~]# zfs snapshot -r data@raidz_migration
root@truenas[~]# zfs send -Rnv data@raidz_migration | grep total
total estimated size is 19.7T
root@truenas[~]# zfs send -R data@raidz_migration | pv | zfs receive -F data_new
250GiB 0:10:00 [ 485MiB/s] [ <=>          ]
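Before renaming anything, one cheap check is to confirm that every snapshot on the old pool also exists on the new one. The sketch below is only an assumption about how that could be done: the helper names are mine, and it compares snapshot names, not the data itself. Any snapshots taken on the old pool after raidz_migration (periodic snapshot tasks, for example) would show up as differences.

```shell
# List snapshot names under a pool, with the leading pool name stripped so that
# "data/foo@snap" and "data_new/foo@snap" compare equal.
snapshot_names() {
    local pool=$1
    zfs list -H -o name -t snapshot -r "$pool" | sed "s|^${pool}||" | sort
}

# compare_pool_snapshots OLD NEW
# Prints any differences and returns non-zero if the two lists do not match.
compare_pool_snapshots() {
    local old=$1 new=$2 tmp status
    tmp=$(mktemp -d) || return 2
    snapshot_names "$old" > "$tmp/old"
    snapshot_names "$new" > "$tmp/new"
    diff "$tmp/old" "$tmp/new"
    status=$?
    rm -rf "$tmp"
    return $status
}

# Usage (on the real system): compare_pool_snapshots data data_new && echo "snapshot lists match"
```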

Once the replication is complete, I'll rename the pools like so:

WebUI -> Export both pools
root@truenas[~]# zpool import data data_old
root@truenas[~]# zpool import data_new data
root@truenas[~]# zpool export data_old
root@truenas[~]# zpool export data
WebUI -> Import data
WebUI -> Import data_old

Now the new RAIDZ pool will have the name data, and the old pool will have the name data_old. I can then destroy the old pool via the TrueNAS WebUI and add the remaining six disks to the new pool as a second RAIDZ vdev.

Then I'll need to rebalance the pool, as all of the data will still be on the first vdev. There isn't an official way to do this, but I've had success with zfs-inplace-rebalancing. It runs through the pool and copies each file, giving the copy a .balance suffix; the original file is then deleted and the .balance suffix removed. This causes the newly written data to be distributed across both vdevs, which is what we want. However, as ZFS write distribution is quite complex, the result will likely not be perfect, and I may need to give it a couple of passes. This will take a long time.
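To make the copy, delete and rename cycle concrete, here is a simplified sketch of what happens to a single file. This is my own illustration of the idea, not the actual zfs-inplace-rebalancing script, and the function name is made up. It also assumes cp really rewrites the data; where block cloning is in play, a copy may share blocks with the original instead, which would defeat the purpose.

```shell
# rebalance_file FILE
# Rewrites FILE in place via a temporary ".balance" copy (simplified illustration only).
rebalance_file() {
    local file=$1 tmp="$1.balance"
    # Writing a fresh copy makes ZFS allocate new blocks across the current vdevs.
    cp -p -- "$file" "$tmp" || return 1
    # Only touch the original if the copy is byte-for-byte identical.
    if ! cmp -s -- "$file" "$tmp"; then
        rm -f -- "$tmp"
        return 1
    fi
    # Delete the original, then drop the .balance suffix from the copy.
    rm -- "$file" && mv -- "$tmp" "$file"
}
```

Applied to a whole dataset, this would be wrapped in something like a find over every regular file, which is why a full pass takes so long.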

I expect each vdev removal to take around four to six hours, and the replication to take around nine to thirteen hours. As for the rebalancing, I'm not sure how long it will take: most likely at least twelve hours, but it could be much longer. Luckily, I decided to start this on a bank holiday weekend, so I've got three days to work with.

2025-05-25 12:56

I completed the vdev removals and pool creation as I finished writing this post, so I've just kicked off the replication task. So far all has gone smoothly, and I've transferred 502GiB in the first 19 minutes. Assuming the same rate of transfer, I should be done in about 13 hours.
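The estimate is just a linear extrapolation from the first sample. A minimal sketch of the arithmetic, assuming the 19.7T reported by zfs send is in TiB and that the transfer rate stays constant (the function name is my own):

```shell
# transfer_estimate TOTAL_TIB DONE_GIB ELAPSED_MIN
# Prints the observed rate and the projected total duration at that rate.
transfer_estimate() {
    awk -v total_tib="$1" -v done_gib="$2" -v minutes="$3" 'BEGIN {
        rate_mib_s  = done_gib * 1024 / (minutes * 60)
        total_hours = total_tib * 1024 / done_gib * minutes / 60
        printf "%.0f MiB/s, about %.1f hours in total\n", rate_mib_s, total_hours
    }'
}

transfer_estimate 19.7 502 19   # prints: 451 MiB/s, about 12.7 hours in total
```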

2025-05-25 21:36

Replication is going pretty smoothly. It's been 7 hours and 40 minutes, and I have transferred 11.4TiB so far at an average of 431MiB/s. I'm estimating somewhere around 5 hours until completion.

2025-05-26 10:39

I fell asleep waiting for the replication to finish; it probably completed about 20 minutes after I dropped off. I exported both pools, imported them with their new names, and then exported them again. I then imported the new pool via the TrueNAS WebUI and added the remaining six disks to the new pool as a second RAIDZ vdev (destroying the old pool in the process).

root@truenas[~]# zpool iostat data -v | grep -E "alloc|raidz|data"
pool                                      alloc   free   read  write   read  write
data                                      24.5T   127T      4     78  82.4K   597K
  raidz1-0                                24.5T  51.9T      4     57  82.2K   437K
  raidz1-1                                8.45M  74.8T      0     37    406   286K

The allocated space is entirely on the first vdev, as expected. Now I need to rebalance the pool, which will take an unknown amount of time (though it looks to be going faster than I'd expected). I have kicked this off and am waiting for it to complete.

Progress -- Files: 3877/149871 (2.58%)

2025-05-26 15:57

Progress -- Files: 77251/149871 (51.54%)

2025-05-27 09:33

Finally complete. It finished while I was asleep, probably sometime around 05:00, so the rebalancing took about 19 hours.

root@truenas[~]# zpool iostat data -v | grep -E "alloc|raidz|data"
pool                                      alloc   free   read  write   read  write
data                                      23.4T   128T  1.33K    728   288M   289M
  raidz1-0                                8.73T  67.6T  1.33K    269   288M   108M
  raidz1-1                                14.7T  60.1T      0    460     77   182M

As I expected, the allocation is still not nicely balanced, so I'll need to run the rebalancing script again. This will probably take another 19 hours, unless I figure out which files have not been properly balanced and run the script against them specifically.
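To put a number on the imbalance between passes, the alloc column can be turned into a percentage per vdev. This is a rough sketch with a made-up helper name; it assumes the human-readable suffixes (K, M, G, T, P) printed by zpool iostat -v and only looks at lines whose name starts with raidz.

```shell
# vdev_alloc_share: reads `zpool iostat -v` output on stdin and prints each
# raidz vdev's share of the total allocated space.
vdev_alloc_share() {
    awk '
        # Convert a value such as "8.73T" or "8.45M" to GiB.
        function to_gib(v,    n, u) {
            u = substr(v, length(v)); n = v + 0
            if (u == "K") return n / 1024 / 1024
            if (u == "M") return n / 1024
            if (u == "G") return n
            if (u == "T") return n * 1024
            if (u == "P") return n * 1024 * 1024
            return n / 1024 / 1024 / 1024   # no suffix: plain bytes
        }
        $1 ~ /^raidz/ { name[++count] = $1; alloc[count] = to_gib($2); total += alloc[count] }
        END {
            if (total > 0)
                for (i = 1; i <= count; i++)
                    printf "%s %.1f%%\n", name[i], 100 * alloc[i] / total
        }
    '
}

# The two vdev lines from the output above:
printf '  raidz1-0  8.73T  67.6T\n  raidz1-1  14.7T  60.1T\n' | vdev_alloc_share
# prints:
# raidz1-0 37.3%
# raidz1-1 62.7%
```

On the live system the input would come from piping zpool iostat -v data into the function instead of the printf.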