As we all enter the third month of the COVID-19 pandemic and look for new projects to stay engaged (read: sane), may we interest you in learning the basics of computer storage? Quietly this spring, we've already gone over some necessary basics, like how to test the speed of your disks and what RAID is. In the second of those stories, we even promised a follow-up that explores the performance of various multi-disk topologies on ZFS, the next-generation file system you've heard about due to its appearance everywhere from Apple to Ubuntu.

Well, today is the day to explore, curious ZFS readers. Just know ahead of time that, in the words of OpenZFS developer Matt Ahrens, "it's really complicated."

But before we get to the numbers (and they are coming, I promise!) for all the ways you can shape eight disks' worth of ZFS, we need to talk about how ZFS stores your data on disk in the first place.

Zpools, vdevs and devices

This full pool diagram includes one of each of the three support vdev classes and four RAIDz2 storage vdevs.

You normally wouldn't want to build a "mutt" pool of mismatched vdev types and sizes, but nothing will stop you if that's what you want to do.

To really understand ZFS, you need to pay real attention to its actual structure. ZFS merges the traditional file system and volume management layers, and it uses a transactional copy-on-write mechanism; both of these mean the system is structurally very different from conventional file systems and RAID arrays. The first set of major building blocks to understand are zpools, vdevs, and devices.

zpool

The zpool is the uppermost ZFS structure. A zpool contains one or more vdevs, each of which in turn contains one or more devices. Zpools are self-contained units: one physical computer may have two or more separate zpools, but each is entirely independent of the others. Zpools cannot share vdevs with one another.
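The containment hierarchy described above can be sketched as a small data model. This is only an illustration, not ZFS code: the class names, the `shared_vdevs` helper, and the pool and device names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Device:
    path: str  # hypothetical device name, e.g. "/dev/sda"


@dataclass
class Vdev:
    topology: str  # e.g. "single", "mirror", "raidz2"
    devices: List[Device] = field(default_factory=list)


@dataclass
class Zpool:
    name: str
    vdevs: List[Vdev] = field(default_factory=list)


def shared_vdevs(a: Zpool, b: Zpool) -> bool:
    """Return True if the same vdev object appears in both pools.

    Per the text, this must never be the case: pools are independent.
    """
    ids_a = {id(v) for v in a.vdevs}
    return any(id(v) in ids_a for v in b.vdevs)


# Two independent pools on one (imaginary) machine.
tank = Zpool("tank", [Vdev("mirror", [Device("/dev/sda"), Device("/dev/sdb")])])
backup = Zpool("backup", [Vdev("single", [Device("/dev/sdc")])])
```

Here `tank` holds one two-device mirror vdev and `backup` holds one single-device vdev; `shared_vdevs(tank, backup)` is false because each pool owns its own vdevs.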

ZFS redundancy is at the vdev level, not the zpool level. There is absolutely no redundancy at the zpool level: if any storage vdev or SPECIAL vdev is lost, the entire zpool is lost with it.

Modern zpools can survive the loss of a CACHE or LOG vdev, although they may lose a small amount of dirty data if they lose a LOG vdev during a power outage or system crash.
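The survival rule in the last two paragraphs reduces to a simple check. The sketch below is a toy restatement of that rule, with hypothetical names (`FATAL_CLASSES`, `SURVIVABLE_CLASSES`, `pool_survives`) and lowercase class labels chosen for illustration; it does not model the dirty-data caveat for LOG vdevs.

```python
# Vdev classes whose loss takes the whole pool down, per the text above.
FATAL_CLASSES = {"storage", "special"}
# Vdev classes a modern pool can lose and keep running.
SURVIVABLE_CLASSES = {"cache", "log"}


def pool_survives(lost_vdev_classes):
    """Return True if the pool survives losing vdevs of the given classes."""
    for cls in lost_vdev_classes:
        if cls not in FATAL_CLASSES | SURVIVABLE_CLASSES:
            raise ValueError(f"unknown vdev class: {cls!r}")
    # Losing even one storage or special vdev is fatal.
    return not any(cls in FATAL_CLASSES for cls in lost_vdev_classes)
```

For example, `pool_survives(["cache", "log"])` is true, while `pool_survives(["log", "storage"])` is false because a single lost storage vdev is enough to lose the pool.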

It is a common misconception that ZFS "stripes" writes across the pool, but this is inaccurate. A zpool is not a funny-looking RAID0; it's a funny-looking JBOD, with a complex distribution mechanism subject to change.

For the most part, writes are distributed across the available vdevs in accordance with their available free space, so that all vdevs will theoretically become full at the same time. In more recent versions of ZFS, vdev utilization may also be taken into account: if one vdev is significantly busier than another (for example, due to read load), it may be skipped temporarily for writes despite having the highest ratio of free space available.
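As a rough intuition for that behavior, here is a toy model that sends each new block to whichever non-busy vdev has the highest ratio of free space. It is a deliberate simplification under stated assumptions, not the real ZFS allocator: the dictionary layout, the function names, and the `busy` set are all invented for this sketch.

```python
def free_ratio(vdev):
    """Fraction of the vdev's capacity that is still free."""
    return (vdev["size"] - vdev["used"]) / vdev["size"]


def pick_vdev(vdevs, busy=()):
    """Pick the non-busy vdev with the highest free-space ratio.

    If every vdev is marked busy, fall back to considering all of them.
    """
    candidates = [v for v in vdevs if v["name"] not in busy]
    if not candidates:
        candidates = list(vdevs)
    return max(candidates, key=free_ratio)


def write_blocks(vdevs, n_blocks, busy=()):
    """Write n_blocks one at a time, each to the currently preferred vdev."""
    for _ in range(n_blocks):
        pick_vdev(vdevs, busy)["used"] += 1


# One half-full vdev and one empty vdev of the same size (units are arbitrary).
pool = [
    {"name": "vdev-a", "size": 100, "used": 50},
    {"name": "vdev-b", "size": 100, "used": 0},
]
write_blocks(pool, 50)
```

In this toy run, all 50 blocks land on the emptier `vdev-b`, leaving both vdevs half full, which is the "fill at the same time" tendency the text describes. Passing `busy={"vdev-b"}` instead would steer writes to `vdev-a` despite its lower free-space ratio.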

The utilization awareness mechanism built into modern ZFS write distribution methods can decrease latency and increase performance during periods of unusually high load, but it should not be mistaken for carte blanche to carelessly mix slow rust disks and fast SSDs in the same pool. Such a mismatched pool will still generally perform as though it were composed entirely of the slowest device present.

vdev

Every zpool consists of one or more vdevs (short for virtual device). Each vdev, in turn, consists of one or more devices. Most vdevs are used for plain storage, but several special support classes of vdev exist as well, including CACHE, LOG, and SPECIAL. Each of these vdev types can offer one of five topologies: single device, RAIDz1, RAIDz2, RAIDz3, or mirror.

RAIDz1, RAIDz2, and RAIDz3 are special varieties of what storage greybeards call "diagonal parity RAID." The 1, 2, and 3 refer to how many parity blocks are allocated to each data stripe. Rather than having entire disks dedicated to parity, RAIDz vdevs distribute that parity evenly across the disks. A RAIDz array can lose as many disks as it has parity blocks; if it loses another, it fails and takes the zpool down with it.

Mirror vdevs are exactly what they sound like: in a mirror vdev, each block is stored on every device in the vdev. Although two-wide mirrors are the most common, a mirror vdev can contain any arbitrary number of devices; three-way mirrors are common in larger setups for their higher read performance and fault resistance. A mirror vdev can survive any failure as long as at least one device in the vdev remains healthy.

Single-device vdevs are also exactly what they sound like, and they're inherently dangerous. A single-device vdev cannot survive any failure, and if it's being used as a storage or SPECIAL vdev, its failure will take the entire zpool down with it. Be very, very careful here.
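The fault tolerance of the five topologies can be summarized in a few lines. This is a minimal sketch of the rules stated above, with hypothetical function names and lowercase topology labels; it counts failed devices only and ignores everything else that can go wrong with a vdev.

```python
def tolerated_failures(topology, n_devices):
    """How many device failures a vdev of this shape can absorb."""
    if topology == "single":
        if n_devices != 1:
            raise ValueError("a single-device vdev has exactly one device")
        return 0  # any failure is fatal
    if topology == "mirror":
        if n_devices < 2:
            raise ValueError("a mirror needs at least two devices")
        return n_devices - 1  # survives while one device stays healthy
    if topology in ("raidz1", "raidz2", "raidz3"):
        parity = int(topology[-1])  # the 1, 2, or 3 in the name
        if n_devices <= parity:
            raise ValueError("need more devices than parity blocks")
        return parity  # can lose as many disks as parity blocks
    raise ValueError(f"unknown topology: {topology!r}")


def vdev_survives(topology, n_devices, failed):
    """Return True if the vdev is still alive after `failed` device losses."""
    return failed <= tolerated_failures(topology, n_devices)
```

Under these rules an eight-wide RAIDz2 vdev tolerates two failed disks, a three-way mirror tolerates two, and a single-device vdev tolerates none. Remember that a failed storage or SPECIAL vdev takes its whole zpool with it.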