A ZFS pool (Zpool) is a collection of one or more virtual devices (vdevs); a vdev is a group of physical disks. The following facts apply.
- A vdev can be a single drive, a mirror, RAID-Z1, RAID-Z2, or RAID-Z3.
- After a Zpool is created, it is generally not possible to add disks to an existing vdev; mirrors are the exception.
- Adding more vdevs to expand the Zpool is possible.
- The storage space allocated to the Zpool generally cannot be decreased.
- The drives in a vdev that is part of the Zpool can be replaced.
If the layout of the Zpool needs to change, the data should be backed up, the Zpool destroyed and recreated with the new layout, and the data restored.
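The rules above can be sketched as a toy model. This is not how ZFS is implemented; the `Pool` and `Vdev` classes and the disk names are made up for illustration, and the model follows the simplified rule from these notes that only mirrors accept an extra disk.

```python
from dataclasses import dataclass, field


@dataclass
class Vdev:
    kind: str   # "single", "mirror", "raidz1", "raidz2" or "raidz3"
    disks: list


@dataclass
class Pool:
    vdevs: list = field(default_factory=list)

    def add_vdev(self, vdev):
        # Expanding the pool with a whole new vdev is allowed.
        self.vdevs.append(vdev)

    def attach_disk(self, index, disk):
        # In this model only a mirror accepts an additional disk.
        vdev = self.vdevs[index]
        if vdev.kind != "mirror":
            raise ValueError(f"cannot add a disk to a {vdev.kind} vdev")
        vdev.disks.append(disk)

    def replace_disk(self, index, old, new):
        # Exchanging a member disk leaves the layout unchanged.
        disks = self.vdevs[index].disks
        disks[disks.index(old)] = new


pool = Pool()
pool.add_vdev(Vdev("mirror", ["sda", "sdb"]))
pool.add_vdev(Vdev("raidz1", ["sdc", "sdd", "sde"]))
pool.attach_disk(0, "sdf")          # the mirror grows to three disks
pool.replace_disk(1, "sdd", "sdg")  # exchange one RAID-Z member
```

There is deliberately no method for removing a vdev, which mirrors the point that the pool does not shrink.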
A dataset is a space that emulates a regular file system.
Datasets can be nested, and each one can have its own settings for snapshots, compression, deduplication and so on.
A volume (zvol) is a space that emulates a block device.
The copy-on-write mechanism keeps the old data on the disk while new data is written elsewhere.
Checksum information is written when data is written to disk, then verified when the data is read back. When a checksum mismatch is detected, redundant data is used for correction.
Different checksum algorithms are available:
- Fletcher-based checksum
- SHA-256 hash
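As a rough illustration of the two families, the sketch below computes a Fletcher-style checksum with four running sums and a SHA-256 hash. The word size (32-bit), byte order (little-endian), zero padding and 64-bit wrap-around are assumptions of this sketch, so the values are not guaranteed to match what ZFS stores on disk.

```python
import hashlib
import struct

MASK64 = (1 << 64) - 1


def fletcher4(data: bytes):
    # Pad to a multiple of 4 bytes (an assumption of this sketch).
    if len(data) % 4:
        data += b"\x00" * (4 - len(data) % 4)
    a = b = c = d = 0
    for (word,) in struct.iter_unpack("<I", data):
        # Each sum feeds the next, so the result depends on word order too.
        a = (a + word) & MASK64
        b = (b + a) & MASK64
        c = (c + b) & MASK64
        d = (d + c) & MASK64
    return a, b, c, d


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


block = struct.pack("<II", 1, 2)
corrupted = struct.pack("<II", 1, 3)

print(fletcher4(block))                          # (3, 4, 5, 6)
print(fletcher4(block) != fletcher4(corrupted))  # True
print(sha256_hex(block) != sha256_hex(corrupted))  # True
```

The Fletcher-style sum is cheap to compute, while SHA-256 is a cryptographic hash and costs more CPU time.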
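The vdev redundancy levels compare to conventional RAID levels as follows:
- Single – the vdev consists of a single disk and has no redundancy; a Zpool striped across such vdevs is similar to RAID0.
- Mirror – similar to RAID1.
- RAIDZ1 – similar to RAID5 but without the write hole issue.
- RAIDZ2 – similar to RAID6; tolerates the failure of 2 disks.
- RAIDZ3 – similar to RAID6 but with a third parity; tolerates the failure of 3 disks.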
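A small worked calculation of what each level costs in capacity and buys in fault tolerance. This is an approximation under stated assumptions: all disks in a vdev have the same size, and padding, metadata and reserved space are ignored, so real usable space will be lower. The function names are made up for this example.

```python
PARITY = {"raidz1": 1, "raidz2": 2, "raidz3": 3}


def usable_disks(kind: str, disks: int) -> int:
    """Number of disks' worth of space available for data."""
    if kind in ("single", "mirror"):
        return 1  # a mirror stores the same data on every member
    if disks <= PARITY[kind]:
        raise ValueError(f"{kind} needs more than {PARITY[kind]} disks")
    return disks - PARITY[kind]


def tolerated_failures(kind: str, disks: int) -> int:
    """Number of disks that can fail without losing data."""
    if kind == "single":
        return 0
    if kind == "mirror":
        return disks - 1
    return PARITY[kind]


# Six 4 TB disks in one RAIDZ2 vdev: roughly four disks of data space.
print(usable_disks("raidz2", 6) * 4)   # 16
print(tolerated_failures("raidz2", 6))  # 2
```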
The RAID write hole in RAID5/RAID1 occurs when, for example after a power loss in the middle of a write, one of the member disks no longer matches the others. Because RAID5/RAID1 has only single redundancy and no checksums, it is impossible to tell which of the disks holds the bad data.
ZFS is a self-healing system. If a checksum mismatch is detected, ZFS tries to retrieve the data from the other disks. If that data is correct, ZFS repairs the damaged copy and its checksum.
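The read-verify-repair cycle can be sketched for a two-way mirror. This is a simplified model, not ZFS code: the copies are plain byte strings in a list, the expected checksum is assumed to be stored separately from the data, and `read_with_repair` is a made-up name.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def read_with_repair(copies: list, expected: bytes) -> bytes:
    """Return a copy that matches `expected` and overwrite damaged copies."""
    good = next((c for c in copies if checksum(c) == expected), None)
    if good is None:
        raise IOError("no intact copy left")
    for i, copy in enumerate(copies):
        if checksum(copy) != expected:
            copies[i] = good  # heal the damaged copy
    return good


original = b"important data"
expected = checksum(original)

# The first copy has been silently corrupted on disk.
mirror = [b"important dat\x00", original]

data = read_with_repair(mirror, expected)
print(data == original)       # True
print(mirror[0] == original)  # True, the bad copy was rewritten
```

Because the checksum identifies which copy is wrong, the ambiguity of the write hole does not arise in this model.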
If a disk in a Zpool fails, the pool is set to the degraded state. The data that was on the failed device is then reconstructed and written to the spare disk that replaces it. This is called resilvering. Once the restoration is complete, the status of the Zpool changes back to online. If multiple disks fail and there is not enough redundancy left, the Zpool changes its state to unavailable.
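How the missing data can be "calculated" is easiest to see with single parity. The sketch below uses plain XOR parity over equally sized blocks, as in textbook RAID5; real RAID-Z uses variable-width stripes, and the higher parity levels need more than XOR, so treat this only as an illustration of the principle.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte blocks together, column by column."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


data_disks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_disks)

# The disk holding b"BBBB" fails; rebuild it from the survivors plus parity.
survivors = [data_disks[0], data_disks[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt)  # b'BBBB'
```

With two failed disks the same XOR no longer has enough information, which corresponds to the pool becoming unavailable when redundancy runs out.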
Migrating to a different system
On the old system, export the Zpool, which unmounts the Zpool's datasets and zvols.
On the new system, import the Zpool, which mounts the Zpool's datasets and zvols.
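The two steps map to the `zpool export` and `zpool import` commands. The sketch below only builds and prints the command lines instead of running them; the pool name `tank` and the helper name `migration_steps` are placeholders for this example.

```python
import shlex


def migration_steps(pool: str) -> list:
    """Return (where, argv) pairs for moving a pool between systems."""
    return [
        ("old system", ["zpool", "export", pool]),
        ("new system", ["zpool", "import", pool]),
    ]


for where, argv in migration_steps("tank"):
    print(f"{where}: {shlex.join(argv)}")
```

To actually run a step, the argument list could be passed to `subprocess.run(argv, check=True)` on the respective machine, which normally requires root privileges.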
Scrubbing is a consistency check operation that also tries to repair corrupted data.
There is no online defragmentation in ZFS, so try to keep zpools below 70% utilization instead.
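A utilization check along these lines could look as follows. It assumes input in the tab-separated form printed by `zpool list -Hp -o name,size,allocated`; the sample text and its numbers are invented for the example, and the 70% threshold is the rule of thumb from the note above.

```python
def over_threshold(listing: str, threshold: float = 0.70) -> list:
    """Return the names of pools whose allocated/size ratio exceeds threshold."""
    flagged = []
    for line in listing.strip().splitlines():
        name, size, allocated = line.split("\t")
        if int(allocated) / int(size) > threshold:
            flagged.append(name)
    return flagged


# Made-up sample: name, size in bytes, allocated bytes.
sample = "tank\t1000\t650\nbackup\t2000\t1700\n"
print(over_threshold(sample))  # ['backup']
```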
On ZFS, changed data is written to a different location on the disk than the original, and only then is the metadata updated to point to the new location. This mechanism guarantees that the old data is safely preserved in case of a power loss or system crash that would otherwise result in loss of data.
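The write order can be shown with a toy block store. This is a conceptual sketch only: a single pointer stands in for the metadata, and the `crash_before_commit` flag is an invented way to simulate a power loss between the two steps.

```python
class CowStore:
    def __init__(self):
        self.blocks = {}     # address -> data, never overwritten in place
        self.pointer = None  # "metadata": address of the live version
        self._next = 0

    def write(self, data: bytes, crash_before_commit: bool = False):
        addr = self._next
        self._next += 1
        self.blocks[addr] = data  # step 1: new data goes to a new location
        if crash_before_commit:
            return                # simulated crash: metadata is untouched
        self.pointer = addr       # step 2: metadata switches to the new block

    def read(self) -> bytes:
        return self.blocks[self.pointer]


store = CowStore()
store.write(b"v1")
store.write(b"v2", crash_before_commit=True)
print(store.read())  # b'v1', the old version is still intact
store.write(b"v3")
print(store.read())  # b'v3'
```

An in-place overwrite interrupted at the same moment could leave a half-written block; here the reader sees either the old version or the new one.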
A snapshot records the state of the file system at the moment it is taken. A new snapshot does not require additional disk space within the pool. Once data referenced by a snapshot is modified, the snapshot starts to take up disk space, since it is now the only thing pointing to the old data.
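Why a snapshot starts at zero cost and grows later can be sketched with block maps. The `Dataset` class below is a toy model with one block per file and a single snapshot in mind; it does not account for blocks shared between several snapshots.

```python
class Dataset:
    def __init__(self):
        self.live = {}       # file name -> block id
        self.snapshots = {}  # snapshot name -> frozen copy of the block map
        self._next = 0

    def write(self, name: str):
        # Copy-on-write: every write lands in a fresh block.
        self.live[name] = self._next
        self._next += 1

    def snapshot(self, snap: str):
        # Only the pointers are copied, not the data blocks.
        self.snapshots[snap] = dict(self.live)

    def snapshot_only_blocks(self, snap: str) -> set:
        # Blocks that the live file system no longer references.
        return set(self.snapshots[snap].values()) - set(self.live.values())


ds = Dataset()
ds.write("a.txt")
ds.write("b.txt")
ds.snapshot("monday")
print(len(ds.snapshot_only_blocks("monday")))  # 0

ds.write("a.txt")  # modify a file after the snapshot
print(len(ds.snapshot_only_blocks("monday")))  # 1
```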
A clone is a writable version of a snapshot. Overwriting a block in the cloned volume or file system decrements the reference count on the previous block. The original snapshot that the clone depends on cannot be deleted while the clone exists.
The rollback command returns a dataset or a volume to a previous version. Note that by default rollback can only revert to the most recent snapshot. Rolling back to an earlier snapshot (the -r option) destroys all snapshots taken after it.
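The rollback rule can be modelled on an ordered list of snapshot names. The function is a made-up illustration; its `destroy_later` flag plays the role of the `-r` option of `zfs rollback`.

```python
def rollback(snapshots: list, target: str, destroy_later: bool = False) -> list:
    """`snapshots` is ordered oldest to newest; return the surviving list."""
    index = snapshots.index(target)
    later = snapshots[index + 1:]
    if later and not destroy_later:
        raise ValueError(f"more recent snapshots exist: {later}")
    return snapshots[: index + 1]


snaps = ["mon", "tue", "wed"]

print(rollback(snaps, "wed"))                      # ['mon', 'tue', 'wed']
print(rollback(snaps, "mon", destroy_later=True))  # ['mon']
```

Calling `rollback(snaps, "mon")` without the flag raises an error, which corresponds to the command refusing to discard newer snapshots unless asked to.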
The promote command makes a clone independent of the snapshot it was created from, so that the clone can replace the original volume or file system.