ZFS Concept

Pool

ZFS pool (Zpool) is a collection of one or more virtual devices (vdevs), vdev is a group of physical disks. They have following facts.

The redundancy level for vdevs can be a single drive, mirror, RAID-Z1, RAID-Z2, and RAID-Z3.
After creating a Zpool, it may not be possible to add additional disks to the vdev except mirrors.
Add additional vdevs to expand the Zpool is possible.
The storage space allocated to the Zpool cannot be decreased.
The drives in vdevs that are parts of the Zpool can be exchanged.

If there is a need to change the layout of the Zpool, the data should be backed up and the Zpool destroyed.

Datasets

Datasets is the space emulating a regular file system.

Datasets can be nested, which can possess different settings for snapshots, compression, deduplication and so on.

Volumes

Volumes (zvols) is the space emulating a block devices.

Data Integrity

No overwritten

The copy-on-write mechanism is to keep old data on the disk.

Checksum

Checksum information is written when data is written into disk, then verified when read data from disk. When checksum mismatch detected, use redundant data is used for correction.

Different checksum algorithms are used

Fletcher-based checksum
SHA-256 hash

ZFS RAID

Single - Zpool has a vdev consisting of a single disk, similar to RAID0.
Mirror – similar to RAID1.
RAIDZ1 – similar to RAID5 but without the write hole issue.
RAIDZ2 – similar to RAID6, with 2 disks redundancy.
RAIDZ3 – similar to RAID6, with 3 disks redundancy.

RAID write hole in a RAID5/RAID1 occurs when one of the member disks doesn't match the others and by the nature of single-redundant RAID5/RAID1 it is impossible to tell which of the disks is bad.

Errors

Checksum mismatch

ZFS is a self-healing system. If mismatched checksum is detected, ZFS tries to retrieve the data from other disks. If data correct, the system will amend the incorrect data and checksum.

Disk failure

If a disk in a Zpool fails, the pool is set to the degraded state, then data on the failed device is calculated and written to first the spare disk replaces the failed one. This is called resilvering. Once the restoration operation is complete, the status of the Zpool changes back to online. In case of when multiple disks have failed and if there are not enough redundant devices, the Zpool changes its state into unavailable.

Migrate to different system

In old system, export zpool, which unmounts Zpool’s datasets or zvols.

In new system, import zpool, which mount Zpool's datasets or zvols.

Maintenance

Scrubbing

The scrubbing is consistency check operation, and try to repair corrupted data.

No defragmentation

There is no online defragmentation in ZFS, so try to keep zpools below 70% utilization instead.

Copy-on-write

On ZFS, the data changes are stored on a different location than the original location on a disk and then the metadata is updated in that place on the disk. This mechanism guarantees that the old data is safely preserved in case of power loss or system crash that in other cases would result in loss of data.

Snapshots

The snapshot contains information about the original version of the file system to be retained. Snapshots do not require additional disk space within the pool. Once the data rendered in a snapshot is modified, the snapshot will take the disk space since it will now be pointing to the old data.

Clones

The clone is a writeable version of a snapshot. Overwriting the blocks in the cloned volume or file system results in decrementing the reference count on the previous block. The original snapshot that the clone is depending on, can not be deleted.

Rollback

Rollback command is to go back to a previous version of a dataset or a volume. Note that the rollback command cannot revert changes from other snapshots than the most recent one. If to do so, all intermediate snapshots will be automatically destroyed.

Promote

Promote command is to replace an existing volume with its clone.

References

ZFS Essentials – What is pooled storage?
ZFS Essentials – Copy-on-write & snapshots
ZFS Essentials – Data integrity & RAIDZ
RAID Recovery Guide

M	T	W	T	F	S	S
				1	2	3
4	5	6	7	8	9	10
11	12	13	14	15	16	17
18	19	20	21	22	23	24
25	26	27	28	29	30	31

ZFS Concept

ZFS Concept

Pool

Datasets

Volumes

Data Integrity

No overwritten

Checksum

ZFS RAID

Errors

Checksum mismatch

Disk failure

Migrate to different system

Maintenance

Scrubbing

No defragmentation

Copy-on-write

Snapshots

Clones

Rollback

Promote

References

Leave a Reply Cancel Reply