Tag: raid

Btrfs Basic

Status

btrfs device stats /app
btrfs fi show /app

Convert raid

Convert to single and remove one disk

btrfs balance start -f -sconvert=single -mconvert=single -dconvert=single /app
btrfs device remove /dev/bcache0 /app

Add disk and convert to raid1

btrfs device add -f /dev/bcache0 /app
btrfs balance start -dconvert=raid1 -mconvert=raid1 /app

Check raid level

# btrfs fi df /app
Data, RAID1: total=2.69GiB, used=2.51GiB
System, RAID1: total=32.00MiB, used=16.00KiB
Metadata, RAID1: total=317.94MiB, used=239.55MiB
GlobalReserve, single: total=12.03MiB, used=0.00B
#

If the output contains multiple block group profiles, which can happen when a profile conversion using balance filters was interrupted, it looks like this:

Data, RAID1: total=2.03GiB, used=1.86GiB
Data, single: total=704.00MiB, used=665.56MiB
System, RAID1: total=32.00MiB, used=16.00KiB
Metadata, RAID1: total=288.00MiB, used=239.56MiB
GlobalReserve, single: total=11.94MiB, used=0.00B
WARNING: Multiple block group profiles detected, see 'man btrfs(5)'.
WARNING:   Data: single, raid1

Perform the balance again to finish the conversion

# btrfs balance start -dconvert=raid1 -mconvert=raid1 /app
Done, had to relocate 12 out of 12 chunks

Scrub

btrfs scrub start /app
btrfs scrub status /app

Error

To correct the errors, first find the corrupted files, then restore them from backup or delete them:

dmesg -T | grep BTRFS | grep 'check error' | grep path

Then reset the error counters to zero

btrfs device stats -z /app

Then scrub again.

References

BTRFS-MAN(5)

ZFS Concept

Pool

A ZFS pool (Zpool) is a collection of one or more virtual devices (vdevs); a vdev is a group of physical disks. Zpools have the following properties.

  • The redundancy level for vdevs can be a single drive, mirror, RAID-Z1, RAID-Z2, and RAID-Z3.
  • After creating a Zpool, additional disks generally cannot be added to an existing vdev, except to mirrors.
  • Additional vdevs can be added to expand the Zpool.
  • The storage space allocated to the Zpool cannot be decreased.
  • The drives in vdevs that are part of the Zpool can be replaced.

If there is a need to change the layout of the Zpool, the data should be backed up, the Zpool destroyed, and a new Zpool created with the desired layout.
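
For example, a mirrored pool could be created and later expanded by adding another mirror vdev. This is only a sketch; the pool name tank and the device names are placeholders.

zpool create tank mirror /dev/sda /dev/sdb
zpool add tank mirror /dev/sdc /dev/sdd
zpool status tank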

Datasets

A dataset is a space that emulates a regular file system.

Datasets can be nested, and each dataset can have its own settings for snapshots, compression, deduplication, and so on.
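
A minimal sketch of nested datasets with per-dataset settings, assuming a pool named tank (the names are placeholders):

zfs create tank/data
zfs create tank/data/projects
zfs set compression=lz4 tank/data/projects
zfs set dedup=on tank/data/projects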

Volumes

A volume (zvol) is a space that emulates a block device.
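
For example, a 10 GiB zvol could be created like this (the pool name tank is a placeholder); it appears as a block device under /dev/zvol/:

zfs create -V 10G tank/vol1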

Data Integrity

No overwriting

The copy-on-write mechanism keeps old data on disk instead of overwriting it in place.

Checksum

A checksum is written when data is written to disk and verified when the data is read back. When a checksum mismatch is detected, redundant data is used for correction.

Several checksum algorithms are available:

  • Fletcher-based checksum
  • SHA-256 hash
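
The default is a Fletcher-based checksum (fletcher4); the algorithm can be changed per dataset, for example (the pool and dataset names are placeholders):

zfs set checksum=sha256 tank/data
zfs get checksum tank/data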

ZFS RAID

  • Single – the Zpool consists of single-disk vdevs with no redundancy; striping across them is similar to RAID0.
  • Mirror – similar to RAID1.
  • RAIDZ1 – similar to RAID5 but without the write hole issue.
  • RAIDZ2 – similar to RAID6, with 2 disks redundancy.
  • RAIDZ3 – like RAIDZ2 but with 3 disks of redundancy.
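
For example, a pool could be created with each of these levels (the pool name tank and the device names are placeholders):

zpool create tank /dev/sda
zpool create tank mirror /dev/sda /dev/sdb
zpool create tank raidz1 /dev/sda /dev/sdb /dev/sdc
zpool create tank raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd
zpool create tank raidz3 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde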

The RAID write hole in RAID5/RAID1 occurs when a write is interrupted (for example by a power loss), leaving one of the member disks inconsistent with the others; by the nature of single-redundancy RAID5/RAID1 it is impossible to tell which of the disks is bad.

Errors

Checksum mismatch

ZFS is a self-healing system. If a mismatched checksum is detected, ZFS tries to retrieve a correct copy of the data from the other disks. If that copy is correct, the system repairs the incorrect data and checksum.
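
Per-device error counters, including checksum errors, can be inspected and reset (the pool name is a placeholder):

zpool status -v tank
zpool clear tank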

Disk failure

If a disk in a Zpool fails, the pool is set to the degraded state. The data that was on the failed device is then reconstructed and written to the spare disk that replaces the failed one; this is called resilvering. Once the restoration operation is complete, the status of the Zpool changes back to online. If multiple disks have failed and there is not enough redundancy left, the Zpool changes its state to unavailable.
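
A typical recovery, assuming a pool named tank where the failed /dev/sdb is replaced by /dev/sdc (names are placeholders):

zpool replace tank /dev/sdb /dev/sdc
zpool status tank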

Migrate to different system

On the old system, export the Zpool, which unmounts its datasets and zvols.

On the new system, import the Zpool, which mounts its datasets and zvols.
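
A minimal sketch, assuming a pool named tank. On the old system:

zpool export tank

On the new system, list the importable pools and then import:

zpool import
zpool import tank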

Maintenance

Scrubbing

Scrubbing is a consistency-check operation that also tries to repair corrupted data.
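
For example (the pool name is a placeholder):

zpool scrub tank
zpool status tank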

No defragmentation

There is no online defragmentation in ZFS, so keep Zpools below roughly 70% utilization instead.
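
Utilization can be checked with zpool list, where the CAP column shows the percentage of the pool that is allocated (the pool name is a placeholder):

zpool list tank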

Copy-on-write

On ZFS, changed data is written to a location on disk different from the original, and only then is the metadata updated to point to the new location. This mechanism guarantees that the old data is safely preserved in case of a power loss or system crash that would otherwise result in data loss.

Snapshots

A snapshot retains a reference to the original version of the file system. Snapshots do not require additional disk space within the pool when created. Once data referenced by a snapshot is modified, the snapshot starts to consume disk space, since it still points to the old data.
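
For example, creating and listing a snapshot (the dataset and snapshot names are placeholders):

zfs snapshot tank/data@before-upgrade
zfs list -t snapshot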

Clones

A clone is a writeable version of a snapshot. Overwriting blocks in the cloned volume or file system decrements the reference count on the previous blocks. The original snapshot that the clone depends on cannot be deleted.
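
For example, cloning the snapshot created above (names are placeholders):

zfs clone tank/data@before-upgrade tank/data-clone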

Rollback

The rollback command reverts a dataset or a volume to a previous snapshot. Note that rollback can only revert to the most recent snapshot; to roll back to an older one, all intermediate snapshots must be destroyed, which the -r option does automatically.
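
For example, rolling back to the most recent snapshot, or to an older one with -r, which destroys the intermediate snapshots (names are placeholders):

zfs rollback tank/data@before-upgrade
zfs rollback -r tank/data@older-snapshot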

Promote

The promote command swaps the dependency between a clone and its origin, so that the clone can replace the existing file system or volume it was cloned from.
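
For example, promoting the clone created above so that it no longer depends on its origin snapshot (names are placeholders):

zfs promote tank/data-clone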

References

ZFS Essentials – What is pooled storage?
ZFS Essentials – Copy-on-write & snapshots
ZFS Essentials – Data integrity & RAIDZ
RAID Recovery Guide