Blog

ZFS Concept

Pool

A ZFS pool (Zpool) is a collection of one or more virtual devices (vdevs); a vdev is a group of physical disks. Zpools have the following characteristics.

  • The redundancy level of a vdev can be a single drive, mirror, RAID-Z1, RAID-Z2, or RAID-Z3.
  • After creating a Zpool, it is generally not possible to add disks to an existing vdev, except for mirrors.
  • Adding additional vdevs to expand the Zpool is possible.
  • The storage space allocated to the Zpool cannot be decreased.
  • The drives in vdevs that are part of the Zpool can be replaced.

If there is a need to change the layout of the Zpool, the data should be backed up and the Zpool destroyed.
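
As a quick sketch (the pool name tank and the device names below are just examples), a pool can be created from a mirror vdev and later expanded by adding another vdev:

# Create a pool with one mirror vdev
zpool create tank mirror /dev/sda /dev/sdb
# Expand the pool by adding a second mirror vdev
zpool add tank mirror /dev/sdc /dev/sdd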

Datasets

A dataset is a space that emulates a regular file system.

Datasets can be nested, and each dataset can have its own settings for snapshots, compression, deduplication and so on.
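
For example (the dataset names are placeholders), nested datasets can be created with their own property settings:

zfs create tank/data
zfs set compression=lz4 tank/data
zfs create -o compression=off tank/data/archive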

Volumes

A volume (zvol) is a space that emulates a block device.
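
A minimal sketch (pool and volume names are placeholders); the zvol shows up as a block device under /dev/zvol/:

zfs create -V 10G tank/vol1
ls -l /dev/zvol/tank/vol1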

Data Integrity

No overwriting

The copy-on-write mechanism keeps old data on the disk instead of overwriting it in place.

Checksum

Checksum information is written when data is written to disk, then verified when the data is read back. When a checksum mismatch is detected, redundant data is used for correction.

Different checksum algorithms can be used:

  • Fletcher-based checksum
  • SHA-256 hash
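
The checksum algorithm is a per-dataset property; as a sketch (the dataset name is a placeholder):

zfs get checksum tank/data
zfs set checksum=sha256 tank/data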

ZFS RAID

  • Single – the Zpool has a vdev consisting of a single disk, similar to RAID0.
  • Mirror – similar to RAID1.
  • RAIDZ1 – similar to RAID5, but without the write-hole issue.
  • RAIDZ2 – similar to RAID6, with 2 disks of redundancy.
  • RAIDZ3 – similar to RAID6, but with 3 disks of redundancy.

The RAID write hole in RAID5/RAID1 occurs when one of the member disks does not match the others; by the nature of single-redundancy RAID5/RAID1, it is impossible to tell which of the disks is bad.

Errors

Checksum mismatch

ZFS is a self-healing system. If a mismatched checksum is detected, ZFS tries to retrieve the data from other disks. If that data is correct, the system repairs the incorrect data and checksum.

Disk failure

If a disk in a Zpool fails, the pool is set to the degraded state. The data that was on the failed device is then reconstructed from the redundant copies and written to a spare disk, which replaces the failed one. This is called resilvering. Once the restoration operation is complete, the status of the Zpool changes back to online. If multiple disks fail and there are not enough redundant devices, the Zpool changes its state to unavailable.

Migrate to different system

On the old system, export the Zpool, which unmounts the Zpool's datasets and zvols.

On the new system, import the Zpool, which mounts the Zpool's datasets and zvols.
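
A minimal sketch (tank is a placeholder pool name):

# On the old system
zpool export tank
# On the new system
zpool import tank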

Maintenance

Scrubbing

Scrubbing is a consistency-check operation that also tries to repair corrupted data.
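
For example (tank is a placeholder pool name):

zpool scrub tank
zpool status tank     # shows scrub progress and any repaired errors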

No defragmentation

There is no online defragmentation in ZFS, so try to keep zpools below 70% utilization instead.

Copy-on-write

On ZFS, changed data is written to a different location on the disk rather than overwriting the original, and only then is the metadata updated to point to the new location. This mechanism guarantees that the old data is safely preserved in case of a power loss or system crash that would otherwise result in data loss.

Snapshots

A snapshot contains the information needed to retain the original version of the file system. Snapshots initially do not require additional disk space within the pool. Once data captured in a snapshot is modified, the snapshot starts to consume disk space, because it still points to the old data.

Clones

A clone is a writable version of a snapshot. Overwriting blocks in the cloned volume or file system decrements the reference count on the previous blocks. The original snapshot that the clone depends on cannot be deleted while the clone exists.

Rollback

The rollback command reverts a dataset or a volume to a previous snapshot. Note that the rollback command can only revert to the most recent snapshot; to roll back to an earlier one, all intermediate snapshots must be destroyed, which can be done automatically.
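
A minimal sketch (dataset and snapshot names are placeholders):

zfs snapshot tank/data@before-change
# ... make some changes ...
zfs rollback tank/data@before-change
# Roll back past newer snapshots, destroying the intermediate ones
zfs rollback -r tank/data@older-snapshot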

Promote

The promote command is used to replace an existing dataset or volume with its clone, making the clone independent of its origin snapshot.
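
For example (names are placeholders), a clone created from a snapshot can be promoted so it no longer depends on its origin:

zfs clone tank/data@before-change tank/data-new
zfs promote tank/data-new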

References

ZFS Essentials – What is pooled storage?
ZFS Essentials – Copy-on-write & snapshots
ZFS Essentials – Data integrity & RAIDZ
RAID Recovery Guide

Disable Copy-On-Write on BTRFS

The issue with COW (copy-on-write) is fragmentation, because it always writes to new device blocks. This is fine for SSDs, but not good on traditional disks. Even on SSDs, if the block size is big, the amount of data written can be much larger than the actual updated data. Because of this issue, it is recommended to disable copy-on-write for database and VM image filesystems.

Methods

Mounting

Disable COW by mounting the filesystem with the nodatacow option (see the example after the list below).

The following facts should be considered:

  • This implies nodatasum as well
  • COW may still happen if a snapshot is taken
  • COW will still be maintained for existing files
  • COW status can be modified only for empty or newly created files.
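
A minimal sketch of mounting with nodatacow (the device and mount point are placeholders); the same option can be used in /etc/fstab:

mount -o nodatacow /dev/sdb1 /var/lib/mysql
# or in /etc/fstab:
# /dev/sdb1   /var/lib/mysql   btrfs   nodatacow,noatime   0 0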

File Attribute

For an empty file, add the NOCOW file attribute (use the chattr utility with +C):

touch file1
chattr +C file1

For a directory with the NOCOW attribute set, new files created in it will inherit this attribute.

chattr +C directory1

For existing files, copy the original data into a pre-created NOCOW file, delete the original and rename the new file back.

touch vm-image.raw
chattr +C vm-image.raw
fallocate -l10g vm-image.raw
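
Continuing the sketch, assuming the original image was first renamed to vm-image.raw.old (a hypothetical name), the data can then be copied into the NOCOW file and the old copy removed:

dd if=vm-image.raw.old of=vm-image.raw bs=1M conv=notrunc
rm vm-image.raw.old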

Subvolume (Untested)

A subvolume cannot have NOCOW set separately; this is the official answer.

However, newly created files inherit attributes from their directory. If a subvolume is mounted separately on a directory that has the NOCOW attribute, then files newly created in it will inherit the NOCOW attribute as well, regardless of the original volume.

Create directory

mkdir /var/lib/nocow
chattr +C /var/lib/nocow

Create subvolume

mount -o autodefrag,compress=lzo,noatime,space_cache /dev/mapper/zpool1 /mnt/zpool1
btrfs subvolume create /mnt/zpool1/nocow

Mount subvolume

/dev/mapper/zpool1     /var/lib/nocow  btrfs       rw,noatime,compress=lzo,space_cache,autodefrag,subvol=nocow  0 0

Drawback

No checksum, no integrity.

Nodatacow bypasses the very mechanisms that are meant to provide consistency in the filesystem. The CoW operations are achieved by constructing a completely new metadata tree containing both changes (the references to the data and the csum metadata), and then atomically changing the superblock to point to the new tree.

With nodatacow, the data and its checksum would have to be written to the physical medium as two separate writes. An I/O error could leave the data and the checksum mismatched, and file corruption could result.

References

BTRFS FAQ
Setting up a btrfs subvolume with noCOW

ZFS cache and log

There are two kinds of cache, read cache and write cache.

Read cache

The read caches are called the ARC and the L2ARC.

ARC (Adaptive Replacement Cache)

The ARC lives in memory, caching the information that is likely to be required in the near future while discarding the data that will be needed furthest ahead in time.

Its size can be tuned using kernel module parameters such as zfs_arc_max.
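
For example, on Linux the limit can be set via a module option or at runtime through sysfs; the 8 GiB value below is only an illustration:

# /etc/modprobe.d/zfs.conf
options zfs zfs_arc_max=8589934592
# or at runtime
echo 8589934592 > /sys/module/zfs/parameters/zfs_arc_max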

L2ARC (Level 2 ARC)

The L2ARC lives on a cache device and is an extension of the ARC. It can be created using the following command:

zpool add tank cache ada3

Note: tank is the pool name, ada3 is the block device used for caching

Write cache

The write cache is called the ZIL (ZFS Intent Log).

Asynchronous

By default, ZFS caches write data in memory before writing it to disk; this is called asynchronous mode.

Synchronous

Synchronous mode makes sure data is written to disk before continuing; it can be set using the following command:

zfs set sync=always mypool/dataset1

ZFS Intent Log (ZIL)

The ZIL is a temporary space where data is stored before being written to the main disks, which can speed up write operations. A write is considered complete once the data has been written to the ZIL device. A dedicated ZIL device is called a SLOG (Separate Intent Log) device and can be defined as follows:

zpool add tank log ada3

Note: tank is the pool name, ada3 is the block device used for slog

If a faulty SLOG device is a concern, it can be mirrored too.

zpool add tank log mirror ada3 ada4

References

Configuring ZFS Cache for High Speed IO
ZFS Performance with Databases (Cached)

Snapshot and Copy on Write

There are two types of snapshots, and they work in different ways.

Keep original data

The original data is never written to; all new data goes into a delta file.

VMware

All new data goes into a delta file; the original disk file is not changed. This is very useful, especially in a VDI environment, where all VDI servers are based on the same image and there is no impact on the original disk file.

When deleting a snapshot, the snapshot files are consolidated and written to the parent snapshot's disk. If the parent is the base disk, all the data from the delta disk is merged into the virtual machine's base disk.

QEMU / KVM: COW mode

COW mode is available for some virtual machine disk formats, such as QCOW2. When using COW mode, no changes are applied to the disk image; all changes are recorded in a separate file, preserving the original image. Several COW files can point to the same image in order to test several configurations simultaneously without jeopardizing the base system.

QEMU / KVM also allows incorporating the changes from a COW file back into the original image.
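
A minimal sketch with qemu-img (file names are placeholders): create an overlay backed by the original image, then merge the changes back later:

qemu-img create -f qcow2 -b base.qcow2 -F qcow2 overlay.qcow2
qemu-img commit overlay.qcow2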

Overwrite original disk

The latest data is written to the original disk, and the original data is moved to the delta disk.

RedHat LVM snapshot

The data in the snapshot volume is the original data.

ZFS and btrfs

Thanks to the native copy-on-write feature, the live reference structure always points to the new data, while the old data stays referenced by the old (snapshot) reference structure.

Compare

  • Keep original data
    Pros: supports multiple children without too much performance overhead.
    Cons: deleting a snapshot takes time; when the disk is full, no more writes can be done.
  • Overwrite original disk
    Pros: less overhead (only when writing data to a new location for the first time); deleting a snapshot is fast.
    Cons: reverting a snapshot takes time.
  • Native COW
    Pros: no overhead; dropping a snapshot is fast; reverting a snapshot is fast; no impact on service when a snapshot fills up (though the snapshot becomes corrupt).
    Cons: unable to delete files when the disk is full; the disk becomes fragmented easily; unable to cache disk write operations.

References

Deleting Snapshots
QEMU / KVM: Using the Copy-On-Write mode
Why would I want to disable Copy-On-Write while creating QEMU Images?

Error of txg_sync blocked for more than 120 seconds

The following error was appearing on my dmesg monitoring screen.

txg_sync blocked for more than 120 seconds --> excessive load

If I'm not wrong, it could be caused by slow hard disk speed: the TrueNAS ZFS cache is about 61 GB, which can take a long time to flush back to the hard disks.

Like other filesystems, ZFS has write-back caching (aka write-behind caching), which flushes data back to the hard disk at a specific interval. ZFS has synchronous and asynchronous modes, which are a bit different from read-only, write-through and write-back modes.

Beyond the above, ZFS behaves differently because of copy-on-write (COW), as below.

  • It always writes to a new block due to copy-on-write
  • Big files with random writes, such as VM disk files, can become fragmented
  • The number of write operations cannot be reduced even when the same block is written repeatedly

Therefore, copy-on-write should be disabled for VM images; but if so, the snapshot function could be lost.

Reference

Read-Through, Write-Through, Write-Behind Caching and Refresh-Ahead

Ubuntu grub-efi-amd64-signed error after do-release-upgrade

The following error occurred whenever apt upgrade was run after performing do-release-upgrade:

dpkg: error processing package grub-efi-amd64-signed (--configure):
installed grub-efi-amd64-signed package post-installation script subprocess returned error exit status 32

Solution

Reinstall all GRUB packages using the following commands:

sudo apt-get purge grub\*
sudo apt-get install grub-efi
sudo apt-get autoremove
sudo update-grub

Options to restrict commands to one filesystem

There are quite a number of tasks that you may want to restrict to a single filesystem. This is important during troubleshooting, especially for the root directory (/).

find

To restrict the find command to entries within one filesystem, use the -xdev option:

find /usr -xdev ...

du

To restrict the du command to calculating usage for one filesystem only, use the -x option:

du -cshx /

tar

To restrict the tar command to archiving files within one filesystem only, use the --one-file-system option:

tar --one-file-system -czvf /tmp/root.tgz /

Increase upload file size limit for WordPress and NGINX


There are various ways to do this, but the way that worked was updating .htaccess for WordPress and the NGINX configuration file.

Issue

First, I tried changing functions.php in the theme, but with no luck. Then I updated the .htaccess file, and it worked.

Then the client got the error “Request Entity Too Large” (413). This error is reported by NGINX.

WordPress

Add the following lines to the .htaccess file in the html directory:

php_value upload_max_filesize 64M
php_value post_max_size 64M
php_value max_execution_time 300
php_value max_input_time 300

Then the upload page in WordPress should show the new limit, as below:

Maximum upload file size: 64 MB.

Alternative

These are PHP options, which can also be applied in php.ini as below:

upload_max_filesize = 64M
post_max_size = 64M
max_execution_time = 300

NGINX

Add the following line to the http, server or location context in nginx.conf or conf.d/default.conf:

client_max_body_size 64M;

Then reload the NGINX configuration.

# /usr/local/nginx/sbin/nginx -s reload

This will fix the client error “Request Entity Too Large” (413).

Remove Ubuntu ZFS snapshots

There are a lot of snapshots when using ZFS on Ubuntu.

Issue

When trying to do a release upgrade, the following error occurred:

# do-release-upgrade
...
...
Not enough free disk space 

The upgrade has aborted. The upgrade needs a total of 256 M free 
space on disk '/boot'. Please free at least an additional 91.4 M of 
disk space on '/boot'. You can remove old kernels using 'sudo apt 
autoremove' and you could also set COMPRESS=xz in 
/etc/initramfs-tools/initramfs.conf to reduce the size of your 
initramfs. 
...

This error message had occurred many times before, but those systems had a very small /boot partition or kept many old kernels. In the first case, a complete repartitioning and moving the root filesystem are required.

Space on /boot

Examining the disk space for bpool, I found that ZFS reported 675 MB used in bpool, but the actual usage is only 242 MB.

root@ubuntu:~# zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
bpool   960M   675M   285M        -         -    30%    70%  1.00x    ONLINE  -
rpool  17.5G  7.99G  9.51G        -         -    21%    45%  1.00x    ONLINE  -
root@ubuntu:~# zfs list bpool
NAME    USED  AVAIL     REFER  MOUNTPOINT
bpool   675M   157M       96K  /boot
root@ubuntu:~# du -cshx /boot
242M    /boot
242M    total
root@ubuntu:~# 

Then I found many snapshots in both the boot pool and the data pool.

root@ubuntu:~# zfs list -t snapshot | head
NAME                                                               USED  AVAIL     REFER  MOUNTPOINT
bpool/BOOT/ubuntu_e8m8h0@autozsys_ywm1ok                             0B      -      238M  -
bpool/BOOT/ubuntu_e8m8h0@autozsys_ms74md                             0B      -      238M  -
bpool/BOOT/ubuntu_e8m8h0@autozsys_ugu9z7                            80K      -      242M  -
bpool/BOOT/ubuntu_e8m8h0@autozsys_r3xqau                            72K      -      242M  -
bpool/BOOT/ubuntu_e8m8h0@autozsys_nkagbh                             0B      -      242M  -
bpool/BOOT/ubuntu_e8m8h0@autozsys_xdbwsy                             0B      -      242M  -
bpool/BOOT/ubuntu_e8m8h0@autozsys_zrt7vi                            72K      -      242M  -
bpool/BOOT/ubuntu_e8m8h0@autozsys_jbmnwk                            72K      -      242M  -
bpool/BOOT/ubuntu_e8m8h0@autozsys_0e5p2e                            64K      -      242M  -
root@ubuntu:~# 
root@ubuntu:~# zfs list -t snapshot | wc
    301    1505   27701

Too many! Not sure how many snapshots Ubuntu likes to create.

Removing snapshots

List all snapshots for /boot

root@ubuntu:~# df /boot
Filesystem               1K-blocks   Used Available Use% Mounted on
bpool/BOOT/ubuntu_e8m8h0    408192 247808    160384  61% /boot
root@ubuntu:~# zfs list -H -o name -t snapshot bpool/BOOT/ubuntu_e8m8h0
bpool/BOOT/ubuntu_e8m8h0@autozsys_ywm1ok
bpool/BOOT/ubuntu_e8m8h0@autozsys_ms74md
bpool/BOOT/ubuntu_e8m8h0@autozsys_ugu9z7
bpool/BOOT/ubuntu_e8m8h0@autozsys_r3xqau
bpool/BOOT/ubuntu_e8m8h0@autozsys_nkagbh
bpool/BOOT/ubuntu_e8m8h0@autozsys_xdbwsy
bpool/BOOT/ubuntu_e8m8h0@autozsys_zrt7vi
bpool/BOOT/ubuntu_e8m8h0@autozsys_jbmnwk
bpool/BOOT/ubuntu_e8m8h0@autozsys_0e5p2e
bpool/BOOT/ubuntu_e8m8h0@autozsys_b17dwn
bpool/BOOT/ubuntu_e8m8h0@autozsys_uad1rb
bpool/BOOT/ubuntu_e8m8h0@autozsys_mxhvc9
bpool/BOOT/ubuntu_e8m8h0@autozsys_9athz8
bpool/BOOT/ubuntu_e8m8h0@autozsys_61umv1
bpool/BOOT/ubuntu_e8m8h0@autozsys_1q65cz
root@ubuntu:~# 

Then remove them

zfs list -H -o name -t snapshot bpool/BOOT/ubuntu_e8m8h0 | xargs -n 1 zfs destroy

Now it is OK to upgrade.

root@ubuntu:~# zfs list -o space bpool
NAME   AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
bpool   589M   243M        0B     96K             0B       243M
root@ubuntu:~# 

Firewalld Basic

Concept

Some basic firewalld concepts needed to understand the commands:

  • NIC
    Different NICs can have different zones assigned using the nmcli command; if not specified, the default zone is used.
  • Zone
    By default, an interface belongs to the zone configured as the default zone; the default zone can be changed temporarily with the firewall-cmd command.
    To permanently assign an interface to a zone other than the default, the nmcli command is required.
  • Service
  • Port

Start/Stop

# systemctl start firewalld
# systemctl enable firewalld

Default zone

The default zone, public, is used when the --zone option is not specified on the command line.

Display the default zone

# firewall-cmd --get-default-zone
public

Display current settings

# firewall-cmd --list-all
public (default, active)
  interfaces: eno16777736
  sources:
  services: dhcpv6-client ssh
  ports:
  masquerade: no
  forward-ports:
  icmp-blocks:
  rich rules:

Display all zones defined by default

# firewall-cmd --list-all-zones
block
  interfaces:
  sources:
  services:
  ports:
  masquerade: no
  forward-ports:
  icmp-blocks:
  rich rules:
  .....
  .....

Display allowed services on a specific zone

# firewall-cmd --list-service --zone=external
ssh

Change default zone

# firewall-cmd --set-default-zone=external
success

Change zone for an interface

Note: the change is not permanent with "change-interface", even if the "--permanent" option is added.

# firewall-cmd --change-interface=eth1 --zone=external
success
# firewall-cmd --list-all --zone=external
external (active)
  interfaces: eth1
  sources:
  services: ssh
  ports:
  masquerade: yes
  forward-ports:
  icmp-blocks:
  rich rules:

To change it permanently, use nmcli as follows:

# nmcli c mod eth1 connection.zone external
# firewall-cmd --get-active-zone
external
  interfaces: eth1
public
  interfaces: eth0

Services

Display services

# firewall-cmd --get-services
amanda-client bacula bacula-client dhcp dhcpv6 dhcpv6-client dns ftp high-availability http https imaps ipp ipp-client ipsec kerberos kpasswd ldap ldaps libvirt libvirt-tls mdns mountd ms-wbt mysql nfs ntp openvpn pmcd pmproxy pmwebapi pmwebapis pop3s postgresql proxy-dhcp radius rpc-bind samba samba-client smtp ssh telnet tftp tftp-client transmission-client vnc-server wbem-https

Service definition files are XML files in /usr/lib/firewalld/services

# ls /usr/lib/firewalld/services
amanda-client.xml      ipp-client.xml   mysql.xml       rpc-bind.xml
bacula-client.xml      ipp.xml          nfs.xml         samba-client.xml
bacula.xml             ipsec.xml        ntp.xml         samba.xml
dhcpv6-client.xml      kerberos.xml     openvpn.xml     smtp.xml
dhcpv6.xml             kpasswd.xml      pmcd.xml        ssh.xml
dhcp.xml               ldaps.xml        pmproxy.xml     telnet.xml
dns.xml                ldap.xml         pmwebapis.xml   tftp-client.xml
ftp.xml                libvirt-tls.xml  pmwebapi.xml    tftp.xml
high-availability.xml  libvirt.xml      pop3s.xml       transmission-client.xml
https.xml              mdns.xml         postgresql.xml  vnc-server.xml
http.xml               mountd.xml       proxy-dhcp.xml  wbem-https.xml
imaps.xml              ms-wbt.xml       radius.xml

Add or remove services temporarily.

# firewall-cmd --add-service=http
success
# firewall-cmd --list-service
dhcpv6-client http ssh
...
...
# firewall-cmd --remove-service=http
success
# firewall-cmd --list-service
dhcpv6-client ssh

Add or remove services permanently

Note: reloading firewalld is required to enable the change.

# firewall-cmd --add-service=http --permanent
success
# firewall-cmd --reload
success
# firewall-cmd --list-service
dhcpv6-client http ssh

Ports

Add or remove ports temporarily.

# firewall-cmd --add-port=465/tcp
success
# firewall-cmd --list-port
465/tcp
# firewall-cmd --remove-port=465/tcp
success
# firewall-cmd --list-port

Add or remove ports permanently

# firewall-cmd --add-port=465/tcp --permanent
success
# firewall-cmd --reload
success
# firewall-cmd --list-port
465/tcp

ICMP

Add or remove ICMP types.

# firewall-cmd --add-icmp-block=echo-request
success
# firewall-cmd --list-icmp-blocks
echo-request
# firewall-cmd --remove-icmp-block=echo-request
success
# firewall-cmd --list-icmp-blocks

Display ICMP types

# firewall-cmd --get-icmptypes
destination-unreachable echo-reply echo-request parameter-problem redirect
router-advertisement router-solicitation source-quench time-exceeded 

References

Firewalld : Basic Operation