Category: bridge

Firewalld conflict between Docker and KVM

After Docker is installed, VMs on the KVM bridge network can no longer reach anything on the network.

Identify

To confirm that the issue comes from the firewall and is triggered by docker, the following facts were collected.

  • After the server is rebooted, the VM can access the network, and firewalld can be restarted without issue
  • After the docker service is started, the VM can no longer access the network
  • After firewalld is stopped, the VM can access the network again, but docker cannot start containers because iptables is not accessible

Issue

No matter how the iptables rules were changed, even accepting all traffic from everywhere, the VM stayed isolated.

Commands used

The following commands were used for troubleshooting.

Firewalld

In fact, firewall-cmd shows no chains, rules, or passthroughs. However, after firewalld is stopped, the iptables rules become empty.

systemctl restart firewalld
firewall-cmd --list-all
firewall-cmd --permanent --direct --passthrough ipv4 -I FORWARD -i bridge0 -j ACCEPT
firewall-cmd --permanent --direct --passthrough ipv4 -I FORWARD -o bridge0 -j ACCEPT
firewall-cmd --reload

firewall-cmd --permanent --direct --get-all-chains
firewall-cmd --permanent --direct --get-all-rules
firewall-cmd --permanent --direct --get-all-passthroughs
firewall-cmd --permanent --direct --remove-passthrough ipv4 -I FORWARD -o bridge0 -j ACCEPT

firewall-cmd --get-default-zone
firewall-cmd --get-active-zones
firewall-cmd --get-zones
firewall-cmd --get-services
firewall-cmd --list-all-zones

iptables

iptables -L -v
iptables -L -v FORWARD
iptables -I FORWARD -i br0 -o br0 -j ACCEPT
iptables -I FORWARD -j ACCEPT
iptables -I FORWARD 1 -j ACCEPT
iptables -D FORWARD 1
iptables-save
iptables-restore

others

The following commands were used to collect information and compare the state before and after.

brctl show
ip a
netstat -rn

Potential issues

The following possibilities may have caused the issue or misled the troubleshooting.

  • iptables might not be the firewall actually in use on the system, although its counters keep updating.
  • Some rules might not appear in the iptables listing.
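
One way to test the first possibility is to check which backend the iptables binary talks to. This is a minimal sketch, assuming iptables 1.8 or later, where iptables -V reports the backend in parentheses (for example "iptables v1.8.7 (nf_tables)"). The function name iptables_backend is made up for this note.

```shell
#!/bin/sh
# Print the backend named in an "iptables -V" version string.
# Releases that print no backend are reported as "unknown".
iptables_backend() {
    case "$1" in
        *"(nf_tables)"*) echo nf_tables ;;
        *"(legacy)"*)    echo legacy ;;
        *)               echo unknown ;;
    esac
}

# Usage on a live system:
#   iptables_backend "$(iptables -V)"
```

If the backend is nf_tables, it may be worth comparing the iptables listing with the output of nft list ruleset, since rules created natively in nftables are not shown by iptables -L. Whether that explains this case was not verified.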

Debugging

For firewalld, add FIREWALLD_ARGS=--debug to /etc/sysconfig/firewalld.

For iptables, add -j LOG --log-prefix "rule description" to the rules that need debugging.
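
The firewalld setting can be applied with a small helper. This is only a sketch: the function name enable_firewalld_debug is hypothetical, it relies on GNU sed for in-place editing, and the LOG rule in the comment uses br0 and a made-up prefix as placeholders.

```shell
#!/bin/sh
# Set FIREWALLD_ARGS=--debug in the given sysconfig file.
# An existing FIREWALLD_ARGS line is replaced; otherwise the line is appended.
enable_firewalld_debug() {
    conf="$1"
    if grep -q '^FIREWALLD_ARGS=' "$conf" 2>/dev/null; then
        sed -i 's/^FIREWALLD_ARGS=.*/FIREWALLD_ARGS=--debug/' "$conf"
    else
        echo 'FIREWALLD_ARGS=--debug' >> "$conf"
    fi
}

# Usage (as root), followed by a restart of firewalld:
#   enable_firewalld_debug /etc/sysconfig/firewalld
#   systemctl restart firewalld
#
# Example LOG rule for iptables (placeholder interface and prefix):
#   iptables -I FORWARD 1 -i br0 -j LOG --log-prefix "fwd-in-br0: "
```

LOG rules write to the kernel log, so the matching packets can be followed with dmesg or journalctl -k.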

Suggestions from others

Add ACCEPT rules

Run the following script to add ACCEPT rules

#!/bin/sh

# If I put bridge0 in trusted zone then firewalld allows anything from 
# bridge0 on both INPUT and FORWARD chains !
# So, I've put bridge0 back into the default public zone, and this script 
# adds rules to allow anything to and from bridge0 to be FORWARDed but not INPUT.

BRIDGE=bridge0
iptables -I FORWARD -i $BRIDGE -j ACCEPT
iptables -I FORWARD -o $BRIDGE -j ACCEPT
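
Running the script above more than once inserts duplicate rules. A possible variation, sketched here under the assumption that the installed iptables supports -C (check), inserts each rule only when it is missing. The function name ensure_forward_accept and the IPTABLES override variable are made up for this note.

```shell
#!/bin/sh
# IPTABLES can be overridden (for example for a dry run); defaults to the real binary.
IPTABLES=${IPTABLES:-iptables}

# Insert a FORWARD ACCEPT rule for the given direction (-i or -o) and bridge,
# but only if "iptables -C" reports that the rule is not there yet.
ensure_forward_accept() {
    direction="$1"
    bridge="$2"
    if ! $IPTABLES -C FORWARD "$direction" "$bridge" -j ACCEPT 2>/dev/null; then
        $IPTABLES -I FORWARD "$direction" "$bridge" -j ACCEPT
    fi
}

# Usage (as root):
#   ensure_forward_accept -i bridge0
#   ensure_forward_accept -o bridge0
```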

Conclusion

After much testing, it turned out that docker adds rules directly into iptables instead of going through firewalld. This can be observed with the following steps.

  1. Stop both firewalld and docker; iptables has no rules
  2. Start docker; iptables has only docker's rules
  3. Start firewalld; the LIBVIRT rules appear briefly and, after a few seconds, are replaced by docker's rules

Another test

  1. Stop both firewalld and docker again
  2. Start firewalld; only the LIBVIRT rules appear
  3. Start docker; both docker and LIBVIRT rules appear
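
To make the comparison between the steps above less error-prone, the chains present after each step can be counted from iptables-save output. This is a rough sketch: it assumes docker's chains start with DOCKER and libvirt's chains start with LIBVIRT, and the function name count_owner_chains is made up for this note.

```shell
#!/bin/sh
# Read iptables-save output on stdin and count chain declarations per service.
# Chain declarations look like ":NAME POLICY [packets:bytes]".
count_owner_chains() {
    awk '
        /^:DOCKER/  { docker++ }
        /^:LIBVIRT/ { libvirt++ }
        END { printf "docker=%d libvirt=%d\n", docker, libvirt }
    '
}

# Usage (as root), after each start/stop step:
#   iptables-save | count_owner_chains
```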

One issue was seen during reboot: if both docker and firewalld are enabled, the server might hang. This may be because the root filesystem is on an iSCSI disk, but that could not be confirmed.

The above behavior suggests that docker does not cooperate with firewalld: it periodically inserts rules directly into iptables, which corrupts the firewalld rules.

Solution

Run script

This solution disables firewalld and enables docker

systemctl disable firewalld
systemctl enable docker

Then run the following commands to add iptables rules that allow the traffic

iptables -I FORWARD -i br0 -j ACCEPT
iptables -I FORWARD -o br0 -j ACCEPT

These commands can be put in /etc/rc.local, which is executed during boot.

Install iptables services

Like the previous solution, this one disables firewalld and enables docker, then adds two FORWARD rules to the default iptables rules file /etc/sysconfig/iptables, as below.

# sample configuration for iptables service
# you can edit this manually or use system-config-firewall
# please do not ask us to add additional ports/services to this default configuration
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
-A FORWARD -o br0 -j ACCEPT
-A FORWARD -i br0 -j ACCEPT
:OUTPUT ACCEPT [0:0]
#-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
#-A INPUT -p icmp -j ACCEPT
#-A INPUT -i lo -j ACCEPT
#-A INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT
#-A INPUT -j REJECT --reject-with icmp-host-prohibited
#-A FORWARD -j REJECT --reject-with icmp-host-prohibited
COMMIT

LIBVIRT and docker then add their own rules after the system has started.
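
Before relying on the file at boot, it can help to confirm that both bridge rules are really in it. This is a small sketch; the function name has_bridge_forward_rules is made up for this note, and it only matches rules written exactly in the form shown above.

```shell
#!/bin/sh
# Succeed only if FILE contains FORWARD ACCEPT rules for BRIDGE in both directions.
# -F matches the text literally, -x requires the whole line to match.
has_bridge_forward_rules() {
    file="$1"
    bridge="$2"
    grep -qxF -- "-A FORWARD -i $bridge -j ACCEPT" "$file" &&
    grep -qxF -- "-A FORWARD -o $bridge -j ACCEPT" "$file"
}

# Usage:
#   has_bridge_forward_rules /etc/sysconfig/iptables br0 && echo "rules present"
```

The syntax of the whole file can also be checked without loading it by running iptables-restore --test < /etc/sysconfig/iptables as root.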

Modify firewalld rules

This solution failed last time; I will try it again later.

firewall-cmd --permanent --direct --passthrough ipv4 -I FORWARD -i bridge0 -j ACCEPT
firewall-cmd --permanent --direct --passthrough ipv4 -I FORWARD -o bridge0 -j ACCEPT

Future

If possible, define firewalld rules that cover both LIBVIRT and docker.

References

Configure FirewallD to allow bridged virtual machine network access
Debug firewalld
How to configure iptables on CentOS

Less related topic
Do I need to restore iptable rules everytime on boot?
need iptables rule to accept all incoming traffic

Bridge Interface vs Macvtap Interface in TrueNAS

Clearer information can be found in the references.

Description

Note: This is based on my understanding and might be incorrect.

Bridge and macvtap both give VMs a network interface attached to the physical network.

With macvtap, each VM gets a dedicated macvtap interface on the host with the same MAC address as the VM. Macvtap is a network interface built on macvlan.

With a bridge, the VMs share one bridge interface, which has its own, different MAC address on the host.

Bridge Mode

Virtual interfaces in VMs => Bridge interface => Physical Interface in Host

Macvtap Mode

Macvtap interfaces => Physical Interface in Host

Pros

Macvtap

  • Macvtap interfaces live on the host and can tap into different physical interfaces; switching to another physical interface is done on the host.
  • Passthrough: the VM uses the same interface that the host created
  • If the VM is sensitive to its MAC address, macvtap should be used

Bridge

  • VM and host can communicate with each other
  • VM can use host services
  • Bridge can be created without a physical interface

Cons

Macvtap

  • VM cannot communicate with the host

Bridge

  • VMs' virtual interfaces all share the same bridge interface on the host

Sample

Macvtap

truenas# ifconfig -a
bond0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST>  mtu 1500
        inet 192.168.1.19  netmask 255.255.255.0  broadcast 0.0.0.0
        ether 06:b1:f7:6d:13:4c  txqueuelen 1000  (Ethernet)
        RX packets 1680348527  bytes 2208822464277 (2.0 TiB)
        RX errors 0  dropped 151  overruns 0  frame 0
        TX packets 1617739524  bytes 1698187389538 (1.5 TiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp17s0: flags=4098<BROADCAST,MULTICAST>  mtu 1500
        ether 54:04:a6:4b:81:c8  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 5107071  bytes 2866624273 (2.6 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 5107071  bytes 2866624273 (2.6 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

macvtap11: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::2a0:98ff:fe78:393  prefixlen 64  scopeid 0x20<link>
        ether 00:a0:98:78:03:93  txqueuelen 500  (Ethernet)
        RX packets 22627262  bytes 78234456341 (72.8 GiB)
        RX errors 2324  dropped 2324  overruns 0  frame 0
        TX packets 14142613  bytes 71245696317 (66.3 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

macvtap12: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::2a0:98ff:fe1f:d5c7  prefixlen 64  scopeid 0x20<link>
        ether 00:a0:98:1f:d5:c7  txqueuelen 500  (Ethernet)
        RX packets 480  bytes 943563 (921.4 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 301  bytes 36435 (35.5 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

veth7ae8af79: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::80de:6cff:fe6c:3ac2  prefixlen 64  scopeid 0x20<link>
        ether b2:16:c9:7a:5d:51  txqueuelen 0  (Ethernet)
        RX packets 1049172  bytes 819582891 (781.6 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1006473  bytes 532661621 (507.9 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

vethe3db1df9: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::1001:84ff:feaf:f3c3  prefixlen 64  scopeid 0x20<link>
        ether 5e:93:0b:01:a8:b0  txqueuelen 0  (Ethernet)
        RX packets 818421  bytes 79271857 (75.5 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 858581  bytes 75468204 (71.9 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

wlp15s0: flags=4098<BROADCAST,MULTICAST>  mtu 1500
        ether 00:08:ca:28:b8:d1  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

Bridge

truenas# ifconfig -a
bond0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST>  mtu 1500
        ether 06:b1:f7:6d:13:4c  txqueuelen 1000  (Ethernet)
        RX packets 2273930881  bytes 3403517718704 (3.0 TiB)
        RX errors 0  dropped 4615  overruns 0  frame 0
        TX packets 417927732  bytes 27440289291 (25.5 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

br0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.1.19  netmask 255.255.255.0  broadcast 0.0.0.0
        ether 06:62:23:59:d5:35  txqueuelen 1000  (Ethernet)
        RX packets 628064711  bytes 1811199459942 (1.6 TiB)
        RX errors 0  dropped 2  overruns 0  frame 0
        TX packets 331791210  bytes 1156251281548 (1.0 TiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp17s0: flags=4098<BROADCAST,MULTICAST>  mtu 1500
        ether 54:04:a6:4b:81:c8  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 2676343  bytes 1435559819 (1.3 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 2676343  bytes 1435559819 (1.3 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

veth8dfea17d: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::780d:4cff:fe3c:a108  prefixlen 64  scopeid 0x20<link>
        ether ae:fa:aa:94:e6:3f  txqueuelen 0  (Ethernet)
        RX packets 432268  bytes 41857026 (39.9 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 437864  bytes 38739338 (36.9 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

vethc44c60e0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::dc33:23ff:fef3:7f06  prefixlen 64  scopeid 0x20<link>
        ether 8a:af:5b:12:51:36  txqueuelen 0  (Ethernet)
        RX packets 549957  bytes 433379151 (413.3 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 480138  bytes 272756471 (260.1 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

vnet0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::fca0:98ff:fe78:393  prefixlen 64  scopeid 0x20<link>
        ether fe:a0:98:78:03:93  txqueuelen 1000  (Ethernet)
        RX packets 79558657  bytes 253665109225 (236.2 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 221506436  bytes 2888491048856 (2.6 TiB)
        TX errors 0  dropped 1220 overruns 0  carrier 0  collisions 0

wlp15s0: flags=4098<BROADCAST,MULTICAST>  mtu 1500
        ether 00:08:ca:28:b8:d1  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
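
The MAC address relationship from the Description can be read off the samples above: macvtap11 carries 00:a0:98:78:03:93, while in the bridge sample vnet0 carries fe:a0:98:78:03:93 and br0 has its own address. A small helper, sketched here with a made-up name (iface_macs), lists interface names with their MAC addresses so the two setups are easier to compare. It assumes the ifconfig output layout shown above.

```shell
#!/bin/sh
# Read "ifconfig -a" output on stdin and print "<interface> <mac>" for every
# interface that has an "ether" line. Interface header lines start in column 1.
iface_macs() {
    awk '
        /^[^ \t]/     { name = $1; sub(/:$/, "", name) }
        $1 == "ether" { print name, $2 }
    '
}

# Usage:
#   ifconfig -a | iface_macs
```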

References

Bridge vs Macvlan
Enabling host-guest networking with KVM, Macvlan and Macvtap

KVM setup in Fedora

Commands

virsh list --all
virsh start <vm>
virsh start <vm> --console
virsh shutdown <vm>

KVM vs XEN

KVM does not need a specific kernel, whereas Xen required a special kernel, so Xen could run into kernel upgrade issues.

Bridge Network

When creating a bridged network, if the network interface is created through grub, NetworkManager should not be used to create the same interface. If NetworkManager is used as well, the same network interface appears twice in the ifconfig -a output: once created by NetworkManager and once created by grub. If the bridge interface is created on top of the grub-created interface, the IP address stays assigned to the grub-created interface.

To avoid this issue, add the following line to /etc/default/grub so that the network interface is created together with the bridge interface br0.

GRUB_CMDLINE_LINUX=" ... ip=192.168.1.9::192.168.1.254:255.255.255.0::br0:off nameserver=192.168.1.250 ifname=enp0s10:00:26:4a:18:82:c6 bridge=br0:enp0s10"
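
The ip= and bridge= arguments are easy to get wrong by one colon. The sketch below builds them from their parts. The field layout is my reading of the line above (client IP, empty peer, gateway, netmask, empty hostname, interface, autoconf off), and the function names kernel_ip_arg and kernel_bridge_arg are made up for this note.

```shell
#!/bin/sh
# Build "ip=<client>::<gateway>:<netmask>::<interface>:off".
# The two empty fields are the peer address and the hostname.
kernel_ip_arg() {
    printf 'ip=%s::%s:%s::%s:off\n' "$1" "$2" "$3" "$4"
}

# Build "bridge=<bridge>:<member interface>".
kernel_bridge_arg() {
    printf 'bridge=%s:%s\n' "$1" "$2"
}

# Usage:
#   kernel_ip_arg 192.168.1.9 192.168.1.254 255.255.255.0 br0
#   kernel_bridge_arg br0 enp0s10
```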

After br0 is created, the KVM manager can select the bridged network when creating a VM.

Update grub using the following command

grub2-mkconfig -o /boot/grub2/grub.cfg

Download driver

Both the Windows disk controller driver and the Ethernet driver can be downloaded from the Fedora website, https://fedorapeople.org/groups/virt/virtio-win/direct-downloads/archive-virtio/virtio-win-0.1.139-1/virtio-win-0.1.139.iso. Add an additional CD-ROM to the VM that points to this ISO.

Create VM

Using Virtual Machine Manager

Creating a VM requires adding storage. If the storage file doesn't exist, select the storage location and enter the disk size in the field located above the location selection box.

Using command line

To create an Ubuntu VM from a local image,

virt-install \
--name ubuntu2104 \
--ram 3072 \
--vcpus 2 \
--disk path=/kvm/ubuntu2104.qcow2,size=20 \
--os-variant ubuntu20.04 \
--os-type linux \
--network bridge=br0 \
--graphics none \
--console pty,target_type=serial \
--cdrom /kvm/ubuntu-21.04-live-server-amd64.iso \
--boot kernel=casper/vmlinuz,initrd=casper/initrd,kernel_args="console=ttyS0"
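
The Ubuntu example assumes br0 already exists on the host as a bridge. A quick pre-check can be sketched as below. It relies on the kernel exposing a bridge directory under /sys/class/net/<name> for bridge devices. The function name is_bridge and the SYSFS_NET override are made up for this note.

```shell
#!/bin/sh
# Succeed if NAME is a Linux bridge device.
# SYSFS_NET can be overridden so the check can be tried without a real bridge.
is_bridge() {
    [ -d "${SYSFS_NET:-/sys/class/net}/$1/bridge" ]
}

# Usage:
#   is_bridge br0 || echo "br0 is missing or is not a bridge"
```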

To create a Fedora VM from a remote server

virt-install \
--name fed34 \
--ram 2048 \
--vcpus 2 \
--disk path=/kvm/fed34.img,size=20 \
--os-variant fedora34 \
--os-type linux \
--network bridge=virbr0 \
--graphics none \
--console pty,target_type=serial \
--location 'https://mirror.arizona.edu/fedora/linux/releases/34/Server/x86_64/os/' \
--extra-args 'console=ttyS0,115200n8 serial'

Create Windows 10 VM

virt-install \
   --ram=4096 \
   --name=windows10 \
   --os-variant=win10 \
   --network network=default \
   --disk path=/kvm/kvm-windows10.img,size=100 \
   --cdrom=/kvm/virtio-win-0.1.139.iso \
   --graphics spice

Cons

  • Cannot select the CPU type or passthrough mode
  • Cannot select the disk controller type to use the virtual device driver.

References

10 Easy Steps To Install Windows 10 on Linux KVM – KVM Windows

Boot from small USB drive with iscsi root filesystem

Boot from a small USB drive that holds only the boot partitions; the rest of the filesystems are on iSCSI drives. Tested with EFI boot on Fedora 34.

Requirement

  • /boot partition can be as small as 256M, but bigger is better
Filesystem                         Size  Used Avail Use% Mounted on
/dev/sdb2                          428M  190M  212M  48% /boot
  • /boot/efi is an almost static filesystem and can be very small
Filesystem                         Size  Used Avail Use% Mounted on
/dev/sdb1                          512M   31M  482M   6% /boot/efi

grub configuration

Define iscsi login info

GRUB_CMDLINE_LINUX="netroot=iscsi:<user>:<password>@<ip>::3260::<iqn> rd.iscsi.initiator=<client iqn> rhgb quiet ...
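
The netroot= value packs several fields between colons. The sketch below assembles it in the same layout as the line above, leaving the same two fields empty (as far as I can tell these are the protocol and the LUN). The function name iscsi_netroot_arg and the sample values in the usage comment are hypothetical.

```shell
#!/bin/sh
# Build "netroot=iscsi:<user>:<password>@<ip>::<port>::<iqn>".
# Arguments: user, password, target IP, port, target IQN.
iscsi_netroot_arg() {
    printf 'netroot=iscsi:%s:%s@%s::%s::%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Usage (placeholder values):
#   iscsi_netroot_arg user secret 192.168.1.10 3260 iqn.2021-05.example:root
```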

Define network interface with static ip 192.168.1.2, gateway 192.168.1.254, nameserver 192.168.1.1, interface enp0s10.

ip=192.168.1.2::192.168.1.254:255.255.255.0::enp0s10:off nameserver=192.168.1.1

Define network with bridge interface br0 on network interface enp0s10

ip=192.168.1.2::192.168.1.254:255.255.255.0::br0:off nameserver=192.168.1.1 ifname=enp0s10:xx:xx:xx:xx:xx:xx bridge=br0:enp0s10"

Update grub using the following command

grub2-mkconfig -o /boot/grub2/grub.cfg