Blog
Reinstall Proxmox VE node in cluster
After node pve01 in a Proxmox VE cluster crashed, I reinstalled pve01 on the same hardware.
Install PVE using ISO
This just follows the normal installation steps.
Trial and error
After many attempts, I ended up using the following steps to add the replacement node.
- On any existing node other than the one being replaced, run the following to delete the node from the cluster
pvecm delnode <old_node>
- Remove the old node's known host entries from all other nodes
ssh-keygen -f "/etc/ssh/ssh_known_hosts" -R "<old_node>"
ssh-keygen -f "/root/.ssh/known_hosts" -R "<old_node>"
- On the new node, run
pvecm add <existing_node>
pvecm updatecerts
- Update the votes for the new node (optional)
Edit the file /etc/pve/corosync.conf, change the node's quorum_votes value, and increment config_version so that the change is propagated.
- Import old local pools
zpool import -f <old_local_pool>
Change expected votes
Run the following commands on an existing node in the cluster to check the status and set the expected votes
pvecm status
pvecm expected 3
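The expected votes value matters because corosync treats the cluster as quorate only when a strict majority of the expected votes is present. A minimal sketch of that arithmetic follows; quorum_needed is an illustrative helper name, not a Proxmox command.

```shell
#!/bin/sh
# Print the number of votes needed for quorum, given the expected votes.
# Quorum is a strict majority: floor(expected / 2) + 1.
quorum_needed() {
  echo $(( $1 / 2 + 1 ))
}

# Example: with 3 expected votes, 2 votes must be present.
#   quorum_needed 3
```

With one node of a three-node cluster down, the two remaining votes still reach this threshold.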
Remove old node
pvecm delnode pve01
Remove old SSH known hosts
ssh-keygen -f "/etc/ssh/ssh_known_hosts" -R "pve01"
ssh-keygen -f "/root/.ssh/known_hosts" -R "pve01"
Alternatively, edit the two files manually.
Add node
Run the following command on the new node
pvecm add <existing_node>
Sync certs
pvecm updatecerts
Test SSH key authentication
Make sure SSH key authentication is working between the nodes.
Copy UI certificate
cp /etc/pve/nodes/pve02/pveproxy-ssl.* /etc/pve/nodes/pve01
Remove local-zfs filesystem
If the previous node used ZFS and the new installation uses ext4, the local-zfs storage entry needs to be removed from the storage configuration.
vi /etc/pve/storage.cfg
If the cluster needs to be disabled so that the cluster filesystem runs in local mode, the following commands can be used
systemctl stop pve-cluster
/usr/bin/pmxcfs -l
Restart the cluster service on pve01
systemctl restart pve-cluster
References
Cluster Manager
Wiki - Cluster Manager
Correct procedure for zpool removal
Install Synology NAS managed Let's Encrypt Certificate in NGINX
Certificate Management
A Synology NAS can be used for certificate management, and a Let's Encrypt certificate can be exported as a ZIP file for use in the NGINX HTTPS configuration.
- Go to Control Panel -> Security -> Certificate
- Select certificate to be exported
- Select Export Certificate from the right-click menu
- Save exported file
Existing certificates can be renewed using the right-click -> Renew option.
Note: All domains in the certificate must resolve to the current Synology NAS and be reachable on port 80 and port 443; otherwise, certificate generation will fail.
The downloaded ZIP file contains the following files.
cert.pem
chain.pem
privkey.pem
NGINX configuration
- Concatenate cert.pem and chain.pem into a cert-with-chain.pem (or fullchain.pem) file
- Copy cert-with-chain.pem and privkey.pem into the NGINX conf.d folder
- Verify the NGINX configuration as below
ssl_certificate conf.d/cert-with-chain.pem;
ssl_certificate_key conf.d/privkey.pem;
- Restart NGINX
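The concatenation step above can be sketched as follows. The file names match those in the exported ZIP; the helper names build_fullchain and count_certs are illustrative only. The server certificate goes first, followed by the chain.

```shell
#!/bin/sh
# Build a full-chain file: server certificate first, then the chain.
# $1 = cert.pem, $2 = chain.pem, $3 = output file
build_fullchain() {
  cat "$1" "$2" > "$3"
}

# Count the certificates in a PEM file.
# A full-chain file should contain at least two.
count_certs() {
  grep -c -e '-----BEGIN CERTIFICATE-----' "$1"
}

# Example:
#   build_fullchain cert.pem chain.pem cert-with-chain.pem
#   count_certs cert-with-chain.pem
```

A count of 1 suggests the chain was not appended, which would lead to the verification errors described in the next section.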
Verification
Browser
The issue date of the new certificate should be displayed in the certificate information window.
Command line
The following command can be used for verification
openssl s_client -connect <domain_name>:<port>
If the following errors appear, concatenate chain.pem into cert.pem, because the full chain is required.
verify error:num=20:unable to get local issuer certificate
verify error:num=21:unable to verify the first certificate
Using certbot to apply for a Let's Encrypt certificate
To use the NGINX plugin, certbot needs to either use its own NGINX server or modify the existing NGINX configuration.
Steps
Preparation
- Shut down any application that is listening on port 80 and port 443
docker stop nginx
- Install the software if it is not installed yet
Note: skip this step if the packages are already installed
apt install certbot
apt install python3-certbot-nginx
- Request certificate
Note: there is no need to start the nginx service; certbot will start it automatically
certbot certonly --nginx -d <domain1> -d <domain2> -d <domain3>
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Certificate not yet due for renewal
You have an existing certificate that has exactly the same domains or certificate name you requested and isn't close to expiry.
(ref: /etc/letsencrypt/renewal/<domain1>.conf)
What would you like to do?
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
1: Attempt to reinstall this existing certificate
2: Renew & replace the certificate (may be subject to CA rate limits)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Select the appropriate number [1-2] then [enter] (press 'c' to cancel): 2
Renewing an existing certificate for <domain1> and <domain2>
Successfully received certificate.
Certificate is saved at: /etc/letsencrypt/live/<domain1>/fullchain.pem
Key is saved at: /etc/letsencrypt/live/<domain1>/privkey.pem
This certificate expires on 2023-05-11.
These files will be updated when the certificate renews.
Certbot has set up a scheduled task to automatically renew this certificate in the background.
Deploying certificate
Successfully deployed certificate for <domain1> to /etc/nginx/sites-enabled/default
Successfully deployed certificate for <domain2> to /etc/nginx/sites-enabled/default
Your existing certificate has been successfully renewed, and the new certificate has been installed.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
If you like Certbot, please consider supporting our work by:
* Donating to ISRG / Let's Encrypt: https://letsencrypt.org/donate
* Donating to EFF: https://eff.org/donate-le
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
- Certificate location
The certificate can be found in the following directory
ls /etc/letsencrypt/live/<domain1>/
- Stop the nginx service started by certbot
systemctl stop nginx
systemctl disable nginx
- Set up Docker certificates
Copy privkey.pem and fullchain.pem into the Docker configuration directory.
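A minimal sketch of the copy step, assuming the destination is whatever directory the nginx container mounts for its certificates (the destination path and the copy_certs name are hypothetical). The files under /etc/letsencrypt/live/ are symlinks, so -L is used to copy the file contents rather than the links.

```shell
#!/bin/sh
# Copy the certificate and key for one domain into a Docker config directory.
# $1 = source directory, e.g. /etc/letsencrypt/live/<domain1>
# $2 = destination directory (hypothetical; the directory mounted by the container)
copy_certs() {
  # -L follows the symlinks so that regular files end up in the destination
  cp -L "$1/fullchain.pem" "$1/privkey.pem" "$2/" &&
  # Keep the private key readable by its owner only
  chmod 600 "$2/privkey.pem"
}

# Example:
#   copy_certs /etc/letsencrypt/live/<domain1> /path/to/docker/nginx/conf.d
```

The copy has to be repeated after each renewal, because the container only sees the copied files.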
Troubleshooting
All domains on the command line must resolve to the running host and be reachable on both port 80 and port 443; otherwise, the certificate cannot be created.
Another way
Running certbot in Docker could be better, as no additional packages need to be installed and the certbot service can be stopped with a docker command.
Fix Synology `Allocation Status` Crashed Error
I use JBOD for the backup volume with checksum turned on, because I don't expect the data on both the source and the backup to be lost at the same time. However, an issue with one disk in a JBOD volume can crash the whole volume, which then becomes read-only. When checking the status further, only one disk shows Allocation Status as Crashed, while its Health Status is Healthy.
In the past, because the faulty volume was read-only, I had to create new folders with new names, copy all data into them, rebuild the disk array, and then move the data back to the newly created volume. This also required reconfiguring permissions and services such as NFS, Time Machine, rsync, etc. It could take days to complete all these tasks.
This time, I tried to recover the volume using a few commands.
Steps
Recreate Array
- Log in to the Synology command line as root
- Find the array
# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md2 : active raid5 sda5[0] sdc5[2] sdb5[1]
1943862912 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/3] [UUU]
md12 : active raid5 sdjc7[5] sdjb7[6] sdjd7[3] sdja7[7] sdje7[8]
1953467648 blocks super 1.2 level 5, 64k chunk, algorithm 2 [5/5] [UUUUU]
md9 : active raid5 sdjc6[9] sdjb6[8] sdja6[6] sdjd6[7] sdje6[5]
703225088 blocks super 1.2 level 5, 64k chunk, algorithm 2 [5/5] [UUUUU]
md6 : active raid5 sdjc5[6] sdjd5[5] sdjb5[9] sdja5[8] sdje5[7]
1230960384 blocks super 1.2 level 5, 64k chunk, algorithm 2 [5/5] [UUUUU]
md4 : active linear sdg3[0] sdh3[2](E) sdf3[1]
2915921472 blocks super 1.2 64k rounding [3/3] [UUE]
md10 : active raid5 sdja8[2] sdje8[3] sdjc8[4]
1953485824 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/3] [UUU]
md7 : active raid5 sdib6[4] sdie6[5] sdic6[3] sdia6[2] sdid6[1]
3906971648 blocks super 1.2 level 5, 64k chunk, algorithm 2 [5/5] [UUUUU]
md3 : active raid5 sdie5[5] sdia5[4] sdid5[3] sdib5[7] sdic5[6]
7794733824 blocks super 1.2 level 5, 64k chunk, algorithm 2 [5/5] [UUUUU]
md8 : active raid5 sdie7[0] sdib7[3] sdic7[2] sdia7[1]
2930228736 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
md1 : active raid1 sdh2[5] sdg2[4] sdf2[3] sdc2[2] sdb2[1] sda2[0]
2097088 blocks [8/6] [UUUUUU__]
md0 : active raid1 sdh1[3] sdg1[4] sdf1[2] sda1[0] sdb1[1] sdc1[6]
2490176 blocks [8/6] [UUUUU_U_]
unused devices: <none>
- Collect RAID info
# mdadm --examine /dev/sdh3
/dev/sdh3:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 6783225a:318612f7:3473d58a:09a977b2
Name : ds1812:4 (local to host ds1812)
Creation Time : Wed Dec 28 07:04:52 2022
Raid Level : linear
Raid Devices : 3
Avail Dev Size : 3897584768 (1858.51 GiB 1995.56 GB)
Used Dev Size : 0
Data Offset : 2048 sectors
Super Offset : 8 sectors
Unused Space : before=1968 sectors, after=65 sectors
State : clean
Device UUID : 14704640:a5536257:40c4ae47:2f008c53
Update Time : Sat Jan 21 00:36:29 2023
Checksum : 8685d50c - correct
Events : 5
Rounding : 64K
Device Role : Active device 2
Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)
root@ds1812:~#
- Unmount the filesystem; if that is not successful, use the force and kill options
# umount -f -k /volume3
- Stop array
# mdadm --stop /dev/md4
- Recreate the array, answering y to the question. List the devices in their role order ([0], [1], [2] in /proc/mdstat)
# mdadm --create --force /dev/md4 --metadata=1.2 --raid-devices=3 --level=linear /dev/sdg3 /dev/sdf3 /dev/sdh3 --uuid=6783225a:318612f7:3473d58a:09a977b2
mdadm: ... appears to be part of a raid array:
...
Continue creating array? y
Now the array has been recreated and should be in the correct state
# cat /proc/mdstat
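When many arrays are listed, the affected one can be picked out by looking for a flagged member. In the /proc/mdstat output above, md4 is the array whose member sdh3 carries the (E) flag. The sketch below prints the arrays that have a member flagged (E) or (F); find_flagged_arrays is an illustrative helper name, and the optional file argument exists only so that saved output can be checked.

```shell
#!/bin/sh
# Print the names of md arrays that have a member flagged (E) or (F).
# $1 = path to an mdstat file (defaults to /proc/mdstat)
find_flagged_arrays() {
  awk '/^md[0-9]+ :/ && /\((E|F)\)/ { print $1 }' "${1:-/proc/mdstat}"
}

# Example: run before and after recreating the array.
#   find_flagged_arrays
```

After the array has been recreated successfully, the command should print nothing for that array.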
Check the filesystem and mount it again
The filesystem type is btrfs, so use the following command to verify it
# btrfsck /dev/md4
Syno caseless feature on.
Checking filesystem on /dev/md4
UUID: 7a3a3941-e0c4-4505-8981-d309fb9482a5
checking extents
checking free space tree
checking fs roots
checking csums
checking root refs
found 2037124587520 bytes used err is 0
total csum bytes: 1986978456
total tree bytes: 2458648576
total fs tree bytes: 62947328
total extent tree bytes: 50741248
btree space waste bytes: 294577149
file data blocks allocated: 6689106694144
referenced 1995731652608
root@ds1812:/# echo $?
0
Mount the filesystem; the Synology error beep should now stop
mount /volume3
References
How to handle a drive that has "Allocation Status: Crashed"
[HOWTO] repair a clean volume who stays crashed volume
mdadm(8) — Linux manual page
Manualy repair filesystem command line DS214
How to recover from BTRFS errors
Shell command to remove `(1)` from filename
To compare a large number of files that have (1) in the file name with the original files without (1), such as ABCD(1).txt and ABCD.txt, the following commands can be used. Beware: they are independent commands, not sequential steps.
Use bash substring replacement
- Find all *(1)* files and check whether the original file exists in the same folder.
find . -name "*\(1\)*" | while IFS= read -r line
do
  # Print the (1) file only if the file without "(1)" also exists
  if test -e "${line/(1)/}"; then
    echo "$line"
  fi
done
Then they can be cleaned up one by one.
- Move them to another directory
- Rename them to match the original file name in the same folder
find . -name "*\(1\)*" | while IFS= read -r line
do
  # Rename only if no file without "(1)" exists yet
  if test ! -e "${line/(1)/}"; then
    mv -- "$line" "${line/(1)/}"
  fi
done
- Compare them with original files in same folder
Note: This method only works when the original file name does not contain the (1) string.
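The ${line/(1)/} substitution removes the first (1) in the path, which is the reason for the limitation in the note. A hedged sketch of a variant that removes the last (1) instead, using sed, is shown below; strip_last_marker is an illustrative name and not part of the commands above.

```shell
#!/bin/sh
# Remove the last "(1)" from a path.
# In sed basic regex "(" is literal and "\(" starts a group; the greedy ".*"
# makes the literal "(1)" match at its last occurrence.
strip_last_marker() {
  printf '%s\n' "$1" | sed 's/\(.*\)(1)/\1/'
}

# Example:
#   strip_last_marker './backup(1)/ABCD(1).txt'
```

Only the marker in the file name is removed in this example, so a directory that itself contains (1) is left alone.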
Use sed
The following sample script can be used for the same task.
#!/bin/bash
find . -type f | while IFS= read -r line
do
  dname="$(dirname -- "$line")"
  bname="$(basename -- "$line")"
  # In sed basic regex, "(" is literal and "\(" starts a group
  # pattern='s/\(([0-9])\)\./\1/'     # remove "." if match "(1).", \1 == ([0-9])
  # pattern='s/(\([0-9]\))\./\1/'     # remove "(", ")" and "." if match "(1).", \1 == [0-9]
  # pattern='s/([0-9]).//'            # remove "(1)"+any_char
  # pattern='s/[0-9]\.//'             # remove a digit followed by "."
  # pattern='s/([0-9])\././'          # remove "(1)" before "."
  pattern='s/\s*([0-9])\././'         # remove any_space+"(1)" before "." (\s needs GNU sed)
  # pattern='s/\s*\././'              # remove any_space before "."
  # pattern='s/^\./11./'              # add "11" in front if start with "."
  # pattern='s/^01\./10./'            # replace starting "01." with "10."
  # pattern='s/^0\([2-9]\)\./1\1./'   # replace starting "0N." (N = 2-9) with "1N."
  nname="$(printf '%s\n' "$bname" | sed -e "$pattern")"
  # echo "$bname"; echo "$nname"
  # Rename only if the name changed and the target does not exist in the same directory
  if [ "$nname" != "$bname" ] && [ ! -e "$dname/$nname" ]; then
    echo "$bname"; echo "$nname"
    mv -- "$dname/$bname" "$dname/$nname"
  fi
done
Use vim
- Use the following command to get the list of file names
find . -name "*(1).*" -exec echo mv ~{}~ ~{}~ \; > list
- Use vim to edit the file
vi list
- Use \zs (which sets the start of the match) after a greedy .* to remove the last (1) on each line
%s/.*\zs(1)//
- Replace ~ with ", then save it
%s/\~/"/g
- Run the script
sh list
References
How to change last occurrence of the string in the line?
Regex lookahead and lookbehind
Query RAM Type in Windows 11
To query the RAM type for each slot, run the following command
wmic memorychip get
Note: RAM rated for a higher clock speed can normally be used in a computer that supports only a lower speed.
References
How to get full PC memory specs (speed, size, type, part number, form factor) on Windows 10
Is there any problem if I use 3200 MHz RAM whereas my motherboard supports up to 2400 MHz?
Unable to query DNS with DOMAIN from `dnsmasq` server
When running nslookup, the dnsmasq server could not answer queries for names that include the domain; it could only answer queries for short DNS names. The following message may appear.
# nslookup www
....
dnsmasq server can't find www.example.com: NXDOMAIN
Solution
The reason is that the DNS entries in the dnsmasq host file (default is banner_add_hosts) have no domain name
192.168.1.1 www
In the dnsmasq.conf file, the following lines are required. The expand-hosts option appends the domain name defined in the domain line to the short hostnames in the host file
domain=example.com,192.168.1.0/24
expand-hosts
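To make the effect of expand-hosts concrete, the sketch below roughly emulates it on a hosts-format file: every name without a dot gets the domain appended. This is only an illustration under that assumption and does not reproduce the exact parsing done by dnsmasq; expand_hosts is a hypothetical helper name.

```shell
#!/bin/sh
# Print the extra "address name.domain" entries that appending the domain
# to simple (dot-free) names would produce.
# $1 = domain, $2 = hosts-format file
expand_hosts() {
  awk -v domain="$1" '
    /^[ \t]*#/ || /^[ \t]*$/ { next }      # skip comments and blank lines
    {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^#/) break               # stop at a trailing comment
        if ($i !~ /\./) print $1, $i "." domain
      }
    }' "$2"
}

# Example:
#   expand_hosts example.com /path/to/hosts-file
```

For the entry 192.168.1.1 www and the domain example.com, this prints 192.168.1.1 www.example.com, which is the name the failing nslookup query was asking for.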
Move MicroSD boot proxmox to eMMC
Steps
- Manually duplicate the partitions from the MicroSD to the eMMC using fdisk; ignore the BIOS partition, as the EFI partition is used.
- Unmount the old /boot/efi partition, then duplicate the EFI partition from the MicroSD using dd; this keeps the UUID
- Create a PV on the eMMC data partition and add it to the pve VG
- Move all data from the old MicroSD partition to the eMMC partition
pvmove /dev/<MicroSD partition>
- Check structure and UUID using following command
lsblk -o +UUID
- Remove the MicroSD PV from the pve VG using vgreduce, then use pvremove to remove the PV from the MicroSD
- Mount the new /boot/efi partition, then run grub-install to recreate the grub.cfg file
- Remove the MicroSD from the system, then reboot
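The LVM part of the steps above can be sketched as a small helper that prints the commands in order instead of running them, so they can be reviewed first. The function name and the device names in the example are hypothetical; substitute the real MicroSD and eMMC data partitions.

```shell
#!/bin/sh
# Print (do not run) the LVM commands for moving a VG from one PV to another.
# $1 = old PV (MicroSD data partition)
# $2 = new PV (eMMC data partition)
# $3 = VG name (defaults to pve)
print_lvm_migration() {
  old_pv=$1
  new_pv=$2
  vg=${3:-pve}
  cat <<EOF
pvcreate $new_pv
vgextend $vg $new_pv
pvmove $old_pv
vgreduce $vg $old_pv
pvremove $old_pv
EOF
}

# Example (hypothetical device names):
#   print_lvm_migration /dev/mmcblk1p3 /dev/mmcblk0p3
```

Printing rather than executing is a deliberate choice here, since pvremove on the wrong device is destructive.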