Reinstall a Proxmox VE node in a cluster
After node pve01 in a Proxmox VE cluster crashed, pve01 was reinstalled on the same hardware.
Install PVE using ISO
This just follows the normal installation steps.
Trial and error
After several attempts, the following steps worked for adding the replacement node.
- On any existing node other than the one being replaced, run the following to delete the old node from the cluster
pvecm delnode <old_node>
- Remove the old node's known-host entries on all other nodes
ssh-keygen -f "/etc/ssh/ssh_known_hosts" -R "<old_node>"
ssh-keygen -f "/root/.ssh/known_hosts" -R "<old_node>"
- On the new node, run
pvecm add <existing_node>
pvecm updatecerts
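- Update the vote count for the new node (optional)
Edit /etc/pve/corosync.conf and change the node's quorum_votes value. Increment config_version in the same file so the change is propagated.
- Import the old local ZFS pools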
zpool import -f <old_local_pool>
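
The steps above can be sketched as a small set of shell functions. This is only an illustration, not a tested recovery tool: the function names and the DRY_RUN switch are made up for this example, and by default the commands are printed rather than executed.

```shell
# Sketch of the replacement sequence. DRY_RUN defaults to 1, so commands are
# only printed; set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Run on a surviving cluster node (not the node being replaced).
# The known_hosts cleanup has to be repeated on every remaining node.
remove_old_node() {
    old_node="$1"
    run pvecm delnode "$old_node"
    run ssh-keygen -f /etc/ssh/ssh_known_hosts -R "$old_node"
    run ssh-keygen -f /root/.ssh/known_hosts -R "$old_node"
}

# Run on the freshly installed node.
join_cluster() {
    existing_node="$1"
    run pvecm add "$existing_node"
    run pvecm updatecerts
}

# Run on the freshly installed node, once per old local pool.
import_pool() {
    run zpool import -f "$1"
}
```

For example, `remove_old_node pve01` on an existing node and `join_cluster pve02` on the new node print the commands that would be run, which makes it easy to review the sequence before setting DRY_RUN=0.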
Change expected votes
Run the following commands on an existing cluster node to check the current vote counts and set the expected votes (3 in this example)
pvecm status
pvecm expected 3
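
To make the vote check easier to read, the relevant numbers can be pulled out of the status output. The sketch below assumes the output contains lines labelled "Expected votes:", "Total votes:" and "Quorum:", as `pvecm status` typically prints; the function name vote_summary is made up for this example.

```shell
# Print "expected total quorum" from `pvecm status` output read on stdin.
# Assumes the usual "Label:   value" layout of the Votequorum section.
vote_summary() {
    awk -F': *' '
        /^Expected votes:/ { expected = $2 + 0 }  # "+ 0" keeps only the leading number
        /^Total votes:/    { total = $2 + 0 }
        /^Quorum:/         { quorum = $2 + 0 }    # value may be followed by extra text
        END { print expected, total, quorum }
    '
}
```

Usage on a cluster node: `pvecm status | vote_summary`. If the total is below the quorum value, the cluster is not quorate, and lowering the expected votes with `pvecm expected` may be needed before nodes can be removed.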
Remove old node
pvecm delnode pve01
Remove old SSH known hosts
ssh-keygen -f "/etc/ssh/ssh_known_hosts" -R "pve01"
ssh-keygen -f "/root/.ssh/known_hosts" -R "pve01"
Alternatively, edit the two files manually.
Add node
Run the following command on the NEW NODE
pvecm add <existing_node>
Sync certs
pvecm updatecerts
Test SSH key authentication
Make sure SSH key authentication is working between the cluster nodes.
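
One way to test this without being prompted for a password is ssh's BatchMode option, which makes the login fail instead of falling back to interactive authentication. The sketch below is an assumption about how such a check could look; the function name check_ssh_keys and the SSH_CMD override are made up for this example.

```shell
# ssh binary to use; overridable so the function can be tried without a real cluster.
SSH_CMD="${SSH_CMD:-ssh}"

# Try a key-only root login to each node given as an argument.
# Returns non-zero if at least one node cannot be reached with key authentication.
check_ssh_keys() {
    failed=0
    for node in "$@"; do
        # BatchMode=yes disables password prompts; $SSH_CMD is left unquoted on purpose.
        if $SSH_CMD -o BatchMode=yes -o ConnectTimeout=5 "root@$node" true; then
            echo "ok: $node"
        else
            echo "FAILED: $node"
            failed=1
        fi
    done
    return "$failed"
}
```

For example, run `check_ssh_keys pve02 pve03` on the new pve01 (pve03 is a hypothetical third node), and repeat from an existing node against pve01.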
Copy UI certificate
cp /etc/pve/nodes/pve02/pveproxy-ssl.* /etc/pve/nodes/pve01
Remove local-zfs filesystem
If the previous node used ZFS and the new installation uses ext4, the local-zfs storage entry needs to be removed.
vi /etc/pve/storage.cfg
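
Before editing, it can help to print the exact stanza that will be removed. The sketch below assumes the usual storage.cfg layout, where each storage starts with an unindented "type: name" line followed by indented option lines; the function name show_storage_stanza is made up for this example.

```shell
# Print the stanza for one storage ID from a storage.cfg-style file.
# Usage: show_storage_stanza <storage_id> [config_file]
show_storage_stanza() {
    storage="$1"
    cfg="${2:-/etc/pve/storage.cfg}"
    awk -v name="$storage" '
        # An unindented line such as "zfspool: local-zfs" starts a new stanza.
        /^[^ \t#]/ { in_stanza = ($2 == name) }
        # Print the non-empty lines of the matching stanza only.
        in_stanza && NF
    ' "$cfg"
}
```

For example, `show_storage_stanza local-zfs` prints the lines to delete in the editor. Note that /etc/pve/storage.cfg is shared by the whole cluster, so if other nodes still use local-zfs, restricting the entry to those nodes with a nodes line may be preferable to deleting it.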
If the cluster needs to be disabled on a node, the following commands stop the cluster service and start pmxcfs in local mode
systemctl stop pve-cluster
/usr/bin/pmxcfs -l
Restart the cluster service on pve01
systemctl restart pve-cluster