Proxmox notes from a NOOB

Here are a few notes and things that stuck out to me while kicking the tires of Proxmox.  Keep in mind these are coming from a VMware admin with very weak Linux knowledge.  I will keep adding stuff as I learn.

-Uses QEMU/KVM for virtualization

-LXC for containers

-Corosync Cluster Engine for server communications

-Proxmox Cluster File System (pmxcfs) for cluster configuration

-If the Proxmox v8 installer crashes, see if v7 works; if it does, install v7 and then do an in-place upgrade to v8.

-Neither v7.4 nor v8.1 seems to recognize Mellanox CX-3 40GbE/InfiniBand network cards.

-The vCenter equivalent is just built in to the web UI of each host and works on a distributed/cluster model, i.e. there is no appliance to install, no software keys, no dedicated IP.  Imagine if the ESXi web GUI had basic vCenter functions built in: joining and managing multiple hosts in one interface, vMotion, and replication.
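For the curious, building that cluster is just two shell commands (a minimal sketch; the cluster name and IP address here are made up):

pvecm create homelab          # run once, on the first node
pvecm add 192.168.1.10        # run on each additional node, pointing at an existing member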

-There are oddities about moving VMs back and forth between LVM and ZFS.  A VM built on a ZFS volume cannot live migrate or cold migrate to an LVM volume, and a template that lives on an LVM volume cannot be spawned to an LVM volume, if there is a replication job attached.

-By default a network "bridge" is created.  Network cards can be added/subtracted as necessary; very much like the "virtual switch" for VMware/ESXi.
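The bridge lives in /etc/network/interfaces; a typical stanza looks something like this (the NIC name and addresses here are made up):

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.10/24
        gateway 192.168.1.1
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0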


-The default install will have several "nag screens" about wanting one to have a paid subscription.  No judgment here: "gotta pay the bills".  The default update repository is the paid tier; one must disable it and point updates to the "no-subscription" tier to get updates and lessen the nags.
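On v8 (Debian bookworm) that switch looks roughly like this from the shell; swap "bookworm" for "bullseye" on v7:

# comment out the enterprise repo
sed -i 's/^deb/#deb/' /etc/apt/sources.list.d/pve-enterprise.list
# add the no-subscription repo
echo "deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription" > /etc/apt/sources.list.d/pve-no-subscription.list
apt update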

-The Proxmox virtual machine tools, actually the QEMU guest agent (the VMware Tools equivalent), are a separate download.  They must be installed for any "thin provisioning" of VM memory.  I.e. a Windows VM running without the tools, set to 8GB of RAM, will consume 8GB of RAM on the host.  With the tools it will take some amount less.
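Two host-side settings go with that (a sketch; the VM ID of 100 and the memory sizes are made up): enable the agent on the VM, and give memory ballooning a floor below the maximum:

qm set 100 --agent enabled=1              # let the host talk to the guest agent
qm set 100 --memory 8192 --balloon 4096   # 8GB max, can balloon down toward 4GB under host memory pressure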

-That same ISO (virtio-win-xxxxxx.iso) will most likely be needed for installing Windows.  Things like the hard disk controller (depending on which one was chosen at VM creation) will not be seen and require drivers to be installed.
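If the ISO has been uploaded to the "local" storage, it can be attached as a second CD-ROM drive from the shell (the VM ID and exact ISO filename here are placeholders):

qm set 100 --ide2 local:iso/virtio-win.iso,media=cdrom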

-Replication jobs!  If they fail and one wants to delete the job, they seem to not go away through the GUI.  Go to the shell and type "pvesr list" to show the jobs, then "pvesr delete <JobID> --force".

-A template cannot migrate to a different host unless the name of the storage on both servers is the same.  

-A template cannot be cloned (an image spawned from it) to a host that doesn't have the same storage name.  If the template exists on local storage it cannot be cloned to another host, and the dropdown box for which storage to use is blank.  One has to clone the machine on the same server where the template lives, then migrate the cloned VM (see the sketch below).
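From the shell that workaround looks like this (a sketch; the template ID 9000, new VM ID 123, name, and node are made up):

qm clone 9000 123 --name web01 --full    # full clone on the node where the template lives
qm migrate 123 node2                     # then move the clone to where it actually belongs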

-Configuration files for the VMs live here: /etc/pve/qemu-server.  If things go way wrong, one can move the config file to another host with:

mv /etc/pve/nodes/<old-node>/qemu-server/<vmid>.conf /etc/pve/nodes/<new-node>/qemu-server

-Virtual disks on LVM can be accessed here: /dev/<volume group>/<disk name>, e.g. /dev/pve/vm-100-disk-0.

Rename a VM's LVM disk via SSH (e.g. when reassigning a disk from VM 100 to VM 297):

cd /dev/pve

lvrename /dev/pve/vm-100-disk-0 /dev/pve/vm-297-disk-0
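After the rename, the VM's config file still points at the old disk name, so that reference has to be updated too (a sketch reusing the IDs above; eyeball the file before and after):

sed -i 's/vm-100-disk-0/vm-297-disk-0/g' /etc/pve/qemu-server/297.conf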

-There does not seem to be an easy way via the GUI to rename storage.

(Screenshot: the 1st icon is a running VM, the 2nd a VM that is being migrated, the 3rd a template.)


(Screenshots: a v7.4 cluster and a v8.1 cluster.)



Things I really like about Proxmox:

-ability to migrate running VMs from one host to another, even without shared storage
-ability to back up VMs
-ability to replicate VMs

Challenges:

-I had an unexpected host failure; the VM tried to migrate to a different node.  At the time there was only local storage, not even Ceph, and the local node volumes also were not named the same.  After the node came back online, it had the VM's virtual drive (since it couldn't migrate), while another server in the cluster had the config file.  Getting things back in line was a real chore; simply moving either the virtual hard drive or the config file was not working.  Sure, one could blame it on me for not setting up shared storage, or for not having the datastores named the same.  However, why is HA not doing checks before attempting a migration?

-Another unexpected host failure: two nodes are disconnected from the cluster.  Nodes 1, 2, 3, and 6 are all up and joined together and report nodes 4 and 5 as offline.  Nodes 4 and 5 believe they are online and that the other four nodes are offline.  Removing and re-adding the nodes to a cluster is not straightforward and not doable via the GUI.
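When the two halves of a cluster disagree like that, running this on a node from each side shows what that side believes about membership and quorum:

pvecm status    # quorum info, vote counts, and the member list as this node sees it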

------------------------------------

-Abbreviated instructions to upgrade from v7 to v8, from the shell of a given node:

    pve7to8                                               # readiness checklist; fix anything it flags
    apt update
    apt dist-upgrade                                      # bring the node fully up to date on 7.x
    pveversion                                            # confirm the node is on the latest 7.4-x
    sed -i 's/bullseye/bookworm/g' /etc/apt/sources.list  # repoint the Debian repos at bookworm
    apt update
    apt dist-upgrade                                      # the actual 7-to-8 upgrade
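One thing the abbreviated list skips: if any repos live in /etc/apt/sources.list.d/ (e.g. the no-subscription file from earlier), they need the same bullseye-to-bookworm edit at the sed step above; something like:

sed -i 's/bullseye/bookworm/g' /etc/apt/sources.list.d/*.list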

------------------------------------

-To remove a damaged node from a cluster:
    From the damaged node:
    -stop the pve-cluster service:       systemctl stop pve-cluster
    -stop the corosync service:          systemctl stop corosync
    -restart in single-host mode:        pmxcfs -l
    -delete the corosync config:         rm /etc/pve/corosync.conf
    -delete the corosync folder:         rm -r /etc/corosync/*
    -delete references to other nodes:   rm -r /etc/pve/nodes/*
    ***VMs living on the damaged server will be lost... the virtual hard drives will still be there
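The companion step, run from one of the healthy nodes that still has quorum, is to tell the cluster the damaged node is gone (the node name here is a placeholder):

pvecm delnode node5    # removes "node5" from the cluster's membership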

------------------------------------

qm list              (shows VMs on the node)

qm start <vmID>      (starts a VM)
qm reboot <vmID>     (safe shutdown, then startup)
qm reset <vmID>      (hard reset)
qm stop <vmID>       (hard power off)

------------------------------------

To change a server name, edit the following files, then reboot:
nano /etc/hosts
nano /etc/hostname
nano /etc/postfix/main.cf
reboot
(On a clustered node the old name will still show up under /etc/pve/nodes/<old name>; VM configs can be moved out of it as shown earlier.)

