Windows 2019 Software RAID

 Just a quick test. This server has U.2 Samsung MZWLL1T6HEHP NVMe drives and a fresh install of Windows Server 2019.  For the first test I took the three NVMe drives and created a dynamic volume (software RAID5).  Then I repeated the test with a single drive.

Three-drive software RAID5

Single drive
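Atto is GUI-driven; for a repeatable version of the same quick test, Microsoft's diskspd can script it. A minimal sketch (the drive letter, file path, and parameter choices are my assumptions, not what was used in the test above):

```shell
rem 30-second run, 128K blocks, 4 threads, 8 outstanding IOs per thread,
rem 50% writes, against a 4GB test file on the RAID volume (path assumed)
diskspd -b128K -d30 -t4 -o8 -w50 -c4G E:\testfile.dat
```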


SuperMicro servers featuring PCIe 5.0 and 200Gbps NICs

 Hot off the factory floor!  SuperMicro ASG-1115S-NE316R

AMD Epyc 9004 series CPU




E3.S form factor drives.  For the record, I dislike this form factor because the "fingers" of the drive interface extend beyond the body of the drive.  I can see some of these getting broken by sloppy handling and mistreatment.



SuperMicro AI GPU Server

 

SuperMicro ARS-111GL-NHR  G1SM-G

NVIDIA A02 72-core Grace Hopper 3.4GHz CPU

16896 GPU cores

https://www.supermicro.com/en/products/motherboard/g1smh-g

https://www.supermicro.com/en/products/system/GPU/1U/ARS-111GL-NHR



A little concerned about these human-hair-thin wires run right across the front






The server is physically very long; it took some cheating to get the server rack door to close with the 200Gb DAC cables attached.

RAID5 vs RAID10

Quick and dirty test.  The guinea pig is a Lenovo ThinkSystem SR655 with an Avago 930-24i RAID card with 4GB cache and four Kioxia 1TB 12Gbps SAS SSDs.  VMware ESXi v8 is installed on the host.  The four drives were set up in a RAID5 configuration with read ahead enabled, drive cache enabled, and the cache policy set to Write Back.   The RAID virtual drive was presented to VMware; a Microsoft Developer Windows 11 VM was imported to the datastore using thick provisioning.   Atto disk benchmark was run with both 4GB and 8GB tests.  Then the VM, datastore, and virtual drive were torn down, rebuilt as RAID10, and retested.





I found the results somewhat surprising.   We often hear about the "RAID5 write penalty", or that "RAID10 is just faster", etc.  Well, this test shows the opposite: the write speed on RAID5 is actually better!  One theory is that three drives share the work of writing in RAID5, whereas in the four-drive RAID10 only two unique drives are writing.  For large sequential writes the controller can also compute parity across a full stripe without the read-modify-write cycle, so the classic per-write penalty largely disappears.


Burnt up Network Card?

 I have not seen this in decades.  This Mellanox 100Gb card was causing the server to be unstable.  Now, I don't know when the card went bad.  The server was repurposed, and the 100Gb NIC was added.  So I don't know if my co-worker put the NIC in already burnt, or if it burnt up in this Dell PowerEdge R620.

Notice the discoloration in the lower right corner, and that capacitor is also a different color.




Enterprise SAS SSD vs Consumer SSD

 Some time was allocated to do a quick test comparing consumer-grade SATA SSDs to enterprise-grade SAS drives.   The SAS drives are 12Gbps, and the SAS card is also 12Gbps.  The tests were done on the same Windows 10 desktop with the same LSI SAS card, except where noted a PowerEdge R730 was used.

Baseline: Samsung 840 Evo 1TB SATA SSD on 12Gb SAS controller, Windows 10

Samsung MZL1ls960 960GB SAS on 12Gb SAS controller, Windows 10

Samsung MZL1ls960 960GB SAS on a Dell PowerEdge R730 with PERC H730 RAID card w/ 1GB cache, test size set to 2GB, Windows Server 2019


Dell MZ-1LT3T8c 3.8TB SAS on 12Gb SAS controller, Windows 10


Impressive results!  12Gb SAS SSDs are very near NVMe performance, and nearly double that of the SATA drives.  Is the performance due to having more cache?  Or is it because the SAS drives are 12Gbps vs 6Gbps on the SATA drives?  It should be noted that during testing the SAS drives consume roughly 2 more watts at idle and 5 more watts during the test.   The SAS drives were also MUCH warmer to the touch, whereas the SATA drives stayed at ambient temperature.

Dell x30 PowerEdge Servers running NVMe

 I always assumed that generation 13 PowerEdge servers (R430, R530, R630, R730, R930, etc.) were not able to run NVMe natively.  Many people run M.2 "gum-stick" form factor drives on PCIe-to-NVMe adapter cards.

As it turns out, one can!   The parts are not cheap, but it is basically a special PCIe controller card (Dell P31H2 16-port PCIe x16 extender SSD NVMe controller), SAS cables, and a special backplane.  The PCIe card runs four cables out to "special-ish" ports on the backplane, and four of the 2.5" drive bays can then run U.2 NVMe drives.















Windows 11 & Hardware

 Windows 11 has several hardware requirements; if they are not met, one cannot install the operating system.   Or can you?

  
Windows 11 requires the following minimum requirements:

-TPM 2.0 module (Trusted Platform Module, for encryption)

-1GHz 64-bit CPU

-64GB hard drive

-4GB RAM

-Graphics card compatible w/ DirectX 12 and WDDM

-720p or better display

Seems more than reasonable, except the real bugger for most machines is the TPM part.  That rules out basically any machine older than an Intel Core 8th generation.   So let's say one has a computer with an Intel i7-7700?  Or a Dell PowerEdge R640 server running VMware ESXi v7 with Intel Xeon 414r CPUs?  Those machines are more than fast enough, but they don't pass the TPM check; so Microsoft says: "Go pound sand, and then go buy a new PC."  Also, what if one is doing virtualization?

Well, one can bypass this hardware check. There is no downside to this; the machine will run Windows 11 without problems.  It isn't going to BlueScreen, it isn't going to be slow, it isn't going to blow up; well, at least not because of pseudo hardware requirements.  I have done it probably a dozen times and have yet to run into any issues related to the hardware requirements.  One machine has even been running since early 2022.

After booting to the Win11 ISO, at the very first screen where it asks about language, time/currency, and keyboard, hit "Shift" and "F10".  This will bring up a Command Prompt.

In that Command Prompt window, type: "regedit"

In RegEdit, navigate to: HKEY_LOCAL_MACHINE\SYSTEM\Setup

Create a new key named: "LabConfig"

In LabConfig, create the following DWORD values, each set to "1":

-BypassCPUCheck

-BypassRAMCheck

-BypassSecureBootCheck

-BypassTPMCheck

Close the Regedit and Command Prompt windows and install Windows as normal.  

https://www.tomshardware.com/how-to/bypass-windows-11-tpm-requirement
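If one would rather skip RegEdit, the same LabConfig values can be created straight from that Command Prompt with reg.exe. A minimal sketch of the equivalent commands:

```shell
rem Create the LabConfig key and the four bypass DWORDs in one pass
rem (reg add creates the key automatically; /f skips the overwrite prompt)
reg add "HKLM\SYSTEM\Setup\LabConfig" /v BypassCPUCheck /t REG_DWORD /d 1 /f
reg add "HKLM\SYSTEM\Setup\LabConfig" /v BypassRAMCheck /t REG_DWORD /d 1 /f
reg add "HKLM\SYSTEM\Setup\LabConfig" /v BypassSecureBootCheck /t REG_DWORD /d 1 /f
reg add "HKLM\SYSTEM\Setup\LabConfig" /v BypassTPMCheck /t REG_DWORD /d 1 /f
```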


Bonus:  If one is using a USB key to install Windows, make the changes and those settings will stick on the install media, as it is writeable.  So one only has to do these steps once.

Bonus #2: If one doesn't want to deal with associating the machine with an online Microsoft account, simply don't connect the machine to the internet until setup is complete.

ProxMox notes from a NOOB

Here are a few notes and things that stuck out to me while kicking the tires on ProxMox.  Keep in mind these are coming from a VMware admin with very weak Linux knowledge.   I will keep adding stuff as I learn.

-Uses QEMU/KVM for virtualization

-LXC for containers

-Corosync Cluster Engine for server communications

-Proxmox Cluster File System for cluster configuration

 -If installing ProxMox v8 crashes on install, see if v7 works; if it does, then do an in-place upgrade

-Neither v7.4 nor v8.1 seems to recognize Mellanox CX-3 40GbE/InfiniBand network cards

-The vCenter equivalent is just built in to the webUI of each host and works on a distributed/cluster model, i.e. there is no appliance to install, no software keys, no dedicated IP.  Imagine if the ESXi web GUI had basic vCenter functions built in: joining and managing multiple hosts in one interface, vMotion, and replication.

-There are oddities about moving VMs back and forth between LVM and ZFS.  A VM built on a ZFS volume cannot live migrate or cold migrate to an LVM volume, and a template that lives on an LVM volume cannot be spawned to an LVM volume, if there is a replication job attached.

-By default a network "bridge" is created.  Network cards can be added/subtracted as necessary; very much like the "virtual switch" in VMware/ESXi.


-The default install will have several "nag screens" about wanting one to buy a paid subscription.  No judgment here: "gotta pay the bills".   The default update repository is the paid tier; one must disable it and point updates to the "no-subscription" tier to get updates and lessen the nags.
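A minimal sketch of that repository switch on ProxMox v8 (Debian bookworm); the file names match a stock install, so adjust the release name for your version:

```shell
# Comment out the enterprise (paid) repository
sed -i 's/^deb/#deb/' /etc/apt/sources.list.d/pve-enterprise.list
# Point updates at the no-subscription tier and refresh
echo "deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription" \
  > /etc/apt/sources.list.d/pve-no-subscription.list
apt update
```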

-The ProxMox virtual machine tools, actually the QEMU guest agent (vmtools equivalent), are a separate download.  They must be installed for any "thin provisioning" of VM memory.  I.e. a Windows VM running without the tools, set to 8GB of RAM, will consume 8GB of RAM on the host.  With the tools it will take some amount less.

-That same ISO (proxmox virtio-win-xxxxxx.iso) will most likely be needed for installing Windows.  Things like the hard disk controller (depending which one was chosen at VM creation) will not be seen and will require drivers to be installed.

-Replication jobs!  If they fail and one wants to delete the job, they seem to not go away through the GUI.  Go to the shell and type "pvesr list" to show the jobs, then "pvesr delete JobID --force".

-A template cannot migrate to a different host unless the name of the storage on both servers is the same.  

-A template cannot be cloned (have an image spawned from it) to a host that doesn't have the same storage name.  If the template exists on local storage it cannot be cloned to another host, and the dropdown box for which storage to use is blank. One has to clone the machine on the same server where the template lives, then migrate the cloned VM.
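That clone-then-migrate workaround can also be done from the shell with qm; a sketch with hypothetical IDs and names (template 900, new VM 250, target node pve2, storage local-lvm):

```shell
# Full clone of template 900 into new VM 250, on the node where the template lives
qm clone 900 250 --full --storage local-lvm --name win11-clone
# Then migrate the finished clone to the other node
qm migrate 250 pve2
```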

-One downfall of a cluster: if more than 50% of the hosts are offline, one cannot start a VM.  So say there is a hardware/power failure, something that brings down half of the hosts, and a VM that is off needs to be powered on.  If the cluster doesn't have quorum, the VMs won't start!

-Configuration files for the VMs live here: /etc/pve/qemu-server; if things go very wrong one can move the config file to another host with:

mv /etc/pve/nodes/<old-node>/qemu-server/<vmid>.conf /etc/pve/nodes/<new-node>/qemu-server

-virtual disks can be accessed here: /dev/<disk name>

Rename VM LVM storage name via SSH

cd /dev/pve

lvrename /dev/pve/vm-100-disk-0 /dev/pve/vm-297-disk-0

-There does not seem to be an easy way via the GUI to rename storage.

 The 1st icon is a running VM, the 2nd a VM that is being migrated, the 3rd a template.


v7.4 Cluster
v8.1 Cluster



Things I really like about ProxMox:

-ability to migrate running VMs from one host to another, even without shared storage
-ability to back up VMs
-ability to replicate VMs
-no need for a dedicated management VM/appliance

Challenges:

-I had an unexpected host failure; the VM tried to migrate to a different node.  At the time there was only local storage, not even Ceph, and the local node volumes were not named the same.  After the node came back online, it had the VM's virtual drive (as it couldn't migrate), while another server in the cluster had the config file.  Getting things back in line was a real chore; simply moving either the virtual hard drive or the config file was not working.  Sure, one could blame me for not setting up shared storage, or not having the datastores named the same.  However, why is HA not doing checks before attempting a migration?

-Another unexpected host failure; two nodes were disconnected from the cluster.  Nodes 1, 2, 3, and 6 were all up and joined together and reported nodes 4 and 5 as offline.  Nodes 4 and 5 believed they were online and that the other four nodes were offline.  Removing and re-adding nodes to a cluster is not straightforward and not doable via the GUI.

------------------------------------

-Abbreviated instructions to upgrade from v7 to v8:

    -From the shell of a given node, type:

    -pve7to8

    -apt update

    -apt dist-upgrade

    -pveversion

    -sed -i 's/bullseye/bookworm/g' /etc/apt/sources.list

    -apt update

    -apt dist-upgrade

------------------------------------
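The sed step above is the part that actually repoints the release.  A quick demonstration of what it does, run against a scratch copy rather than the real /etc/apt/sources.list:

```shell
# Make a throwaway file that looks like a bullseye sources line
scratch=$(mktemp)
echo "deb http://ftp.debian.org/debian bullseye main contrib" > "$scratch"
# Same substitution the upgrade uses: every "bullseye" becomes "bookworm"
sed -i 's/bullseye/bookworm/g' "$scratch"
cat "$scratch"   # → deb http://ftp.debian.org/debian bookworm main contrib
```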

-To remove a damaged node from a cluster:
    From the damaged node:
    -stop pve cluster services:        systemctl stop pve-cluster
    -stop corosync services:           systemctl stop corosync
    -restart in single host mode:      pmxcfs -l
    -delete corosync config:           rm /etc/pve/corosync.conf
    -delete corosync folder:           rm -r /etc/corosync/*
    -delete references to other nodes: rm -r /etc/pve/nodes/*
    ***VMs living on the damaged server will be lost; the virtual hard drives will still be there

------------------------------------

qm list (shows VMs)

qm start/shutdown/reboot/reset/stop vmID (starts / safely shuts down / reboots / hard resets / hard powers off a VM)

------------------------------------

To change a server name:
edit the following files:
nano /etc/hosts
nano /etc/hostname
nano /etc/postfix/main.cf
reboot now
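The same edit can be done non-interactively with sed instead of nano; a sketch, where "old-name" and "pve-new" are hypothetical placeholders for the actual host names:

```shell
# Swap the old host name for the new one in each file that carries it
sed -i 's/old-name/pve-new/g' /etc/hosts
sed -i 's/old-name/pve-new/g' /etc/hostname
sed -i 's/old-name/pve-new/g' /etc/postfix/main.cf
reboot
```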

------------------------------------
service pve-cluster stop
service corosync stop
service pvestatd stop
service pveproxy stop
service pvedaemon stop
and then
service pve-cluster start
service corosync start
service pvestatd start
service pveproxy start
service pvedaemon start

--------------------------
If one has a cluster and some of the nodes are offline for a long time (like if one is attempting to be energy-usage conscious, or similar), when a node comes back up, the cluster database will be out of sync and the recently powered-up node will think it has a master copy of the cluster configuration.   Be patient and wait.  The nodes will sort themselves out, and the copy of the cluster database with the highest revision number will be replicated around.

------------------------
Clusters & Quorum!
If one has a cluster with 50% or more of the server nodes down, VMs cannot be changed or powered up.  The cluster database needs half of the servers plus one up and online.  In my case I had six nodes; all of the VMs were consolidated onto three nodes and the vacated hosts were shut down to save energy.   This caused problems until a 4th host was powered back on.   This could make for very interesting disaster recovery scenarios.
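In a genuine disaster, quorum can be checked and (carefully) overridden from the shell of a surviving node.  A sketch, to be treated as an emergency measure only:

```shell
# Show current votes and quorum state
pvecm status
# Lower the expected vote count so the surviving nodes regain quorum
# (e.g. 1 means even a single node is quorate; revert once the missing
# nodes come back, as a split brain is possible while this is in effect)
pvecm expected 1
```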