SuperMicro IPMIView

IPMI (Intelligent Platform Management Interface) is a pseudo-standard for managing, accessing, and configuring servers.  Companies like SuperMicro, Quanta, and Celestica are pretty good about sticking to those standards, and even HP and Dell allow limited use of them.  One might also see/hear the term BMC (Baseboard Management Controller), which is really just the hardware that runs the IPMI interface.

When dealing with older hardware, administering machines can be a real PITA.  Things like getting the remote KVM/console to work with modern-day browsers can be a real headache, and each manufacturer, and often each model, has its own idiosyncrasies.

ipmitool is a software package one can download and install to issue commands to a remote machine.  Think of it like "PowerCLI" for the BMC.  Now enter SuperMicro's IPMIView, which takes ipmitool one step further.  It is a GUI that allows one to add many systems into a control panel, flip through those systems, and do remote access, management, and configuration.  IMHO the biggest feature is having an easy way to get at the remote KVM/console of a machine: no more making SSL and security exceptions for each machine one administers.  As the name suggests, this tool is made for SuperMicro systems; however, since it operates with industry standards, it also functions with many other manufacturers' hardware.

https://www.supermicro.com/en/solutions/management-software/ipmi-utilities

IPMIView does require Java to run, and it may require some initial security settings, but that beats making changes for every single machine.  Each system is a bit different, so some things work and some don't (e.g. hardware monitoring).
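For the command-line side, here are a few typical ipmitool invocations against a remote BMC.  The IP address and credentials are placeholders (ADMIN/ADMIN is just the common SuperMicro factory default):

ipmitool -I lanplus -H 192.168.1.50 -U ADMIN -P ADMIN chassis status    (power/fault state)
ipmitool -I lanplus -H 192.168.1.50 -U ADMIN -P ADMIN sensor list       (temps, fans, voltages)
ipmitool -I lanplus -H 192.168.1.50 -U ADMIN -P ADMIN power cycle       (hard power cycle)
ipmitool -I lanplus -H 192.168.1.50 -U ADMIN -P ADMIN sol activate      (serial-over-LAN console)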






Firmware updates of Dell PowerConnect 6224/6248 switches

Who fools around with switches that are over 15 years old?  Well, I do!  They still work, and when basic 1Gb connectivity is all that is needed, why not?  Besides, new 24-port managed switches aren't exactly free.

Often the firmware on these switches is so old and insecure that even if the web GUI is enabled, it's a struggle to get it to function properly.  Thus it is necessary to do the operation from either an SSH session or, preferably, the local console/serial port.  The newest firmware, v3.3.18, is from 2019, so it isn't great in terms of being updated, but it's better than the original 2007 code!

Here are the basic steps to update the firmware on them.

1. Obtain and unpack the firmware from Dell.  Make note of the location of the files. https://www.dell.com/support/home/en-us/product-support/product/powerconnect-6224/drivers

2. Obtain and install terminal software (e.g. PuTTY or TeraTerm).  Connect to the switch.

3. Obtain and install TFTP server software (e.g. Tftpd64), and point the server to the location of the unpacked firmware files.

4. From the SSH session, enter these commands (a TFTP reachability check worth running first is shown after this list):

-en

-copy tftp://address-of-tftp-server/firmware-file-name.stk image

-show ver (make note of which image has the newer firmware)

-boot system image1  (or image2 depending on your system)

-copy running-config startup-config

-reload

5. IF GOING FROM FIRMWARE v2.X TO v3.X: during the boot process, choose option #2 to get the alternate boot menu, then choose option #7, "update the boot code", then do a normal boot.  Failure to do this will cause the system to boot loop.
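Before the copy in step 4, it's worth confirming the switch can actually reach the TFTP server (this assumes the switch already has a management IP configured):

-ping address-of-tftp-server

If the ping fails, sort out the management VLAN/IP settings before blaming the TFTP server.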

Dell PowerEdge r520 dual vs single CPU

If one ever decides to add a 2nd CPU to a Dell PowerEdge R520, then besides the CPU, CPU heatsink, and extra system fan, one also needs a different PCIe riser card.

What?

Turns out the riser cards are different!  If one puts a 2nd CPU into a system without replacing the riser, the system works; however, the system LCD will flash amber and give an error: "HWC2005 system board riser cable interconnect failure".  The machine still functions just fine, it just has the annoying alerts.




Dell PowerEdge: unable to Web into iDRAC after upgrades

 
After upgrading Dell x40-series PowerEdge servers, the iDRAC web interface may throw an error when browsed to by its DNS name.  Fortunately one can still get at it using the IP address.

The issue is that the iDRAC web server checks the hostname in the HTTP request header against the name programmed into the iDRAC.  One can either disable this check (which is what I chose, as my servers get moved and re-purposed quite often), or set the value to match the FQDN.
1: SSH into the iDRAC (use the IP address, since the hostname check is the problem)
2: racadm set idrac.webserver.HostHeaderCheck 0
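To confirm the change took, or to re-enable the check later once the name matches the FQDN, the same racadm get/set syntax applies:

racadm get idrac.webserver.HostHeaderCheck
racadm set idrac.webserver.HostHeaderCheck 1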




VMware iSER iSCSI targets showing up as "unconsumed"

I was swapping out an ESXi host, and I couldn't get our iSCSI target to mount.  The paths and targets would show up; however, it would say the volume was "unconsumed".  Which is odd, as the datastores in question are in use by other servers and many VMs live on them.  I do have one storage server that has issues with its identifier due to a signature mismatch, and I have to forcefully mount it.  However, this time the same behavior wasn't present.

Normally from the ESXi shell I issue:

esxcfg-volume -l

This will spit out all of the volumes visible to ESXi along with some basic details.  Normally I would see the volume in question; however, this time no volumes were present.

Turns out the issue was MTU!  The switch ports, the ESXi VMkernel ports, and the storage server were all set to MTU 9000.  However, I missed the virtual switch!  It was still at 1500; I changed it to MTU 9000, rescanned the storage adapter, and all was good.

iSER & RDMA don't fragment their packets; they just drop them!
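For next time, a quick way to check and fix this from the ESXi shell (the vSwitch name and storage target IP here are placeholders):

esxcli network vswitch standard list                       (shows the MTU of each vSwitch)
esxcli network vswitch standard set -v vSwitch1 -m 9000    (raise the vSwitch MTU)
vmkping -d -s 8972 192.168.10.20                           (jumbo-frame test: 8972 bytes + headers = 9000; -d forbids fragmentation)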



Windows 2019 Software RAID

Just a quick test... This server has U.2 Samsung MZWLL1T6HEHP NVMe drives and a fresh install of Windows Server 2019.  For the first test I took the three NVMe drives and created a dynamic volume (software RAID5), as sketched below.  Then I repeated the test with a single drive.
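For reference, a minimal diskpart sketch of building such a volume, assuming the three NVMe drives show up as disks 1-3 and are blank (the R: drive letter is arbitrary):

diskpart
select disk 1
convert dynamic
select disk 2
convert dynamic
select disk 3
convert dynamic
create volume raid disk=1,2,3
format fs=ntfs quick
assign letter=R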

Three drives, software RAID5

Single drive


SuperMicro servers featuring PCIe 5.0 and 200Gbps NICs

Hot off the factory floor!  SuperMicro ASG-1115S-NE316R

AMD EPYC 9004-series CPU




E3.S form factor drives.  For the record, I dislike this form factor because I fear for the "fingers": the drive's edge connector extends beyond the body of the drive.  I can see some of these getting broken by sloppy handling and mistreatment.



SuperMicro AI GPU Server

 

SuperMicro ARS-111GL-NHR, G1SMH-G motherboard

NVIDIA GH200 Grace Hopper Superchip (A02), 72-core Grace CPU @ 3.4GHz

16,896 GPU cores

https://www.supermicro.com/en/products/motherboard/g1smh-g

https://www.supermicro.com/en/products/system/GPU/1U/ARS-111GL-NHR



A little concerned about these human-hair-thick wires just run across the front.






The server is physically very long; it took some cheating to get the server rack door to close with the 200Gb DAC cables attached.

RAID5 vs RAID10

Quick and dirty test.  The guinea pig is a Lenovo ThinkSystem SR655 with an Avago RAID 930-24i card with 4GB cache and four Kioxia 1TB 12Gbps SAS SSDs.  VMware ESXi v8 is installed on the host.  The four drives were set up in a RAID5 configuration: read-ahead enabled, drive cache enabled, cache policy set to Write Back.  The RAID virtual drive was presented to VMware, and a Microsoft Developer Windows 11 VM was imported to the datastore using thick provisioning.  Atto disk benchmark software was run with both 4GB and 8GB tests.  Then the VM, datastore, and virtual drive were torn down, rebuilt as RAID10, and retested.





I found the results somewhat surprising.  We often hear about the "RAID5 write penalty", or "RAID10 is just faster", etc.  Well, this test shows the opposite to be true: the write speed on RAID5 is actually better!  One theory is that in a four-drive RAID5, three drives' worth of data is written per stripe (plus parity), whereas in RAID10 only two drives' worth of unique data is written.
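A back-of-the-envelope check of that theory, assuming each SSD sustains roughly W MB/s of sequential writes and the controller's write-back cache hides the parity work:

RAID5, 4 drives:   3 data members per stripe   ->  ~3 x W effective write bandwidth
RAID10, 4 drives:  2 mirrored stripes          ->  ~2 x W effective write bandwidth

That's a theoretical 3:2 edge for RAID5 on large sequential writes.  The classic RAID5 write penalty mostly shows up on small random writes, where the parity read-modify-write can't be hidden by the cache.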