Late last year I got a couple of ASUS 2U servers off eBay, which are now racked in my home lab. They came with Nvidia Tesla K10 GPU compute cards, 4 in each, and I wanted to get them set up so that I could use CUDA and potentially use them as a render farm for either Blender or DaVinci Resolve.
As I had installed Ubuntu 20.04 on them, I thought it was going to be as simple as just installing CUDA, which it was, with a minor setback. The current version of CUDA is 11, in which Nvidia removed support for compute capability 3.0 devices, including the K10s I have, so I had to install an earlier version.
I’ve been running a Fibre Channel fabric at home for storage for a while now. The fabric contains a 24-cartridge tape library and an HP DL380e acting as a SAN for disk storage, and most of my full-sized servers are also connected to it.
However, as I’ve just set up a proper home office away from the rack (it’s a lot quieter, as the rack is noisy), I needed Fibre Channel there as well as Ethernet, as my main workstation is connected to both.
Extending Ethernet is easy: you can just connect another switch to the network (although I’m changing that, more on that in a future article). Initially I had the workstation set up with its own fibre from the fibre switch, but that limited me to just the workstation. If I needed more, it was either lay more fibres or get a second switch to serve the office.
So I got my hands on an IBM 2498-B24/24E 8Gb FC switch with 8 ports licensed. Except it wasn’t: it turned out to be fully licensed, so it has 24 active ports, which is a bonus.
The two are linked together with a single fibre, but at first they didn’t want to talk. It turned out that both switches had the same domain ID.
Fabric OS, which both switches are running, defaults the domain ID to 1, so I had to change that on the new switch, which is now domain 2:
Disable the switch using the switchdisable command.
Run the configure command.
When prompted for Fabric parameters, enter y.
When prompted for Domain, enter the new domain ID, 2 in my case.
Just press Enter for the remaining options.
Finally, once configure has completed, run switchenable to re-enable the switch.
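Since a duplicate domain ID is what stopped the two switches from merging, a quick check before cabling them together might have saved some head scratching. Here is a minimal sketch, assuming you keep a hand-written list of "name domain" pairs. The helper name duplicate_domains and the switch names core and office are made up for illustration; none of this is a Fabric OS command.

```shell
#!/usr/bin/env bash
# Report any domain ID that is used by more than one switch.
# Input on stdin: one "switch-name domain-id" pair per line, written by hand
# (hypothetical inventory format, not the output of any Fabric OS command).
duplicate_domains() {
    awk 'NF == 2 {
             # Build a comma-separated list of switch names per domain ID.
             names[$2] = (names[$2] == "") ? $1 : (names[$2] "," $1)
             count[$2]++
         }
         END {
             for (d in count)
                 if (count[d] > 1)
                     printf "domain %s used by %s\n", d, names[d]
         }'
}

# Both switches still on the default domain ID of 1:
printf '%s\n' 'core 1' 'office 1' | duplicate_domains
```

With the new switch moved to domain 2 the function prints nothing, which is the state you want before linking the switches.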
Once that was done I rebooted the new switch, then ran switchshow. I could see that port 0, the port linking the two switches, was now disabled, showing Disabled (Implicit Platform Service Enable operation blocked):
Index Port Address Media Speed State Proto
0 0 020000 id N8 In_Sync FC Disabled (Implicit Platform Service Enable operation blocked)
1 1 020100 id N8 Online FC F-Port 10:00:00:90:fa:ae:0d:a5
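A handy way to confirm the domain change took effect is the Address column: a Fibre Channel address is 24 bits, made up of a domain byte, an area byte and a port byte, so 020100 starts with domain 02. The sketch below splits an address into those three bytes. The helper name fc_addr_decode is made up for illustration, and reading the area byte as the switch port index is only what the two rows above suggest, not a general rule.

```shell
#!/usr/bin/env bash
# Split a 24-bit Fibre Channel address (six hex digits, as printed in the
# Address column of switchshow) into its domain, area and port bytes.
# fc_addr_decode is a hypothetical helper name, not a Fabric OS command.
fc_addr_decode() {
    local addr=$1
    # Reject anything that is not exactly six hex digits.
    [[ $addr =~ ^[0-9a-fA-F]{6}$ ]] || { echo "bad address: $addr" >&2; return 1; }
    # The 16# prefix makes bash arithmetic read the digits as hexadecimal.
    printf 'domain=%d area=%d port=%d\n' \
        "$((16#${addr:0:2}))" "$((16#${addr:2:2}))" "$((16#${addr:4:2}))"
}

fc_addr_decode 020000   # port 0 in the listing above
fc_addr_decode 020100   # port 1 in the listing above
```

Both addresses decode to domain 2, which matches the domain ID set with configure.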
It turned out this was down to the original switch having the MS Platform Management Service enabled. As the service was running, the new switch would refuse to connect as it couldn’t configure itself.
So on the new switch I ran msplatshow and it told me it was disabled:
*MS Platform Management Service is NOT enabled.
On the original switch it showed it was running & empty:
*MS Platform Management Service DB is empty.
So the fix was simple: deactivate the service on the original switch with msplmgmtdeactivate, answering yes to confirm I wanted it deactivated. Once done, running the command a second time told me it was now disabled.
MS Platform Service is currently enabled.
This will erase MS Platform Service configuration
information as well as database in the entire fabric.
Would you like to continue this operation? (yes, y, no, n): [no] yes
Request to deactivate MS Platform Service in progress……
*Completed deactivating MS Platform Service in the fabric!
MS Platform Service is already disabled!
Next I rebooted the new switch. Once it came back up I ran fabricshow on the core switch and the new switch appeared:
Switch ID Worldwide Name Enet IP Addr FC IP Addr Name
1: fffc01 10:00:00:05:33:49:da:92 192.168.2.62 0.0.0.0 >"Switch"
2: fffc02 10:00:00:05:33:4a:66:53 192.168.3.124 0.0.0.0 "VHC_Switch_D"
The Fabric has 2 switches
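If you save that output to a file, the membership can be pulled out with a little awk. This is only a sketch: fabric_members is a made-up helper name, it assumes the flattened column layout shown above, and it assumes switch names contain no spaces. The ">" in front of a name marks the principal switch of the fabric.

```shell
#!/usr/bin/env bash
# List the switches in saved fabricshow output read from stdin: domain, WWN,
# name, and whether the row carries the ">" principal-switch marker.
# fabric_members is a hypothetical helper, not a Fabric OS command.
fabric_members() {
    awk '$1 ~ /^[0-9]+:$/ && $2 ~ /^fffc[0-9a-f][0-9a-f]$/ {
        domain = $1; sub(/:$/, "", domain)      # "2:" -> "2"
        name = $NF                              # last column is the name
        principal = (name ~ /^>/) ? "yes" : "no"
        gsub(/[>"]/, "", name)                  # drop the marker and quotes
        printf "domain=%s wwn=%s name=%s principal=%s\n", domain, $3, name, principal
    }'
}

fabric_members <<'EOF'
Switch ID Worldwide Name Enet IP Addr FC IP Addr Name
1: fffc01 10:00:00:05:33:49:da:92 192.168.2.62 0.0.0.0 >"Switch"
2: fffc02 10:00:00:05:33:4a:66:53 192.168.3.124 0.0.0.0 "VHC_Switch_D"
The Fabric has 2 switches
EOF
```

The header and the closing summary line are skipped because their first column is not a number followed by a colon.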
Next, to check that it was working, as root on my workstation I triggered the adapter to rescan for new devices:
# echo 1 > /sys/class/fc_host/host10/issue_lip
then ran lsscsi to see what it found (this doesn’t need root):
[2:0:0:0] cd/dvd hp DVD A DH16AFSH DHH4 /dev/sr0
[4:0:0:0] disk ATA HGST HUS724040AL AC50 /dev/sda
[4:0:1:0] disk ATA CT500MX500SSD1 023 /dev/sdb
[11:0:0:0] disk Linux File-Stor Gadget 0409 /dev/sdc
[12:0:0:0] disk Generic STORAGE DEVICE 1404 /dev/sdd
[13:0:0:0] tape HP Ultrium 5-SCSI I6RW /dev/st0
[13:0:0:1] mediumx HP MSL G3 Series 7.10 /dev/sch0
Brilliant, it’s found the tape library. It’s the last 2 lines: one for the tape drive, and the mediumx entry is the robotics for changing tapes.
Running mtx returns the list of tapes in the library, confirming the link is now working:
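The rescan and the check can be wrapped up in a couple of small functions for the next time something gets added to the fabric. This is a rough sketch rather than anything official: rescan_fc_hosts and tape_devices are made-up names, and the optional directory argument exists only so the rescan can be tried against a scratch directory instead of the real sysfs tree.

```shell
#!/usr/bin/env bash
# Issue a LIP on every FC host adapter found under a sysfs directory.
# rescan_fc_hosts is a hypothetical helper; with no argument it uses the
# real /sys/class/fc_host tree, which needs root.
rescan_fc_hosts() {
    local root=${1:-/sys/class/fc_host} host
    for host in "$root"/host*; do
        # Skip the unexpanded glob when no adapters are present.
        [ -e "$host/issue_lip" ] || continue
        echo 1 > "$host/issue_lip" && echo "rescanned ${host##*/}"
    done
}

# Print the type, SCSI address and device node of every tape drive and
# media changer in lsscsi output read from stdin (hypothetical helper).
tape_devices() {
    awk '$2 == "tape" || $2 == "mediumx" { printf "%s %s %s\n", $2, $1, $NF }'
}

# On a live system, as root:
#   rescan_fc_hosts && lsscsi | tape_devices
# Here the filter is fed three lines copied from the listing above:
tape_devices <<'EOF'
[4:0:1:0] disk ATA CT500MX500SSD1 023 /dev/sdb
[13:0:0:0] tape HP Ultrium 5-SCSI I6RW /dev/st0
[13:0:0:1] mediumx HP MSL G3 Series 7.10 /dev/sch0
EOF
```

Filtering on the second column keeps the local disks out of the way, so an empty result straight after a rescan is a quick hint that the library is still not visible.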
$ mtx -f /dev/tape/by-id/scsi-DEC07115EW status
Storage Changer /dev/tape/by-id/scsi-DEC07115EW:1 Drives, 24 Slots ( 1 Import/Export )
Data Transfer Element 0:Empty
Storage Element 1:Full :VolumeTag=ARC010L5
Storage Element 2:Full :VolumeTag=ARC011L5
Storage Element 3:Full :VolumeTag=ARC012L5
Storage Element 4:Full :VolumeTag=ARC013L5
Storage Element 5:Full :VolumeTag=ARC014L5
Storage Element 6:Full :VolumeTag=ARC008L5
Storage Element 7:Full :VolumeTag=ARC009L5
Storage Element 8:Empty
Storage Element 9:Empty
Storage Element 10:Empty
Storage Element 11:Empty
Storage Element 12:Full :VolumeTag=ARC001L5
Storage Element 13:Full :VolumeTag=ARC002L5
Storage Element 14:Full :VolumeTag=ARC003L5
Storage Element 15:Full :VolumeTag=ARC004L5
Storage Element 16:Full :VolumeTag=ARC005L5
Storage Element 17:Full :VolumeTag=ARC006L5
Storage Element 18:Full :VolumeTag=ARC007L5
Storage Element 19:Empty
Storage Element 20:Empty
Storage Element 21:Empty
Storage Element 22:Empty
Storage Element 23:Full :VolumeTag=CLN001CU
Storage Element 24 IMPORT/EXPORT:Empty
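For a quick inventory rather than the full listing, the status output can be boiled down with awk. A minimal sketch, assuming the line format shown above; slot_summary is a made-up helper name, and the import/export slot is counted along with the ordinary storage slots.

```shell
#!/usr/bin/env bash
# Summarise mtx status output read from stdin: count full and empty storage
# slots and list the barcode labels. slot_summary is a hypothetical helper.
slot_summary() {
    awk '/^ *Storage Element/ {
        if ($0 ~ /:Full/) {
            full++
            # "VolumeTag=" is 10 characters long; keep only the label after it.
            if (match($0, /VolumeTag=[^ ]+/))
                tags = tags " " substr($0, RSTART + 10, RLENGTH - 10)
        } else if ($0 ~ /:Empty/) {
            empty++
        }
    }
    END { printf "full=%d empty=%d\ntags:%s\n", full, empty, tags }'
}

# On a live system:
#   mtx -f /dev/tape/by-id/scsi-DEC07115EW status | slot_summary
# Here it is fed a few lines copied from the listing above:
slot_summary <<'EOF'
Data Transfer Element 0:Empty
Storage Element 7:Full :VolumeTag=ARC009L5
Storage Element 8:Empty
Storage Element 23:Full :VolumeTag=CLN001CU
Storage Element 24 IMPORT/EXPORT:Empty
EOF
```

The drive line (Data Transfer Element) is deliberately ignored, so the counts cover slots only.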
There’s a lot more you can do with these switches, for example creating zones to partition the fabric, much like VLANs in Ethernet, but for my use case this is perfect. The single 8Gb connection between the two should be enough bandwidth, as most of my storage on SAS drives doesn’t run at 8Gb. If it did become an issue I could always try to increase the link capacity by bonding multiple fibres, if that’s even possible; I would have thought it would be.
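To put a rough number on that single link: 8Gb Fibre Channel signals at 8.5 GBd with 8b/10b encoding, so only 8 of every 10 bits on the wire carry data. The sketch below does that arithmetic. The helper name fc_payload_mbs is made up for illustration, and the result is a ceiling before framing overhead, not a measured figure from my fabric.

```shell
#!/usr/bin/env bash
# Upper bound on payload throughput of an FC link in MB/s, per direction,
# from its line rate in Mbaud, assuming 8b/10b encoding.
# fc_payload_mbs is a hypothetical helper name.
fc_payload_mbs() {
    local mbaud=$1
    # 8 data bits per 10 line bits, then 8 bits per byte.
    echo $(( mbaud * 8 / 10 / 8 ))
}

fc_payload_mbs 8500   # 8Gb FC line rate of 8.5 GBd, prints 850
```

That is 850 MB/s each way before framing overhead, which is the figure to compare against whatever the disks behind the SAN can actually sustain.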