sphenix-maps-l AT lists.bnl.gov
Subject: sPHENIX MAPS tracker discussion
- From: Martin Purschke <purschke AT bnl.gov>
- To: sphenix-mvtx-l <sphenix-maps-l AT lists.bnl.gov>
- Subject: [Sphenix-maps-l] ebdc11, 12...
- Date: Tue, 10 Jan 2023 13:53:51 -0500
MVTXers -
Cameron and I spoke briefly yesterday about getting more MVTX nodes up and running (first to see that all FELIXes work in their future hosts).
I'd like to hold on to the nodes in the lower rack in their current state (== OS) for a little bit longer, since I run the imminent MDC with a multiple of 6 machines for proper load-balancing (I ran 48, then 42 in the past as we were dedicating more machines to other tasks). That doesn't prevent us from installing the FELIXes to do the smoke test, and then starting to clone the OS once the MDC is done.
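To make the multiple-of-6 constraint concrete, here is a minimal sketch. The function name and the idea of rounding down to the nearest full group are assumptions for illustration only; this is not how the MDC setup actually picks its machines.

```python
def usable_nodes(available: int, group: int = 6) -> int:
    """Largest multiple of `group` that does not exceed `available`.

    Hypothetical helper: assumes the load balancing only needs the
    machine count to be a multiple of 6, as described above.
    """
    if available < 0 or group <= 0:
        raise ValueError("available must be >= 0 and group must be > 0")
    return available - (available % group)


if __name__ == "__main__":
    # 48 and 42 are the counts mentioned above; 45 is a made-up example.
    for n in (48, 42, 45):
        print(n, "->", usable_nodes(n))
```

Both 48 and 42 pass through unchanged, while an in-between count would leave a few machines unused.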
There are a few issues with that cloning.
First, ebdc11 is not yet in a final state. It needs to switch to a lustre-aware kernel so it can talk to the buffer boxes (that is, lustre-mount the file system). This kernel version difference will affect the FELIX driver. seb01, for example, has
[root@seb01 ~]# uname -a
Linux seb01 3.10.0-1160.49.1.el7_lustre.x86_64 #1 SMP Fri Jun 17 18:46:08 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
This isn't too big a deal; it just takes rebuilding the driver against this kernel version.
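As a rough illustration of the kernel check, the sketch below tests whether a kernel release string carries the `_lustre` tag seen on seb01, and whether a driver built for one release would need a rebuild for another. The function names are hypothetical, and treating any release mismatch as "rebuild needed" is a deliberately conservative assumption, not a statement about how the FELIX driver is actually packaged.

```python
import platform


def is_lustre_kernel(release: str) -> bool:
    """True if the release string (as printed by `uname -r`) has a '_lustre' tag."""
    return "_lustre" in release


def needs_driver_rebuild(built_for: str, running: str) -> bool:
    """Conservative assumption: rebuild whenever the release strings differ."""
    return built_for != running


if __name__ == "__main__":
    running = platform.release()  # same string that `uname -r` reports
    print("running kernel:", running)
    print("lustre-aware:  ", is_lustre_kernel(running))
```

For example, `is_lustre_kernel("3.10.0-1160.49.1.el7_lustre.x86_64")` is true for the seb01 kernel quoted above.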
Second, and that's a bit more serious, ebdc11 has been installed with the Red Hat system-setup defaults. It uses the device mapper (for no apparent reason with just one NVMe system disk), and that makes it basically not cloneable. I never use the device mapper for that exact reason (in addition, it defeats or at least super-complicates any attempt to get at the system disk from a rescue system to fix a problem when the machine won't boot).
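One quick way to see whether a node's root file system sits on the device mapper is to look at the mount table. The sketch below parses text in the `/proc/mounts` format; the helper names and the sample device names in the comments are assumptions for illustration, not taken from ebdc11.

```python
from typing import Optional


def root_device(mounts_text: str) -> Optional[str]:
    """Return the device mounted on '/', given text in /proc/mounts format.

    The last matching entry wins, since older kernels list a placeholder
    'rootfs' entry before the real root device.
    """
    device = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == "/":
            device = fields[0]
    return device


def uses_device_mapper(device: Optional[str]) -> bool:
    """True for device-mapper paths such as /dev/mapper/<name> or /dev/dm-<n>."""
    return device is not None and device.startswith(("/dev/mapper/", "/dev/dm-"))


if __name__ == "__main__":
    try:
        with open("/proc/mounts") as f:
            dev = root_device(f.read())
    except OSError:
        dev = None  # not on Linux, or /proc is unavailable
    print("root device:", dev)
    print("device mapper:", uses_device_mapper(dev))
```

A root on something like `/dev/mapper/rhel-root` would be flagged, while a plain partition such as `/dev/nvme0n1p2` would not.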
So I'd say that instead of cloning ebdc11, we make a new machine the right way, with lustre support and all, see that it works just like ebdc11 + lustre etc., and then start cloning from that master. Once we are happy, we re-clone ebdc11, which we can keep as a reference until last.
BTW, we should also revisit the host naming / numbering scheme since I just pulled the number 11 out of a hat back then. We don't have to call them ebdc if a different name makes more sense and avoids confusion.
Best,
Martin
--
Martin L. Purschke, Ph.D. ; purschke AT bnl.gov
; http://www.phenix.bnl.gov/~purschke
;
Brookhaven National Laboratory ; phone: +1-631-344-5244
Physics Department Bldg 510 C ; fax: +1-631-344-3253
Upton, NY 11973-5000 ; skype: mpurschke