sphenix-run-l - Re: [Sphenix-run-l] prdf files in sdcc

  • From: Martin Purschke <purschke AT bnl.gov>
  • To: sphenix-run-l AT lists.bnl.gov
  • Subject: Re: [Sphenix-run-l] prdf files in sdcc
  • Date: Thu, 25 May 2023 09:57:19 -0400

Ok, for reference I'll send this to the list -

there is a top-level script "submit_file_farm.sh" that takes a file as an argument and *submits* a transfer job.

(There are other submit scripts that submit for HPSS transfer or for both; please refrain from using those at this point.)

The worker script that is eventually run and does the heavy lifting is called "transfer_farm.sh". It makes a deliberately simple check and quits if the destination file already exists AND has the same size. It does not perform a more extensive (and expensive) check for identity.
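In shell terms, that check amounts to roughly the following sketch - not the actual transfer_farm.sh; the destination mapping and variable names are made up here, and it assumes GNU stat:

  # sketch only - hypothetical destination mapping, not the real transfer_farm.sh
  SRC="$1"
  DEST="/sdcc/sphnxpro/commissioning/${SRC#/bbox/commissioning/}"
  if [ -f "$DEST" ] && [ "$(stat -c %s "$SRC")" -eq "$(stat -c %s "$DEST")" ]; then
      exit 0    # destination exists with the same size, nothing to do
  fi
  mkdir -p "$(dirname "$DEST")"
  cp "$SRC" "$DEST"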

So I ran

find /bbox/commissioning/tpc/pedestal/ -type f -print -exec sh submit_file_farm.sh {} \;

and, once the submission finishes, this will end up with 54824 jobs, of which 40 run concurrently.

The submission itself will take a while; I currently see

2161 jobs; 0 completed, 0 removed, 2121 idle, 40 running, 0 held, 0 suspended

Because everything happens from just one bufferbox here, we get about 3.2 GB/s of throughput. Adequate for the first full volley in 6 months, I'd say.

Best,
Martin



On 5/25/23 08:47, Martin Purschke wrote:
Chris,

nah, that script is for when the files live (as they will once we run for real) in the /bbox/bbox{0,1,2,3,4,5} directories that are pinned to a given server. The commissioning data are NOT organized that way, so it doesn't make sense to use it here.

The server-pinned directories avoid generating internal network traffic - they ensure that when bbox0 transfers data out of /bbox/bbox0/...., those files physically reside on disks hosted by bbox0. Otherwise you would end up dragging data from the other bboxes to bbox0 through our local network and only then on to the SDCC, which doesn't make sense.
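To make the locality point concrete, with the pinned layout the submit would be run per server, roughly like this (a sketch only; the loop and per-box paths are illustrative, and the script name is just reused from above):

  # illustrative only: each bboxN submits transfers for the files it hosts itself
  for n in 0 1 2 3 4 5; do
      ssh bbox${n}.sphenix.bnl.gov \
          "cd mlp_transfer/tpc; find /bbox/bbox${n} -type f -exec sh submit_file_farm.sh {} \;"
  done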

For the commissioning we deliberately didn't do that, to avoid the management overhead (and because of the relatively small dataset).

BTW, even that script is a quick-and-dirty hack from the PD-4 review era, when I needed to demonstrate this - we cannot use ssh, but that change is for a quiet(er) day.

Let me give it a quick try.

    Martin


On 5/25/23 08:34, pinkenburg wrote:
Hi Martin,

that was the original plan, but the striping that is done for the volume behind that directory adds load to lustre (partial files get scattered among the servers), and we thought we'd want to avoid that for now. That's why I copied them into a different area. For the commissioning it's fine; the striping will help when we have hundreds of reconstruction processes reading events from the same file (gl1 comes to mind).
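(For reference, the striping of a lustre directory can be inspected or changed with the standard lfs tool - a generic example, with an illustrative stripe count and a made-up target path:)

  lfs getstripe /sphenix/lustre01/sphnxpro/commissioning   # show the current stripe layout
  lfs setstripe -c 4 /some/new/dir                         # hypothetical: stripe new files over 4 OSTs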

I am trying to revive the condor transfer for the tpc under /home/sphnxpro/mlp_transfer/tpc. Executing the script (which uses cp) with

./transfer_only_farm.sh /bbox/commissioning/tpc/pedestal/TPC_ebdc23_pedestal-00010635-0000.prdf

worked just fine, but condor needs some TLC; none of those single-file jobs started:

  1078  ssh bbox0.sphenix.bnl.gov 'cd mlp_transfer/tpc ; sh submit_file_farm.sh /bbox/commissioning/tpc/pedestal/TPC_ebdc23_pedestal-00010635-0001.prdf'
  1080  ssh bbox0.sphenix.bnl.gov 'cd mlp_transfer/tpc ; sh submit_file_farm.sh /bbox/commissioning/tpc/pedestal/TPC_ebdc23_pedestal-00010635-0001.prdf'
  1087  ssh bbox1.sphenix.bnl.gov 'cd mlp_transfer/tpc ; sh submit_file_farm.sh /bbox/commissioning/tpc/pedestal/TPC_ebdc23_pedestal-00010635-0002.prdf'
  1088  ssh bbox2.sphenix.bnl.gov 'cd mlp_transfer/tpc ; sh submit_file_farm.sh /bbox/commissioning/tpc/pedestal/TPC_ebdc23_pedestal-00010635-0003.prdf'
  1089  ssh bbox4.sphenix.bnl.gov 'cd mlp_transfer/tpc ; sh submit_file_farm.sh /bbox/commissioning/tpc/pedestal/TPC_ebdc23_pedestal-00010635-0004.prdf'
  1090  ssh bbox5.sphenix.bnl.gov 'cd mlp_transfer/tpc ; sh submit_file_farm.sh /bbox/commissioning/tpc/pedestal/TPC_ebdc23_pedestal-00010635-0005.prdf'
  1091  ssh bbox3.sphenix.bnl.gov 'cd mlp_transfer/tpc ; sh submit_file_farm.sh /bbox/commissioning/tpc/pedestal/TPC_ebdc23_pedestal-00010635-0006.prdf'

I'll be out in 1008 in about an hour or so

Chris

On 5/25/2023 8:26 AM, Martin Purschke wrote:
Chris,

I lost track of what the different areas in the SDCC lustre denote - weren't we supposed to write to /sdcc/sphnxpro/rawdata/commissioning?

For the TPC data, I would suggest we use (and if needed, test and refine) the condor-based transfer. (Let me try after we clarify the destination).

I'm sure people have lost track of how the procedure works. The idea is, exactly as in the PHENIX days, to submit a dedicated condor job for each file to be transferred. The sole role of condor is to regulate (and make it easy to control) how many files per bbox are sent concurrently; 40 is the current limit. So if you have, say, 1000 files, you accordingly submit 1000 condor jobs, and 40 of them run at a time until there are no more. Kick off the transfer and forget. (Also, this uses cp instead of rsync.)
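For anyone who wants to picture it, a per-file submit boils down to writing a one-job submit description and handing it to condor_submit - a sketch with made-up file names, not the real submit_file_farm.sh; how the 40-job ceiling per bbox is enforced (slot count, concurrency limits, etc.) is not shown here:

  # sketch (hypothetical names): one condor job per input file
  FILE="$1"
  JOB="transfer_$$.job"
  {
      echo "universe    = vanilla"
      echo "executable  = transfer_farm.sh"
      echo "arguments   = $FILE"
      echo "log         = transfer.log"
      echo "queue"
  } > "$JOB"
  condor_submit "$JOB"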

Please let me know.

 - Martin


On 5/25/23 08:11, pinkenburg wrote:
Hi folks,

sdcc made some changes yesterday, so I tried again, and this limited rsync went through:

rsync -av --exclude '*/junk/*' --exclude 'tpc/*' --exclude 'mlp_test/*' --exclude '*.txt' --exclude '*.json' --exclude '*/cosmics/*' /bbox/commissioning/* /sdcc/sphnxpro/commissioning

It was just about 4TB; we have mostly junk and a lot of cosmics for the hcals, and the mvtx dumps a lot of tiny non-prdf files into lustre. The transferred files can be found in sdcc under

/sphenix/lustre01/sphnxpro/commissioning/

with the same subdirectory structure. This is the non-striped lustre volume, but the performance should be fine for our current needs. I am working on the transfer of the tpc data, which are too large for a simple single rsync and are therefore excluded above.
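A quick way to confirm that nothing was missed is to repeat the same rsync as a dry run; if it lists no files, the copy is complete (same command as above with -n added):

rsync -avn --exclude '*/junk/*' --exclude 'tpc/*' --exclude 'mlp_test/*' --exclude '*.txt' --exclude '*.json' --exclude '*/cosmics/*' /bbox/commissioning/* /sdcc/sphnxpro/commissioning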

Chris








--
Martin L. Purschke, Ph.D.       ; purschke AT bnl.gov
                                ; http://www.phenix.bnl.gov/~purschke
                                ;
Brookhaven National Laboratory  ; phone: +1-631-344-5244
Physics Department Bldg 510 C   ; fax:   +1-631-344-3253
Upton, NY 11973-5000            ; skype: mpurschke
-----------------------------------------------------------------------



