
sphenix-production-l - [[Sphenix-production-l] ] Some failed runs.

sphenix-production-l AT lists.bnl.gov

Subject: Sphenix-production-l mailing list

  • From: Alex Lebedev <lebedev AT iastate.edu>
  • To: sphenix-production-l AT lists.bnl.gov
  • Subject: [[Sphenix-production-l] ] Some failed runs.
  • Date: Tue, 15 Oct 2024 09:59:24 -0500

There have not been many failed jobs recently.

TRKR_CLUSTER jobs failed for run 54167.
You can find log files in:
/sphenix/data/data02/sphnxpro/trackinglogs/new_2024p007/run_00054100_00054200/
Here is one example log file: /sphenix/data/data02/sphnxpro/trackinglogs/new_2024p007/run_00054100_00054200/DST_TRKR_CLUSTER_run2auau_new_2024p007-00054167-00090.out
and condor log: /tmp/trkrlogs/new_2024p007/run_00054100_00054200/DST_TRKR_CLUSTER_run2auau_new_2024p007-00054167-00090.condor

From the logs it looks like the jobs ran out of memory.

There are also a few TRKR_SEED jobs which failed:
71428.662 10/15 0+00:26:15 DST_TRKR_SEED_run2auau_new_2024p007 54165 81
71428.682 10/15 0+03:26:31 DST_TRKR_SEED_run2auau_new_2024p007 54166 79
71428.683 10/15 0+00:36:15 DST_TRKR_SEED_run2auau_new_2024p007 54166 80
71428.704 10/15 0+01:56:23 DST_TRKR_SEED_run2auau_new_2024p007 54169 60
71298.1418 10/14 0+00:06:21 DST_TRKR_SEED_run2auau_new_2024p007 54166 53
71129.1148 10/14 0+02:01:23 DST_TRKR_SEED_run2auau_new_2024p007 54378 0
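
In case it helps with triage, here is a minimal Python sketch that groups the failed TRKR_SEED jobs listed above by run. The column meanings (condor job ID, date, run time as days+HH:MM:SS, job name, run number, segment) are my assumption from the layout of the list, and the function names are made up for illustration; this is not part of the production tooling.

```python
import re
from collections import defaultdict

# The failed-job lines from the list above, copied verbatim.
FAILED_JOBS = """\
71428.662 10/15 0+00:26:15 DST_TRKR_SEED_run2auau_new_2024p007 54165 81
71428.682 10/15 0+03:26:31 DST_TRKR_SEED_run2auau_new_2024p007 54166 79
71428.683 10/15 0+00:36:15 DST_TRKR_SEED_run2auau_new_2024p007 54166 80
71428.704 10/15 0+01:56:23 DST_TRKR_SEED_run2auau_new_2024p007 54169 60
71298.1418 10/14 0+00:06:21 DST_TRKR_SEED_run2auau_new_2024p007 54166 53
71129.1148 10/14 0+02:01:23 DST_TRKR_SEED_run2auau_new_2024p007 54378 0
"""

# Assumed column layout (not stated in the list itself):
# job ID, date, run time (days+HH:MM:SS), job name, run number, segment.
LINE_RE = re.compile(
    r"^(?P<job_id>\d+\.\d+)\s+(?P<date>\d+/\d+)\s+"
    r"(?P<days>\d+)\+(?P<hours>\d+):(?P<minutes>\d+):(?P<seconds>\d+)\s+"
    r"(?P<name>\S+)\s+(?P<run>\d+)\s+(?P<segment>\d+)$"
)


def parse_failed_jobs(text):
    """Parse each well-formed line into a dict; skip lines that do not match."""
    jobs = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        runtime_seconds = (
            int(match["days"]) * 86400
            + int(match["hours"]) * 3600
            + int(match["minutes"]) * 60
            + int(match["seconds"])
        )
        jobs.append(
            {
                "job_id": match["job_id"],
                "date": match["date"],
                "runtime_seconds": runtime_seconds,
                "name": match["name"],
                "run": int(match["run"]),
                "segment": int(match["segment"]),
            }
        )
    return jobs


def segments_by_run(jobs):
    """Map each run number to the sorted list of its failed segments."""
    grouped = defaultdict(list)
    for job in jobs:
        grouped[job["run"]].append(job["segment"])
    return {run: sorted(segments) for run, segments in sorted(grouped.items())}


if __name__ == "__main__":
    for run, segments in segments_by_run(parse_failed_jobs(FAILED_JOBS)).items():
        print(run, segments)
```

Grouping by run makes it easy to see that run 54166 accounts for half of the failed segments in this list.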

Please have a look.

We also have a problem with jet QA jobs crashing due to a memory leak,
but Derek A. is already looking into it.

Sasha.



