sphenix-production-l AT lists.bnl.gov
Subject: Sphenix-production-l mailing list
- From: Alex Lebedev <lebedev AT iastate.edu>
- To: sphenix-production-l AT lists.bnl.gov
- Subject: [Sphenix-production-l] Some failed runs.
- Date: Tue, 15 Oct 2024 09:59:24 -0500
There have not been many failed jobs recently.
TRKR_CLUSTER jobs failed for run 54167.
You can find log files in:
/sphenix/data/data02/sphnxpro/trackinglogs/new_2024p007/run_00054100_00054200/
Here is one example log file: /sphenix/data/data02/sphnxpro/trackinglogs/new_2024p007/run_00054100_00054200/DST_TRKR_CLUSTER_run2auau_new_2024p007-00054167-00090.out
and the corresponding condor log: /tmp/trkrlogs/new_2024p007/run_00054100_00054200/DST_TRKR_CLUSTER_run2auau_new_2024p007-00054167-00090.condor
From the logs it looks like the jobs ran out of memory.
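For anyone who wants to look at the logs of the other segments, the path can probably be built from the run and segment numbers. The sketch below only generalizes the single example path above: the 100-run directory binning, the 8-digit run and 5-digit segment padding, and the "run2auau" tag are read off that one example and are assumptions, and log_path is a hypothetical helper, not part of the production scripts.

```python
def log_path(base, build, job_type, run, segment, ext="out"):
    """Build a log file path following the pattern of the example above.

    Assumption: logs are grouped into directories covering 100 runs each,
    named run_<low>_<low+100>, with zero-padded 8-digit run numbers and
    5-digit segment numbers, as in the one example path.
    """
    low = (run // 100) * 100  # lower edge of the 100-run directory bin
    run_dir = f"run_{low:08d}_{low + 100:08d}"
    name = f"DST_{job_type}_run2auau_{build}-{run:08d}-{segment:05d}.{ext}"
    return f"{base}/{build}/{run_dir}/{name}"


# Reproduces the example .out path for run 54167, segment 90.
example = log_path(
    "/sphenix/data/data02/sphnxpro/trackinglogs",
    "new_2024p007",
    "TRKR_CLUSTER",
    54167,
    90,
)
print(example)
```

Passing ext="condor" and the /tmp/trkrlogs base gives the condor log name in the same way.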
There are also a few TRKR_SEED jobs which failed:
71428.662 10/15 0+00:26:15 DST_TRKR_SEED_run2auau_new_2024p007 54165 81
71428.682 10/15 0+03:26:31 DST_TRKR_SEED_run2auau_new_2024p007 54166 79
71428.683 10/15 0+00:36:15 DST_TRKR_SEED_run2auau_new_2024p007 54166 80
71428.704 10/15 0+01:56:23 DST_TRKR_SEED_run2auau_new_2024p007 54169 60
71298.1418 10/14 0+00:06:21 DST_TRKR_SEED_run2auau_new_2024p007 54166 53
71129.1148 10/14 0+02:01:23 DST_TRKR_SEED_run2auau_new_2024p007 54378 0
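To see which runs and segments are affected, the listing above can be grouped by run number. This is only a rough sketch: it assumes each line has six whitespace-separated columns (condor job id, date, run time, job name, run number, segment), which is my reading of the listing and not a documented format. failed_segments_by_run is a hypothetical helper.

```python
from collections import defaultdict

# The failed TRKR_SEED jobs listed above, pasted as-is.
LISTING = """\
71428.662 10/15 0+00:26:15 DST_TRKR_SEED_run2auau_new_2024p007 54165 81
71428.682 10/15 0+03:26:31 DST_TRKR_SEED_run2auau_new_2024p007 54166 79
71428.683 10/15 0+00:36:15 DST_TRKR_SEED_run2auau_new_2024p007 54166 80
71428.704 10/15 0+01:56:23 DST_TRKR_SEED_run2auau_new_2024p007 54169 60
71298.1418 10/14 0+00:06:21 DST_TRKR_SEED_run2auau_new_2024p007 54166 53
71129.1148 10/14 0+02:01:23 DST_TRKR_SEED_run2auau_new_2024p007 54378 0
"""


def failed_segments_by_run(listing):
    """Group failed segments by run number.

    Assumption: the last two columns are run number and segment.
    Using a set per run means repeated lines are counted only once.
    """
    by_run = defaultdict(set)
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or unexpected lines
        _job_id, _date, _runtime, _job_name, run, segment = fields
        by_run[int(run)].add(int(segment))
    return {run: sorted(segs) for run, segs in sorted(by_run.items())}


summary = failed_segments_by_run(LISTING)
for run, segments in summary.items():
    print(run, segments)
```

Most of the failures are in run 54166, so that run may be a good place to start looking.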
Please have a look.
We also have a problem with jet QA jobs crashing due to a memory leak,
but Derek A. is already looking into it.
Sasha.