sphenix-tracking-l AT lists.bnl.gov
Subject: sPHENIX tracking discussion
List archive
Re: [Sphenix-tracking-l] [EXTERNAL] We have a timing problem...
- From: Christof Roland <christof.roland AT cern.ch>
- To: Christof Roland <christof.roland AT cern.ch>
- Cc: sphenix-tracking-l AT lists.bnl.gov
- Subject: Re: [Sphenix-tracking-l] [EXTERNAL] We have a timing problem...
- Date: Thu, 4 Nov 2021 12:18:57 +0100
Ok, we have the culprit.
In PHActsSiliconSeeding most of the time is spent in trying to match the mvtx stubs to the INTT hits
in the function findInttMatches. See timer output:
Circle fit time 10.8842 and svtx time 10.8937
taubintime: 0.0002 roottime: 0.00119 linetime: 0.00127 intttime: 10.8613
Of 10.893 sec fit time for one mvtx seed we spend 10.861 sec in the Intt matching.
In the matching each mvtx stub is compared to all hits in the INTT including numerous calculations
being made before the decision is made if the hits fit to the stub or not.
I think this routing should be sped up by rejecting the hits or even entire hit sets as early as possible.
Cheers
Christof
On 3. Nov 2021, at 17:32, Christof Roland via sPHENIX-tracking-l <sphenix-tracking-l AT lists.bnl.gov> wrote:Joe Osborn, Ph.D.Hi Chris, Joe,i tried to run an event that run suspiciously long through callgrind,but the graph I get from the output.dot file is corrupted. You can display it as isand as soon as you zoom in to make sense of the connections the figure erases itself.Highly annoying.A few timers and print statements give a little more insight on the PHActsSiliconSeeding at least:Finished PHActsSiliconSeeding process_eventPHActsSiliconSeeding Acts seed time 15.8078PHActsSiliconSeeding circle fit time 1186.91PHActsSiliconSeeding total event time 1202.72the circle fit runs ~110 fits in ~10ms per fit approximately. 10ms for a circle fit is huge.A full ACTS fit is supposed to be done in a fraction of a millisecond.So smething seems to get stuck in the clrcle fitting.Also the distribution of processing time is not what it used to be. It has a low most probable valueand massive outliers, see figure below.PHActsSiliconSeeding Time per event in seconds:<SiSeedingTime.jpg>JoeOn 3. Nov 2021, at 15:18, pinkenburg via sPHENIX-tracking-l <sphenix-tracking-l AT lists.bnl.gov> wrote:Hi,There is some improvement that can be made to the local coordinates storage code by e.g. storing the clusters as a map of <surfaceID, cluster> so that we don’t have to do the surface lookup each time, but I don’t think this would cause such a significant slow down.
in the meantime I got at least some of the MB sims ready for tracking, the logs with the timer printouts at the end are
/sphenix/sim/sim01/sphnxpro/MDC2/sHijing_HepMC/fm_0_20/Tracks/log/condor-0000000002-*.out
It uses the mdc2.2 build which is fairly recent.
By far the biggest contributor is PHActsSiliconSeeding which takes many minutes. You can filter the time per event via this:
cat /sphenix/sim/sim01/sphnxpro/MDC2/sHijing_HepMC/fm_0_20/Tracks/log/condor-0000000002-*.out | grep 'PHActsSiliconSeeding_TOP: per event time'
Chris
On 11/2/2021 10:38 AM, Christof Roland via sPHENIX-tracking-l wrote:
Hi,
the effect is big enough so we should be able to figure it out with the profileror some timers in the code. let me have a look.
cheers
ChristofBefore I went on family leave, I was wondering if the local coordinates PR was the cause of this also. However, I had run jobs with a previous ana build (before this PR was merged) and found the same results as after the PR was merged.
On 2. Nov 2021, at 15:33, Osborn, Joe <osbornjd AT ornl.gov> wrote:
Hi Christof,---------------------------
Date: Tuesday, November 2, 2021 at 9:42 AM
To: Anthony Frawley via sPHENIX-tracking-l <sphenix-tracking-l AT lists.bnl.gov>
Subject: [EXTERNAL] [Sphenix-tracking-l] We have a timing problem...
Hi Everybody,
following up on the discussion on timeing performance of our current code
I ran a few benchmark jobs. 1000 jobs one event each hijing 0-20 + 50kHz of pileup.
These are jobs submitted from sphnx02 so I assume even if there are a few
slow machines this will not dominate the results.
Time per event is here:
InttClusterizer 1000 0.014891 sec
MvtxClusterizer 1000 0.088210 sec
TpcClusterizer 1000 2.677374 sec
TpcClusterCleaner 1000 0.045161 sec
PHActsSiliconSeeding 1000 509.086792 sec
PHActsVertexPropagator 1000 0.044235 sec
PHCASeeding 1000 3.974943 sec
PHSimpleKFProp 1000 3.371190 sec
PrePropagatorPHTpcTrackSeedCircleFit 1000 0.108033 sec
PHTpcTrackSeedCircleFit 1000 0.107930 sec
PHTrackCleaner 1000 0.007340 sec
PHGhostRejection 1000 0.211178 sec
PHSiliconTpcTrackMatching 1000 1.440864 sec
PHActsFirstTrkFitter 1000 2.409158 sec
PHSimpleVertexFinder 1000 0.044909 sec
PHRaveVertexing 1000 0.409105 sec
PHGenFitTrackProjection 1000 0.000386 sec
SvtxEvaluator 1000 418.793579 sec
It looks like all modules touching the actual hits are horribly slow now.
This looks like in our recent changes to the local coordinate storage we introduced
an inefficient loop to look up things.
I'll try to run a job through callgrind now to see if I can trace this down some more.
cheers
Christof
_______________________________________________
sPHENIX-tracking-l mailing list
sPHENIX-tracking-l AT lists.bnl.gov
https://lists.bnl.gov/mailman/listinfo/sphenix-tracking-l
following up on the discussion on timeing performance of our current code
I ran a few benchmark jobs. 1000 jobs one event each hijing 0-20 + 50kHz of pileup.
These are jobs submitted from sphnx02 so I assume even if there are a few
slow machines this will not dominate the results.
Time per event is here:
InttClusterizer 1000 0.014891 sec
MvtxClusterizer 1000 0.088210 sec
TpcClusterizer 1000 2.677374 sec
TpcClusterCleaner 1000 0.045161 sec
PHActsSiliconSeeding 1000 509.086792 sec
PHActsVertexPropagator 1000 0.044235 sec
PHCASeeding 1000 3.974943 sec
PHSimpleKFProp 1000 3.371190 sec
PrePropagatorPHTpcTrackSeedCircleFit 1000 0.108033 sec
PHTpcTrackSeedCircleFit 1000 0.107930 sec
PHTrackCleaner 1000 0.007340 sec
PHGhostRejection 1000 0.211178 sec
PHSiliconTpcTrackMatching 1000 1.440864 sec
PHActsFirstTrkFitter 1000 2.409158 sec
PHSimpleVertexFinder 1000 0.044909 sec
PHRaveVertexing 1000 0.409105 sec
PHGenFitTrackProjection 1000 0.000386 sec
SvtxEvaluator 1000 418.793579 sec
It looks like all modules touching the actual hits are horribly slow now.
This looks like in our recent changes to the local coordinate storage we introduced
an inefficient loop to look up things.
I'll try to run a job through callgrind now to see if I can trace this down some more.
cheers
Christof
_______________________________________________
sPHENIX-tracking-l mailing list
sPHENIX-tracking-l AT lists.bnl.gov
https://lists.bnl.gov/mailman/listinfo/sphenix-tracking-l
_______________________________________________ sPHENIX-tracking-l mailing list sPHENIX-tracking-l AT lists.bnl.gov https://lists.bnl.gov/mailman/listinfo/sphenix-tracking-l
-- ************************************************************* Christopher H. Pinkenburg ; pinkenburg AT bnl.gov ; http://www.phenix.bnl.gov/~pinkenbu Brookhaven National Laboratory ; phone: (631) 344-5692 Physics Department Bldg 510 C ; fax: (631) 344-3253 Upton, NY 11973-5000 *************************************************************_______________________________________________
sPHENIX-tracking-l mailing list
sPHENIX-tracking-l AT lists.bnl.gov
https://lists.bnl.gov/mailman/listinfo/sphenix-tracking-l
_______________________________________________
sPHENIX-tracking-l mailing list
sPHENIX-tracking-l AT lists.bnl.gov
https://lists.bnl.gov/mailman/listinfo/sphenix-tracking-l
-
[Sphenix-tracking-l] We have a timing problem...,
Christof Roland, 11/02/2021
-
Re: [Sphenix-tracking-l] [EXTERNAL] We have a timing problem...,
Osborn, Joe, 11/02/2021
-
Re: [Sphenix-tracking-l] [EXTERNAL] We have a timing problem...,
Christof Roland, 11/02/2021
-
Re: [Sphenix-tracking-l] [EXTERNAL] We have a timing problem...,
pinkenburg, 11/03/2021
-
Re: [Sphenix-tracking-l] [EXTERNAL] We have a timing problem...,
Christof Roland, 11/03/2021
-
Re: [Sphenix-tracking-l] [EXTERNAL] We have a timing problem...,
Christof Roland, 11/04/2021
- Re: [Sphenix-tracking-l] [EXTERNAL] We have a timing problem..., Osborn, Joe, 11/04/2021
-
Re: [Sphenix-tracking-l] [EXTERNAL] We have a timing problem...,
Christof Roland, 11/04/2021
-
Re: [Sphenix-tracking-l] [EXTERNAL] We have a timing problem...,
Christof Roland, 11/03/2021
-
Re: [Sphenix-tracking-l] [EXTERNAL] We have a timing problem...,
pinkenburg, 11/03/2021
-
Re: [Sphenix-tracking-l] [EXTERNAL] We have a timing problem...,
Christof Roland, 11/02/2021
-
Re: [Sphenix-tracking-l] [EXTERNAL] We have a timing problem...,
Osborn, Joe, 11/02/2021
Archive powered by MHonArc 2.6.24.