
sphenix-software-l - Re: [Sphenix-software-l] mdc files

sphenix-software-l AT lists.bnl.gov

Subject: sPHENIX discussion of software

  • From: Dmitri Smirnov <dmixsmi AT gmail.com>
  • To: sphenix-software-l AT lists.bnl.gov
  • Subject: Re: [Sphenix-software-l] mdc files
  • Date: Tue, 8 Dec 2020 11:39:20 -0500

Hi Chris,

In STAR we use TClonesArray with simple (mostly flat) data structures saved in TTrees. I would expect TClonesArray to be better optimized for ROOT I/O than std::map. Actually, your study (https://github.com/pinkenburg/rootmemory) seems to be consistent with that assumption. Also, for persistent data we use limited-precision ROOT data types such as Double32_t. These do not help to reduce the RSS but can shrink the output files.
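
To illustrate the Double32_t point: as far as I understand ROOT, a Double32_t is a double in memory and is, by default, written to file as a 32-bit float. The Python sketch below only mimics that trade-off with the standard struct module; it does not use ROOT, and the path_length value is made up.

```python
import struct


def roundtrip_float32(value: float) -> float:
    """Pack a 64-bit Python float into 32 bits and read it back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


# Hypothetical hit property value (not taken from any MDC1 file).
path_length = 1.23456789012345

size_double = struct.calcsize("<d")  # bytes per value held in memory
size_float = struct.calcsize("<f")   # bytes per value in the reduced on-disk form

stored = roundtrip_float32(path_length)
relative_error = abs(stored - path_length) / path_length

print(f"in memory: {size_double} bytes, on disk: {size_float} bytes")
print(f"relative error after the round trip: {relative_error:.1e}")
```

The value read back is still held as a full double, which is why the RSS is unchanged; only the stored size per value is halved, at the cost of single precision.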


On 12/4/20 8:20 PM, pinkenburg wrote:

Hi Dmitri,

these properties are kept in a map and are only created if they are actually filled in the stepping action (e.g. path_length for a hit in a calorimeter shower is pointless). If you access a property that does not exist, the default is returned. In coresoftware/simulation/g4simulation/g4main/PHG4Hitv1.cc the PHG4Hitv1::print() const method has a loop which extracts only the existing variables.
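
A minimal Python sketch of that behaviour, for readers who have not looked at the class: properties live in a map, reading a missing one returns a default, and a print-style loop visits only what was filled. The class name SparseHit, its method names, and the default values are hypothetical and are not the PHG4Hitv1 interface.

```python
import math

# Hypothetical defaults; the real PHG4Hit defaults may differ.
DEFAULTS = {
    "path_length": math.nan,
    "layer": -1,
    "row": -1,
    "hit_type": -1,
}


class SparseHit:
    """Hit that stores only the properties that were actually set."""

    def __init__(self):
        self._props = {}

    def set_property(self, name, value):
        if name not in DEFAULTS:
            raise KeyError(f"unknown property: {name}")
        self._props[name] = value

    def has_property(self, name):
        return name in self._props

    def get_property(self, name):
        # A missing property is not an error: the default is returned.
        return self._props.get(name, DEFAULTS[name])

    def filled(self):
        """Return a copy of the properties that exist, as a print loop would visit them."""
        return dict(self._props)


hit = SparseHit()
hit.set_property("layer", 3)

for name, value in hit.filled().items():
    print(name, value)
```

Here get_property("path_length") returns the NaN default even though nothing was stored, so a reader of the file cannot tell "not filled" from "filled with the default" without a check like has_property.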

These property maps, while convenient to work with, are sadly likely a large contributor to ROOT's memory consumption. If anyone has an idea how to stream this in a more ROOT-friendly way, I'd love to hear more about it.

I think this came up recently: Jan Bernauer worked out an example replacing ROOT I/O with FlatBuffers:
https://google.github.io/flatbuffers/
If we can pull this off, I would hope that it will reduce our memory problems significantly.

Chris


On 12/4/2020 6:59 PM, Dmitri Smirnov wrote:
Not sure if it is a known issue, but I noticed that some properties of PHG4Hit are not filled in files from /sphenix/data/data02/sphnxpro/MDC1/sHijing_HepMC/G4Hits/data
For example, path_length, layer, row, and hit_type are filled with default values.


On 11/30/20 6:31 PM, pinkenburg wrote:
Hi folks,

The Geant4 processing has been ongoing and we have close to 500,000 events from the initial Geant4 pass. The reconstruction chain needs some work but is basically functional up to jets. I split it into six separate steps (https://github.com/sPHENIX-Collaboration/MDC1/tree/main/submit):

pass1:
Geant4 simulation of sHijing, output (G4Hits of all active detectors) is written to
/sphenix/data/data02/sphnxpro/MDC1/sHijing_HepMC/G4Hits/data
Each file contains 100 events

pass2:
50kHz pileup generation, output is split into 4 files
DST_BBC_G4HIT_sHijing_0_12fm contains bbc and epd G4Hits
DST_CALO_G4HIT_sHijing_0_12fm contains all calorimeter G4Hits
DST_TRKR_G4HIT_sHijing_0_12fm contains tracking detector G4Hits
DST_TRUTH_G4HIT_sHijing_0_12fm contains the truth info (and HepMC records and black hole hits)
the output is written to:
/sphenix/data/data02/sphnxpro/MDC1/sHijing_HepMC/PileUp/data
Each file contains about 50 events.


pass3calo:
processing of calorimeter data from the DST_CALO_G4HIT files; towers (all flavors) and clusters are saved. I tried the topo clusters, but the job ran out of memory (I killed it as it reached 40GB).
the output is written to
/sphenix/data/data02/sphnxpro/MDC1/sHijing_HepMC/CaloCluster/data

pass3trk:
tracking pre-pass: TPC electron drift and clustering, silicon G4Hit clustering and silicon seeds from truth info, using the DST_TRKR_G4HIT and DST_TRUTH_G4HIT files. Saves clusters for the tracking.
the output is written to
/sphenix/data/data02/sphnxpro/MDC1/sHijing_HepMC/TrkrCluster/data

pass4trk:
tracking and vertex reconstruction with ACTS (also rave vertex)
the output goes to
/sphenix/data/data02/sphnxpro/MDC1/sHijing_HepMC/Tracks/data

pass5jet:
jet reconstruction based on tracks (clusters need a vertex)
the output goes to
/sphenix/data/data02/sphnxpro/MDC1/sHijing_HepMC/Jets/data


All files contain the data needed for synchronization, so one can mix and match whatever is needed. There is a decent number of calorimeter files; the tracking (and jet) outputs are single files used to establish that the chain works. But they can certainly be used to check whether they work as input for analysis.
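
For bookkeeping, the six steps and their output directories listed above can be collected in a small lookup. This is only a sketch: the helper names output_dir and approx_file_count are made up, and the file count is a rough estimate from the "close to 500,000 events" and "100 events per file" figures, not a number taken from the disk.

```python
BASE = "/sphenix/data/data02/sphnxpro/MDC1/sHijing_HepMC"

# Step name -> output subdirectory, as listed in the mail above.
OUTPUT_SUBDIR = {
    "pass1": "G4Hits",
    "pass2": "PileUp",
    "pass3calo": "CaloCluster",
    "pass3trk": "TrkrCluster",
    "pass4trk": "Tracks",
    "pass5jet": "Jets",
}


def output_dir(step: str) -> str:
    """Return the output directory of a processing step."""
    return f"{BASE}/{OUTPUT_SUBDIR[step]}/data"


def approx_file_count(n_events: int, events_per_file: int) -> int:
    """Number of files needed for n_events, rounding up."""
    return -(-n_events // events_per_file)


for step in OUTPUT_SUBDIR:
    print(f"{step:10s} {output_dir(step)}")

# Upper estimate for pass1: 500,000 events at 100 events per file.
print(approx_file_count(500_000, 100))
```

With those figures the pass1 output comes to roughly 5000 files.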

Chris


*************************************************************

Christopher H. Pinkenburg    ;    pinkenburg AT bnl.gov
                ;    http://www.phenix.bnl.gov/~pinkenbu

Brookhaven National Laboratory    ;    phone: (631) 344-5692
Physics Department Bldg 510 C    ;    fax:   (631) 344-3253
Upton, NY 11973-5000

*************************************************************



--
Dmitri




