sphenix-software-l - [Sphenix-software-l] please check your memory requests for your condor jobs

sphenix-software-l AT lists.bnl.gov

Subject: sPHENIX discussion of software

  • From: pinkenburg <pinkenburg AT bnl.gov>
  • To: "sphenix-software-l AT lists.bnl.gov" <sphenix-software-l AT lists.bnl.gov>
  • Subject: [Sphenix-software-l] please check your memory requests for your condor jobs
  • Date: Sun, 2 Mar 2025 12:54:07 -0500

Hi folks,

We have around 60k condor slots but run only 20k jobs. The reason is jobs requesting more memory than the 4GB we have per core. Sometimes this is necessary (the sim jobs need 10GB), but condor allocates resources based on what you request, not on what your job actually uses: if you request 20GB and your job uses 2GB, condor will still allocate 20GB.
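To make the arithmetic concrete, here is a rough Python sketch of how an oversized request ties up cores. It assumes a simplified model in which memory is handed out in whole 4GB-per-core units, so a request blocks as many cores as it takes to cover the memory; the function names are made up for illustration, and the real condor slot partitioning may differ.

```python
import math

MEM_PER_CORE_GB = 4  # from the message above: 4GB of memory per core


def cores_blocked(request_gb, mem_per_core_gb=MEM_PER_CORE_GB):
    """Cores' worth of memory tied up by one job (simplified model, an assumption)."""
    return max(1, math.ceil(request_gb / mem_per_core_gb))


def max_concurrent_jobs(total_cores, request_gb):
    """Upper bound on simultaneous jobs if every job made the same request."""
    return total_cores // cores_blocked(request_gb)


if __name__ == "__main__":
    for request_gb in (2, 4, 10, 20):
        print(f"request {request_gb:>2} GB -> blocks {cores_blocked(request_gb)} core(s)")
```

Under this model a job that requests 20GB but uses 2GB blocks five cores while needing only one, so four cores sit idle for as long as it runs.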

As of late last week, I think we have sorted out the random evictions and the weird memory accounting by condor (where high I/O was counted as memory). The current setting is that your job can exceed the requested memory by 20% before it gets evicted. Additionally, the memory usage that condor periodically prints into the condor log seems fairly accurate.
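As a quick illustration of the 20% headroom, the following Python sketch computes the point at which a job would be evicted for a given request. It assumes the limit is simply the request times 1.2 and that eviction happens only when usage is strictly above that limit; the helper names are hypothetical and not part of condor.

```python
EVICTION_HEADROOM = 0.20  # from the message above: 20% over the request


def eviction_limit_mb(request_mb, headroom=EVICTION_HEADROOM):
    """Memory usage (MB) above which the job would be evicted (assumed rule)."""
    return request_mb * (1 + headroom)


def would_be_evicted(request_mb, peak_usage_mb):
    """True if the observed peak usage exceeds the request plus headroom."""
    return peak_usage_mb > eviction_limit_mb(request_mb)


if __name__ == "__main__":
    request_mb = 4096  # a 4GB request
    for peak_mb in (3500, 4900, 5000):
        print(f"peak {peak_mb} MB, evicted: {would_be_evicted(request_mb, peak_mb)}")
```

With a 4096MB request, a peak of 4900MB stays under the limit and a peak of 5000MB does not, so a request close to the usage reported in the condor log still leaves some margin.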

I went through a few running jobs and found that they request too much memory (sometimes far too much), which is a large contributor to our idle cores.

Right now this really jeopardizes the sims for ppg02.

Please be mindful and check whether your memory requests are backed by your actual usage.

Chris


--
*************************************************************

Christopher H. Pinkenburg ; pinkenburg AT bnl.gov
; http://www.phenix.bnl.gov/~pinkenbu

Brookhaven National Laboratory ; phone: (631) 344-5692
Physics Department Bldg 510 C ; fax: (631) 344-3253
Upton, NY 11973-5000

*************************************************************


