
sphenix-software-l - [Sphenix-software-l] condor logs (please everyone running condor jobs - read)

sphenix-software-l AT lists.bnl.gov

Subject: sPHENIX discussion of software

  • From: pinkenburg <pinkenburg AT bnl.gov>
  • To: phenix-off-l <phenix-off-l AT lists.bnl.gov>, "sphenix-software-l AT lists.bnl.gov" <sphenix-software-l AT lists.bnl.gov>
  • Subject: [Sphenix-software-l] condor logs (please everyone running condor jobs - read)
  • Date: Wed, 2 Oct 2019 13:10:43 -0400

Hi folks,

This is about the logs condor itself writes (the Log = line in your job file), not your job's output log (Output =) or error log (Error =).

Writing the condor logs to gpfs slows gpfs down drastically, up to the point where it becomes unresponsive and jobs don't even start because condor cannot create this file. The logs are a ton of short writes, which gpfs is not made for, and with users now able to run thousands of jobs at once it's really having an impact.

Normally you don't care about the content of this file anyway. It contains stuff like:

000 (1297725.000.000) 10/02 13:02:22 Job submitted from host: <130.199.48.83:9618?addrs=130.199.48.83-9618&noUDP&sock=8867_649e_3>
...
001 (1297725.000.000) 10/02 13:02:25 Job executing on host: <130.199.157.87:9618?addrs=130.199.157.87-9618&noUDP&sock=21355_aff9_3>

which is helpful for debugging condor.

Please tell condor in your job file to open this file in the local /tmp directory. So that condor jobs from different users do not overwrite each other's log files, please create a directory under /tmp with your username:

mkdir /tmp/<your rcf name>

and tell condor to write the logfile there:

Log = /tmp/<your rcf name>/<whatever name you give it>
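Put together, the two steps above could look like the following sketch. It assumes that $USER matches your rcf name, and the job file name, executable and output names (myjob.*) are placeholders made up for illustration, so adapt them to your own setup. Run it on the machine you submit from, since /tmp is local to that machine.

```shell
#!/bin/sh
# Assumption: $USER is your rcf name; fall back to `id -un` if it is unset.
user="${USER:-$(id -un)}"
logdir="/tmp/${user}"

# Create the per-user log directory on the local disk (no error if it exists).
mkdir -p "$logdir"

# Hypothetical job file; only the Log line is the point of this example.
jobfile="myjob.job"
cat > "$jobfile" <<EOF
Universe   = vanilla
Executable = myjob.sh
Output     = myjob.out
Error      = myjob.err
Log        = ${logdir}/myjob.log
Queue
EOF
```

Output = and Error = can stay where they are; only the Log = line needs to point to /tmp.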


As an additional benefit, a cron job cleans old files out of /tmp, so you don't even have to delete them yourself. If you need to look at the logs, you just have to be on the machine you submitted the jobs from.

It's not a big change so please do it asap and we'll all benefit immediately.

Thanks
Chris




--
*************************************************************

Christopher H. Pinkenburg ; pinkenburg AT bnl.gov
; http://www.phenix.bnl.gov/~pinkenbu

Brookhaven National Laboratory ; phone: (631) 344-5692
Physics Department Bldg 510 C ; fax: (631) 344-3253
Upton, NY 11973-5000

*************************************************************



