sphenix-software-l AT lists.bnl.gov
Subject: sPHENIX discussion of software
- From: pinkenburg <pinkenburg AT bnl.gov>
- To: "sphenix-software-l AT lists.bnl.gov" <sphenix-software-l AT lists.bnl.gov>
- Subject: [Sphenix-software-l] condor notification settings
- Date: Fri, 2 Sep 2022 08:19:57 -0400
Hi folks,
a lot of the condor job files I see are completely outdated - going back to examples which floated around 10-15 years ago. Settings like
+Job_Type = "cas"
+Experiment = "phenix"
Requirements = CPU_Experiment == "phenix"
are obsolete (job_type, experiment) or counterproductive (these Requirements will restrict you to 2 nodes, good luck running a lot of jobs there). The really problematic one is the
# Email address to send notification to.
###Notify_user = me AT somemailserver.com
which was promoted in examples from 10+ years ago. This will result in an email being sent to that address, normally in case of errors. Now you submit 10k condor jobs which get evicted for some reason, and you end up with 10k emails in your inbox. What caused problems yesterday was that the job file used an invalid email address on a valid domain, so every message bounced, sending 10k emails back to BNL (at which point it was noticed, since it looked like a DoS attack).
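For illustration, this is roughly what to look for in an old job file. It is a sketch only: the address is a placeholder, and the exact spelling and capitalization in your file may differ. Deleting the lines is enough, since HTCondor sends no mail when no notification is requested; setting notification = Never explicitly just makes the intent visible.

```
# Old-style settings that can produce one email per job. Delete them:
#   Notification = Error
#   Notify_user  = me@example.com      (placeholder address)

# Optional: state explicitly that no mail should be sent.
# "Never" is also what HTCondor does when this line is absent.
notification = Never
```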
A farm is, to first order, an error multiplier, and with the resources available to you in SDCC you multiply every problem by 4 orders of magnitude. So please have a look at your job files and at least remove the notification settings and the notify_user. We keep a minimal condor job file example in our wiki:
https://wiki.sphenix.bnl.gov/index.php/Condor#simple_condor_job_file
If you use one of these old examples, the best you can do is remove it (all those ranking statements won't do you any good either) and start fresh with this one. In there you will also find a nice improvement of the queue command which makes passing parameters to your job so much easier than using $(Process).
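To give an idea of what passing parameters through the queue command can look like, here is a minimal sketch using the "queue ... from" form. It is not a copy of the wiki example. The script name run.sh, the list file joblist.txt and the variable names infile and nevents are made up for illustration.

```
# Sketch only. run.sh, joblist.txt, infile and nevents are hypothetical names.
executable   = run.sh
# Each job gets its own values for the two variables named in the queue line.
arguments    = $(infile) $(nevents)
output       = job_$(Process).out
error        = job_$(Process).err
log          = job_$(Process).log
notification = Never

# One job per line of joblist.txt; each line holds two items, e.g.
#   run0001.root, 1000
#   run0002.root, 500
queue infile, nevents from joblist.txt
```

Compared with deriving everything from $(Process) inside the job script, the list file states explicitly which inputs each job runs on, so resubmitting a handful of failed jobs only requires a shorter list.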
Thanks,
Chris
--
*************************************************************
Christopher H. Pinkenburg ; pinkenburg AT bnl.gov
; http://www.phenix.bnl.gov/~pinkenbu
Brookhaven National Laboratory ; phone: (631) 344-5692
Physics Department Bldg 510 C ; fax: (631) 344-3253
Upton, NY 11973-5000
*************************************************************