atlas-connect-l AT lists.bnl.gov
Subject: Atlas-connect-l mailing list
Re: [Atlas-connect-l] ATLAS Connect meeting Monday
- From: David Lesny <ddl AT illinois.edu>
- To: "Dr. Harinder Singh Bawa" <harinder.singh.bawa AT gmail.com>, Rob Gardner <rwg AT hep.uchicago.edu>
- Cc: atlas-connect-l <atlas-connect-l AT lists.bnl.gov>
- Subject: Re: [Atlas-connect-l] ATLAS Connect meeting Monday
- Date: Mon, 18 Aug 2014 12:35:33 -0500
I saw the jobs on hold at your end and cleaned them out.
I submitted a bunch of jobs from my end to your site, and they seemed to run without a problem.
Could there have been some type of transient problem at your end? Perhaps a worker node with a bad NFS mount of the home areas?
Right now I have about 45 jobs running on your system.
dave
On 8/18/2014 12:23 PM, Dr. Harinder Singh Bawa wrote:
Hi Dave,
As discussed, I have seen a lot of jobs going into the held state
over the last 3-4 weeks. Error code 14 indicates that Condor cannot
access the initial working directory for the job.
See for example:
condor_q -global -l|grep HoldReason
[bawa@t3nfs ~]$ condor_q -global 104266.0 -l |grep HoldReason
HoldReasonSubCode = 2
HoldReason = "Cannot access initial working directory /nfs/t3nfs_common/home/fresnoatlas/bosco/rccf-atlas.ci-connect.net/fresnostate/sandbox/a6c3/a6c34aed/rccf-atlas.ci-connect.net_11018_rccf-atlas.ci-connect.net#40710.0#1408333977: No such file or directory"
HoldReasonCode = 14
Here is the list of jobs currently in the held state:
[bawa@t3nfs ~]$ condor_q -global
-- Schedd: t3head.atlas.csufresno.edu : <129.8.242.180:9840?CCBID=129.8.242.180:9618%3fPrivNet%3datlas.csufresno.edu#234908&PrivAddr=%3c192.168.100.1:9840%3e&PrivNet=atlas.csufresno.edu>
 ID        OWNER        SUBMITTED    RUN_TIME    ST PRI SIZE CMD
 103390.0  fresnoatlas  8/14 15:42   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103391.0  fresnoatlas  8/14 15:42   0+00:00:01  H  0   0.0  condor_exec.exe -d
 103394.0  fresnoatlas  8/14 15:42   0+00:00:01  H  0   0.0  condor_exec.exe -d
 103420.0  fresnoatlas  8/14 16:17   0+00:00:04  H  0   0.0  condor_exec.exe -d
 103423.0  fresnoatlas  8/14 16:17   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103424.0  fresnoatlas  8/14 16:17   0+00:00:03  H  0   0.0  condor_exec.exe -d
 103433.0  fresnoatlas  8/14 16:17   0+00:00:01  H  0   0.0  condor_exec.exe -d
 103538.0  fresnoatlas  8/14 18:08   0+00:00:08  H  0   0.0  condor_exec.exe -d
 103539.0  fresnoatlas  8/14 18:08   0+00:00:01  H  0   0.0  condor_exec.exe -d
 103540.0  fresnoatlas  8/14 18:08   0+00:00:02  H  0   0.0  condor_exec.exe -d
 103541.0  fresnoatlas  8/14 18:08   0+00:00:04  H  0   0.0  condor_exec.exe -d
 103542.0  fresnoatlas  8/14 18:08   0+00:00:02  H  0   0.0  condor_exec.exe -d
 103553.0  fresnoatlas  8/14 18:09   0+00:00:20  H  0   0.0  condor_exec.exe -d
 103554.0  fresnoatlas  8/14 18:09   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103555.0  fresnoatlas  8/14 18:09   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103556.0  fresnoatlas  8/14 18:09   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103557.0  fresnoatlas  8/14 18:09   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103558.0  fresnoatlas  8/14 18:09   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103559.0  fresnoatlas  8/14 18:09   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103560.0  fresnoatlas  8/14 18:09   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103561.0  fresnoatlas  8/14 18:09   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103562.0  fresnoatlas  8/14 18:09   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103563.0  fresnoatlas  8/14 18:09   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103564.0  fresnoatlas  8/14 18:09   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103565.0  fresnoatlas  8/14 18:09   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103575.0  fresnoatlas  8/14 19:12   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103576.0  fresnoatlas  8/14 19:12   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103577.0  fresnoatlas  8/14 19:12   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103578.0  fresnoatlas  8/14 19:12   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103579.0  fresnoatlas  8/14 19:12   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103580.0  fresnoatlas  8/14 19:13   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103607.0  fresnoatlas  8/14 19:36   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103608.0  fresnoatlas  8/14 19:36   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103609.0  fresnoatlas  8/14 19:36   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103610.0  fresnoatlas  8/14 19:36   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103611.0  fresnoatlas  8/14 19:36   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103612.0  fresnoatlas  8/14 19:36   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103613.0  fresnoatlas  8/14 19:36   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103614.0  fresnoatlas  8/14 19:36   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103615.0  fresnoatlas  8/14 19:36   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103616.0  fresnoatlas  8/14 19:36   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103699.0  fresnoatlas  8/14 22:03   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103709.0  fresnoatlas  8/14 22:05   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103710.0  fresnoatlas  8/14 22:05   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103711.0  fresnoatlas  8/14 22:05   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103712.0  fresnoatlas  8/14 22:05   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103713.0  fresnoatlas  8/14 22:05   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103714.0  fresnoatlas  8/14 22:05   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103715.0  fresnoatlas  8/14 22:05   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103716.0  fresnoatlas  8/14 22:05   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103717.0  fresnoatlas  8/14 22:05   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103718.0  fresnoatlas  8/14 22:05   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103719.0  fresnoatlas  8/14 22:05   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103720.0  fresnoatlas  8/14 22:05   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103721.0  fresnoatlas  8/14 22:05   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103722.0  fresnoatlas  8/14 22:05   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103723.0  fresnoatlas  8/14 22:06   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103874.0  fresnoatlas  8/15 03:35   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103875.0  fresnoatlas  8/15 03:35   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103876.0  fresnoatlas  8/15 03:35   0+00:00:00  H  0   0.0  condor_exec.exe -d
 103877.0  fresnoatlas  8/15 03:35   0+00:00:00  H  0   0.0  condor_exec.exe -d
 104265.0  fresnoatlas  8/17 20:53   0+00:00:00  H  0   0.0  condor_exec.exe -d
 104266.0  fresnoatlas  8/17 20:53   0+00:00:00  H  0   0.0  condor_exec.exe -d
 104267.0  fresnoatlas  8/17 20:53   0+00:00:00  H  0   0.0  condor_exec.exe -d
 104268.0  fresnoatlas  8/17 20:54   0+00:00:00  H  0   0.0  condor_exec.exe -d
 104269.0  fresnoatlas  8/17 20:54   0+00:00:00  H  0   0.0  condor_exec.exe -d
 104270.0  fresnoatlas  8/17 20:54   0+00:00:00  H  0   0.0  condor_exec.exe -d
 104271.0  fresnoatlas  8/17 20:54   0+00:00:00  H  0   0.0  condor_exec.exe -d
 104272.0  fresnoatlas  8/17 20:54   0+00:00:00  H  0   0.0  condor_exec.exe -d
 104273.0  fresnoatlas  8/17 20:54   0+00:00:00  H  0   0.0  condor_exec.exe -d
 104274.0  fresnoatlas  8/17 20:54   0+00:00:00  H  0   0.0  condor_exec.exe -d
 104275.0  fresnoatlas  8/17 20:54   0+00:00:00  H  0   0.0  condor_exec.exe -d
 104276.0  fresnoatlas  8/17 20:54   0+00:00:00  H  0   0.0  condor_exec.exe -d
 104277.0  fresnoatlas  8/17 20:54   0+00:00:00  H  0   0.0  condor_exec.exe -d
 104278.0  fresnoatlas  8/17 20:54   0+00:00:00  H  0   0.0  condor_exec.exe -d
 104279.0  fresnoatlas  8/17 20:54   0+00:00:00  H  0   0.0  condor_exec.exe -d
 104280.0  fresnoatlas  8/17 20:54   0+00:00:00  H  0   0.0  condor_exec.exe -d
 104281.0  fresnoatlas  8/17 20:54   0+00:00:00  H  0   0.0  condor_exec.exe -d
 104282.0  fresnoatlas  8/17 20:54   0+00:00:00  H  0   0.0  condor_exec.exe -d
 104283.0  fresnoatlas  8/17 20:54   0+00:00:00  H  0   0.0  condor_exec.exe -d
 104284.0  fresnoatlas  8/17 20:54   0+00:00:00  H  0   0.0  condor_exec.exe -d
 104285.0  fresnoatlas  8/17 20:54   0+00:00:00  H  0   0.0  condor_exec.exe -d
82 jobs; 0 idle, 0 running, 82 held
Let me know if I can debug further or
if you have any suggestions on what to look for.
Thanks Harinder
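A listing like the one above is easier to reason about once the held jobs are grouped by submission date, since bursts on a few days would fit a transient problem better than a steady failure. The following is a minimal Python sketch, not part of the original exchange, that assumes the plain-text layouts of `condor_q` and `condor_q -l` shown in this message. The function names, the regular expression, and the sample strings are illustrative assumptions, and the sandbox path in the sample is a shortened placeholder rather than the real one.

```python
import re
from collections import Counter

# One job row of the default condor_q listing shown in this thread:
# ID OWNER SUBMITTED(date time) RUN_TIME ST PRI SIZE CMD
# (layout assumed from the output quoted above; other versions may differ)
JOB_ROW = re.compile(
    r"^\s*(?P<job_id>\d+\.\d+)\s+(?P<owner>\S+)\s+(?P<date>\d+/\d+)\s+"
    r"(?P<time>\d+:\d+)\s+(?P<run_time>\d+\+\d+:\d+:\d+)\s+(?P<state>[A-Z<>])\s"
)


def held_jobs_by_submit_date(listing: str) -> Counter:
    """Count held (ST == 'H') jobs per submission date in condor_q text."""
    counts = Counter()
    for line in listing.splitlines():
        match = JOB_ROW.match(line)
        if match and match.group("state") == "H":
            counts[match.group("date")] += 1
    return counts


def hold_attributes(classad_text: str) -> dict:
    """Collect Hold* attributes from 'Name = value' lines of condor_q -l."""
    attrs = {}
    for line in classad_text.splitlines():
        name, sep, value = line.partition(" = ")
        if sep and name.strip().startswith("Hold"):
            attrs[name.strip()] = value.strip().strip('"')
    return attrs


# A few rows copied from the listing above.
SAMPLE_LISTING = """\
103390.0 fresnoatlas 8/14 15:42 0+00:00:00 H 0 0.0 condor_exec.exe -d
103874.0 fresnoatlas 8/15 03:35 0+00:00:00 H 0 0.0 condor_exec.exe -d
104265.0 fresnoatlas 8/17 20:53 0+00:00:00 H 0 0.0 condor_exec.exe -d
104266.0 fresnoatlas 8/17 20:53 0+00:00:00 H 0 0.0 condor_exec.exe -d
"""

# Same attribute names as in the message; the path is a placeholder.
SAMPLE_CLASSAD = """\
HoldReasonSubCode = 2
HoldReason = "Cannot access initial working directory /path/to/sandbox: No such file or directory"
HoldReasonCode = 14
"""

if __name__ == "__main__":
    for date, count in sorted(held_jobs_by_submit_date(SAMPLE_LISTING).items()):
        print(date, count)
    print(hold_attributes(SAMPLE_CLASSAD))
```

Feeding the full 82-row listing to `held_jobs_by_submit_date` would show how the holds are distributed over 8/14, 8/15, and 8/17.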
On Fri, Aug 15, 2014 at 2:06 PM, Rob Gardner <rwg AT hep.uchicago.edu> wrote:
Folks,
We haven’t had a meeting in a while. Let’s sync up
this Monday, which will be good to do in advance of
the LBNL meeting.
Monday, 18 August 2014 from 11:30 to 12:30 (US/Central)
https://indico.cern.ch/event/335658/
Agenda (for discussion):
- I will post some slides for the LBNL meeting that we can review.
- The portableCVMFS solution
- The replicated Stratum 1 solution
- Status of unit tests (Jenkins)
- Any updates to the tutorials or github coming?
Thanks
Rob
_______________________________________________
Atlas-connect-l mailing list
Atlas-connect-l AT lists.bnl.gov
https://lists.bnl.gov/mailman/listinfo/atlas-connect-l

--
Dr. Harinder Singh Bawa

--
David Lesny
Senior Research Physicist (Retired)
High Energy Physics
Office: 217-333-4972 | Fax: 217-333-4990
Skype: ddlesny | mwt2-ddlesny
- [Atlas-connect-l] ATLAS Connect meeting Monday, Rob Gardner, 08/15/2014
- Re: [Atlas-connect-l] ATLAS Connect meeting Monday, Dr. Harinder Singh Bawa, 08/18/2014
- Re: [Atlas-connect-l] ATLAS Connect meeting Monday, David Lesny, 08/18/2014
- Re: [Atlas-connect-l] ATLAS Connect meeting Monday, Dr. Harinder Singh Bawa, 08/18/2014
- Re: [Atlas-connect-l] ATLAS Connect meeting Monday, David Lesny, 08/18/2014
- Re: [Atlas-connect-l] ATLAS Connect meeting Monday, Dr. Harinder Singh Bawa, 08/18/2014