atlas-connect-l AT lists.bnl.gov
Subject: Atlas-connect-l mailing list
- From: Lincoln Bryant <lincolnb AT uchicago.edu>
- To: "Dr. Harinder Singh Bawa" <harinder.singh.bawa AT gmail.com>
- Cc: atlas-connect-l AT lists.bnl.gov
- Subject: Re: [Atlas-connect-l] 10K job test
- Date: Wed, 4 Dec 2013 18:02:49 -0600
Hi Harinder,
1. You can use this command to see where jobs submitted from the ATLAS Connect login host are running:
condor_q -name login.atlas.ci-connect.net -pool uct2-bosco.mwt2.org:11120?sock=collector -run
If any jobs are running on your nodes, you should see that reflected in the HOST(S) column. Note that a condor_q against the ATLAS Connect login node and a condor_q against your local login node will not agree exactly, because the HTCondor glideins stay alive roughly 30 minutes longer than the actual jobs.
Here's an example output:
$ condor_q -name login.atlas.ci-connect.net -pool uct2-bosco.mwt2.org:11120?sock=collector -run
-- Schedd: login.atlas.ci-connect.net : <128.135.158.156:56549?PrivAddr=%3c10.1.5.82:56549%3e&PrivNet=mwt2.org>
ID OWNER SUBMITTED RUN_TIME HOST(S)
15.63 ivukotic 12/4 12:41 0+03:11:34 2658 AT iut2-c073.iu.edu
15.65 ivukotic 12/4 12:41 0+03:11:34 2445 AT iut2-c086.iu.edu
15.75 ivukotic 12/4 12:41 0+03:11:34 30539 AT iut2-c080.iu.edu
15.79 ivukotic 12/4 12:41 0+03:11:32 22538 AT iut2-c056.iu.edu
15.81 ivukotic 12/4 12:41 0+03:11:32 10917 AT iut2-c085.iu.edu
15.82 ivukotic 12/4 12:41 0+03:11:32 23417 AT iut2-c095.iu.edu
15.83 ivukotic 12/4 12:41 0+03:11:32 28904 AT iut2-c092.iu.edu
15.84 ivukotic 12/4 12:41 0+03:11:32 16445 AT iut2-c086.iu.edu
15.85 ivukotic 12/4 12:41 0+03:11:32 17005 AT iut2-c121.iu.edu
15.87 ivukotic 12/4 12:41 0+03:11:32 20936 AT iut2-c105.iu.edu
15.502 ivukotic 12/4 12:41 0+02:45:29 32357 AT iut2-c114.iu.edu
15.599 ivukotic 12/4 12:41 0+02:38:26 5192 AT iut2-c090.iu.edu
15.600 ivukotic 12/4 12:41 0+02:38:26 6644 AT iut2-c044.iu.edu
15.601 ivukotic 12/4 12:41 0+02:38:25 6177 AT iut2-c046.iu.edu
15.604 ivukotic 12/4 12:41 0+02:38:26 20530 AT iut2-c054.iu.edu
15.605 ivukotic 12/4 12:41 0+02:38:26 8155 AT iut2-c117.iu.edu
15.609 ivukotic 12/4 12:41 0+02:38:26 11505 AT iut2-c106.iu.edu
15.629 ivukotic 12/4 12:41 0+02:38:03 28297 AT iut2-c098.iu.edu
15.630 ivukotic 12/4 12:41 0+02:38:03 15129 AT iut2-c118.iu.edu
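To turn output like the above into a per-host count, a small parser is enough. The following is only a sketch: it assumes the plain-text layout shown in this message (job ID in the first column, execute host as the last token) and accepts both the archive's "AT" rendering and a literal "@". The function names are made up for illustration and are not part of HTCondor.

```python
import re
from collections import Counter

# A job line starts with a "cluster.proc" ID such as 15.63.
JOB_ID = re.compile(r"^\d+\.\d+$")


def hosts_from_condor_q_run(text):
    """Count running jobs per execute host in `condor_q -run` text output."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or not JOB_ID.match(fields[0]):
            continue  # skip the Schedd banner, the column header, blank lines
        # The host is the last token; drop a "slot@" prefix if one is attached.
        host = fields[-1].rsplit("@", 1)[-1]
        counts[host] += 1
    return counts


def jobs_in_domain(counts, domain):
    """Sum the jobs whose host name ends with the given domain suffix."""
    return sum(n for host, n in counts.items() if host.endswith(domain))


# Three lines copied from the example output in this message.
SAMPLE = """\
ID OWNER SUBMITTED RUN_TIME HOST(S)
15.63 ivukotic 12/4 12:41 0+03:11:34 2658 AT iut2-c073.iu.edu
15.65 ivukotic 12/4 12:41 0+03:11:34 2445 AT iut2-c086.iu.edu
15.84 ivukotic 12/4 12:41 0+03:11:32 16445 AT iut2-c086.iu.edu
"""

counts = hosts_from_condor_q_run(SAMPLE)
for host, n in counts.most_common():
    print(host, n)
```

To answer "how many jobs landed on my site", pass your own nodes' domain suffix to jobs_in_domain; the suffix to use depends on how your worker nodes are named.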
2. We can check whether the submitter's username is already recorded somewhere in the job. If it is not, we can inject that information into the job and write some documentation on how to retrieve it.
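As an illustration of what "injecting" the username could look like, HTCondor submit descriptions accept custom job attributes written as "+Name = value". The sketch below builds such a description as text. It is an assumption about one possible approach, not what was actually deployed: the attribute name ConnectSubmitter and the helper function are hypothetical.

```python
import getpass


def submit_description(executable, arguments, submitter=None, count=1):
    """Build a submit description that tags each job with the submitter's name."""
    submitter = submitter or getpass.getuser()
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        f"arguments = {arguments}",
        # Custom job ClassAd attribute; "ConnectSubmitter" is a hypothetical name.
        f'+ConnectSubmitter = "{submitter}"',
        f"queue {count}",
    ]
    return "\n".join(lines) + "\n"


print(submit_description("/bin/sleep", "300", submitter="ivukotic", count=2))
```

An attribute added this way shows up in the job's ClassAd on the submit host, for example in the output of condor_q -long. Whether it is also visible from the site's local queue depends on how the glideins are set up, which this thread does not settle.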
Hope that helps.
Cheers,
Lincoln
On Dec 4, 2013, at 5:24 PM, Dr. Harinder Singh Bawa wrote:
Hi Rob, Lincoln,
I have some queries; answers would be appreciated. I am seeing 206 jobs running on our Fresno T3 cluster under the name "fresnoatlas", which is registered as a Connect client.
3474.0 fresnoatlas 12/4 14:45 0+00:00:00 I 0 0.0 condor_exec.exe -d
3475.0 fresnoatlas 12/4 14:45 0+00:00:00 I 0 0.0 condor_exec.exe -d
3476.0 fresnoatlas 12/4 14:45 0+00:00:00 I 0 0.0 condor_exec.exe -d
3477.0 fresnoatlas 12/4 14:46 0+00:00:00 I 0 0.0 condor_exec.exe -d
3478.0 fresnoatlas 12/4 14:46 0+00:00:00 I 0 0.0 condor_exec.exe -d
3479.0 fresnoatlas 12/4 14:46 0+00:00:00 I 0 0.0 condor_exec.exe -d
3480.0 fresnoatlas 12/4 14:46 0+00:00:00 I 0 0.0 condor_exec.exe -d
3481.0 fresnoatlas 12/4 14:46 0+00:00:00 I 0 0.0 condor_exec.exe -d
3482.0 fresnoatlas 12/4 14:46 0+00:00:00 I 0 0.0 condor_exec.exe -d
206 jobs; 30 idle, 176 running, 0 held
*********************************************************************************
This was the question I asked before: say you submitted 10k jobs from ATLAS Connect. From our side:
1. How do we see how many jobs are being allotted to the Fresno T3? Using condor_q -global tells me that fresnoatlas got 206 jobs, but is that all I need to look at?
2. Since "fresnoatlas" is the account registered in the Connect client, it is, if I understand correctly, a kind of route to the Fresno T3. How do we know which users had their jobs running? Is there any monitoring or bookkeeping we can do from the Condor point of view?
Harinder
On Wed, Dec 4, 2013 at 2:49 PM, Rob Gardner <rwg AT hep.uchicago.edu> wrote:
Thanks Lincoln!
On Dec 4, 2013, at 4:45 PM, Lincoln Bryant <lincolnb AT uchicago.edu> wrote:
On it -- we need to install the extra packages from the other Connect sites.
--Lincoln
On Dec 4, 2013, at 4:44 PM, Rob Gardner wrote:
As well as the “distribution” command.
On Dec 4, 2013, at 4:36 PM, Rob Gardner <rwg AT hep.uchicago.edu> wrote:
Just a heads up -- I’ve submitted 10k jobs (each just sleeps 5 minutes) from login.usatlas.org.
Also, Lincoln, the historygram command is not installed.
---Rob Gardner • Skype rwg773 • 312-804-0859 • University of Chicago
Atlas-connect-l mailing list
Atlas-connect-l AT lists.bnl.gov
https://lists.bnl.gov/mailman/listinfo/atlas-connect-l