sphenix-tracking-l AT lists.bnl.gov
Subject: sPHENIX tracking discussion
Re: [Sphenix-tracking-l] running job A on 2GB/core
- From: Anthony Frawley <afrawley AT fsu.edu>
- To: sphenix-tracking <sphenix-tracking-l AT lists.bnl.gov>, pinkenburg <pinkenburg AT bnl.gov>
- Subject: Re: [Sphenix-tracking-l] running job A on 2GB/core
- Date: Fri, 10 Jun 2022 19:57:03 +0000
Hi Chris,
Thanks, this is good news. It shows that we will have flexibility in the machines we use for all of our job types, including the most time consuming one (Job A).
Tony
From: sPHENIX-tracking-l <sphenix-tracking-l-bounces AT lists.bnl.gov> on behalf of pinkenburg via sPHENIX-tracking-l <sphenix-tracking-l AT lists.bnl.gov>
Sent: Friday, June 10, 2022 2:47 PM
To: sphenix-tracking <sphenix-tracking-l AT lists.bnl.gov>
Subject: [Sphenix-tracking-l] running job A on 2GB/core
Hi folks,
I got the 2GB/core test node. It's an older machine with 24 logical
cores. Our current job A's sit comfortably between 1.5 and 1.7GB. Once
the copying of the input files is done, the swap daemon goes back to
sleep and the jobs make full use of the CPU:
  PID USER     PR NI VIRT    RES  SHR   S %CPU  %MEM TIME+   COMMAND
29130 sphnxpro 20 0 4812960 1.6g 11916 R 100.0 3.4 7:15.86 root.exe
29192 sphnxpro 20 0 4744488 1.5g 8860 R 100.0 3.3 7:11.88 root.exe
29286 sphnxpro 20 0 4814684 1.6g 6960 R 100.0 3.4 6:40.13 root.exe
29301 sphnxpro 20 0 4805920 1.5g 8680 R 100.0 3.3 6:39.17 root.exe
29402 sphnxpro 20 0 4717164 1.5g 13112 R 100.0 3.2 6:24.74 root.exe
29598 sphnxpro 20 0 4709088 1.5g 32476 R 100.0 3.3 5:19.09 root.exe
29632 sphnxpro 20 0 4774348 1.5g 30968 R 100.0 3.2 5:16.41 root.exe
29871 sphnxpro 20 0 4763380 1.5g 32488 R 100.0 3.3 5:13.16 root.exe
29896 sphnxpro 20 0 4868604 1.7g 33208 R 100.0 3.6 5:16.76 root.exe
30113 sphnxpro 20 0 4801508 1.6g 59212 R 100.0 3.5 4:01.72 root.exe
30218 sphnxpro 20 0 4647632 1.5g 60536 R 100.0 3.2 4:03.74 root.exe
30328 sphnxpro 20 0 4733160 1.6g 60520 R 100.0 3.3 4:06.29 root.exe
24952 sphnxpro 20 0 4815524 1.3g 6944 R 99.7 2.8 15:01.31 root.exe
28995 sphnxpro 20 0 4764232 1.5g 13212 R 99.7 3.3 7:25.74 root.exe
29574 sphnxpro 20 0 4772152 1.5g 32632 R 99.7 3.2 5:15.08 root.exe
30264 sphnxpro 20 0 4774560 1.6g 60756 R 99.3 3.4 4:02.89 root.exe
30860 sphnxpro 20 0 4467088 1.3g 94504 R 99.3 2.9 2:09.04 root.exe
29171 sphnxpro 20 0 4785864 1.6g 6960 R 99.0 3.4 7:07.58 root.exe
29769 sphnxpro 20 0 4682344 1.5g 30956 R 98.7 3.1 5:11.67 root.exe
30404 sphnxpro 20 0 4762580 1.6g 60776 R 98.7 3.5 4:03.98 root.exe
29509 sphnxpro 20 0 4878296 1.7g 33248 R 95.7 3.6 5:19.22 root.exe
30262 sphnxpro 20 0 4773848 1.6g 60136 R 93.4 3.4 4:01.12 root.exe
29542 sphnxpro 20 0 4665276 1.4g 33316 R 87.4 3.1 5:18.91 root.exe
30379 sphnxpro 20 0 4792176 1.6g 59192 R 83.1 3.5 4:02.14 root.exe
All the older nodes have regular hard disks, so reading the input takes
time, which leads to <100% CPU for some jobs. I'm letting it run for 10k
jobs to see how this holds up.
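As a rough sanity check, a top snapshot like the one above can be parsed to confirm that every job's resident memory stays under the 2GB/core budget. A minimal sketch (not part of the original thread; it assumes top's default column order, where RES is the sixth field, and uses the 2 GB budget discussed here):

```python
def res_gib(field):
    """Convert a top RES field like '1.6g', '512m' or '34816' (plain KiB) to GiB."""
    if field.endswith("g"):
        return float(field[:-1])
    if field.endswith("m"):
        return float(field[:-1]) / 1024
    return float(field) / (1024 * 1024)  # bare number means KiB

def max_res_gib(top_lines):
    """Return the largest resident-memory footprint (GiB) among the listed processes."""
    return max(res_gib(line.split()[5]) for line in top_lines)

# A few lines from the snapshot above
snapshot = [
    "29130 sphnxpro 20 0 4812960 1.6g 11916 R 100.0 3.4 7:15.86 root.exe",
    "29896 sphnxpro 20 0 4868604 1.7g 33208 R 100.0 3.6 5:16.76 root.exe",
    "24952 sphnxpro 20 0 4815524 1.3g  6944 R  99.7 2.8 15:01.31 root.exe",
]

peak = max_res_gib(snapshot)
print(f"peak RES: {peak} GiB, fits 2GB/core budget: {peak < 2.0}")
# → peak RES: 1.7 GiB, fits 2GB/core budget: True
```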
I'm not sure if this is visible to everybody, but if you want to see how
this is doing, the Grafana page for spool0346 is:
https://monitoring.sdcc.bnl.gov/grafana/d/000000026/linux-farm-collectd?orgId=1&refresh=1m&var-Experiment=spool&var-Hostname=spool0346_sdcc_bnl_gov&var-Interface=eth1&from=now-2d&to=now
Chris
--
*************************************************************
Christopher H. Pinkenburg ; pinkenburg AT bnl.gov
; http://www.phenix.bnl.gov/~pinkenbu
Brookhaven National Laboratory ; phone: (631) 344-5692
Physics Department Bldg 510 C ; fax: (631) 344-3253
Upton, NY 11973-5000
*************************************************************
_______________________________________________
sPHENIX-tracking-l mailing list
sPHENIX-tracking-l AT lists.bnl.gov
https://lists.bnl.gov/mailman/listinfo/sphenix-tracking-l
- [Sphenix-tracking-l] running job A on 2GB/core, pinkenburg, 06/10/2022
  - Re: [Sphenix-tracking-l] running job A on 2GB/core, Anthony Frawley, 06/10/2022
  - [Sphenix-tracking-l] Fw: running job A on 2GB/core, Anthony Frawley, 06/13/2022
  - Re: [Sphenix-tracking-l] running job A on 2GB/core, Anthony Frawley, 06/10/2022
Archive powered by MHonArc 2.6.24.