atlas-connect-l AT lists.bnl.gov
Subject: Atlas-connect-l mailing list
Re: [Atlas-connect-l] Held ATLAS Connect jobs OOMing workers
- From: Matt LeBlanc <matt.leblanc AT cern.ch>
- To: Lincoln Bryant <lincolnb AT uchicago.edu>
- Cc: atlas-connect-l <atlas-connect-l AT lists.bnl.gov>
- Subject: Re: [Atlas-connect-l] Held ATLAS Connect jobs OOMing workers
- Date: Thu, 26 Mar 2020 19:23:58 +0100
Hi Lincoln,
Sorry about that! Those jobs are from a new analysis, so there might be some hidden wrinkles to smooth out. I will kill them and look into this.
Cheers,
Matt
On Thu, Mar 26, 2020 at 19:14 Lincoln Bryant <lincolnb AT uchicago.edu> wrote:
Hi Matt,
Apologies, but I've held your jobs on ATLAS Connect because they seem to be OOMing our worker nodes.
From the ps tree of a worker:
ruc.mwt2 198552 9.1 82.7 3956304672 162588084 ? Rl 12:47 1:59 | | \_ eventloop_batch_worker 96 ./config.root
That's roughly 155 GiB of RSS memory (about 83% of the node's RAM) for that one job.
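As a quick check of that figure, here is a minimal Python sketch that pulls the RSS field out of the ps line above and converts it to GiB. It assumes the line uses the usual `ps aux` column order (USER, PID, %CPU, %MEM, VSZ, RSS, ...) and that RSS is reported in KiB; the function name `rss_gib` is made up for illustration.

```python
# Line copied verbatim from the ps tree quoted above.
# Assumed column order (ps aux style): USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
PS_LINE = r"ruc.mwt2 198552 9.1 82.7 3956304672 162588084 ? Rl 12:47 1:59 | | \_ eventloop_batch_worker 96 ./config.root"

KIB_PER_GIB = 1024 ** 2  # ps reports RSS in KiB (assumption stated above)


def rss_gib(ps_line: str) -> float:
    """Return the resident set size of one ps line in GiB."""
    fields = ps_line.split()
    rss_kib = int(fields[5])  # sixth whitespace-separated column is RSS
    return rss_kib / KIB_PER_GIB


if __name__ == "__main__":
    print(f"RSS: {rss_gib(PS_LINE):.1f} GiB")
```

Under those assumptions this comes out at about 155 GiB, in line with the rough figure quoted above.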
We have limited access to the machine room right now due to the COVID-19-related lockdown at the University, so I immediately held all of your jobs to prevent any other workers from rebooting.
Could you look into the memory utilization when you get a chance?
Thanks,
Lincoln
Matt LeBlanc
University of Arizona
Office: 40/1-C11 (CERN)
https://cern.ch/mleblanc/