usatlas-hllhc-computing-l AT lists.bnl.gov
Subject: US ATLAS HL-LHC computing discussion
List archive
Re: [Usatlas-hllhc-computing-l] First meeting of the distributed training working group
- From: "Malik,Abid" <amalik AT bnl.gov>
- To: Torre Wenaus <wenaus AT gmail.com>, Rob Gardner <rwg AT uchicago.edu>
- Cc: "usatlas-hllhc-computing-l AT lists.bnl.gov" <usatlas-hllhc-computing-l AT lists.bnl.gov>
- Subject: Re: [Usatlas-hllhc-computing-l] First meeting of the distributed training working group
- Date: Mon, 13 Aug 2018 20:48:35 +0000
Dear all,
I have started a Google Doc for tomorrow's meeting.
https://docs.google.com/document/d/1w5CXzODmv5Z9HoEpmORTi8-sh6vChkLX4KCajCoFX-s/edit?usp=sharing
Please feel free to add items for discussion.
Regards,
Abid M. Malik
Computational Science Initiative (CSI)
Bld-725, 2-131
Phone#6313444657
________________________________________
From: Usatlas-hllhc-computing-l
<usatlas-hllhc-computing-l-bounces AT lists.bnl.gov> on behalf of Torre Wenaus
<wenaus AT gmail.com>
Sent: Friday, August 10, 2018 9:44 AM
To: Rob Gardner
Cc: usatlas-hllhc-computing-l AT lists.bnl.gov
Subject: Re: [Usatlas-hllhc-computing-l] First meeting of the distributed
training working group
Oops, as Doug just noticed, that time conflicts with our new event
service/HPC ops meeting.
So we will meet Tue Aug 14 at noon eastern. Sorry for the noise.
Torre
On Fri, Aug 10, 2018 at 3:26 PM Torre Wenaus
<wenaus AT gmail.com> wrote:
We’ll have the first meeting of the distributed training WG on Wed Aug 15
10am eastern time. Thanks to all who responded to the poll. Agenda to follow,
suggestions appreciated. We’ll use Vidyo (the ATLAS/CERN standard
videoconferencing tool) unless that presents a problem for someone.
Abid & Torre
On Fri, Aug 10, 2018 at 3:21 PM Torre Wenaus
<wenaus AT gmail.com> wrote:
Hi Rob,
Very interesting, thanks! It does sound relevant. For a future meeting you're
able to attend...
Torre
On Fri, Aug 10, 2018 at 1:53 PM Robert Gardner
<rwg AT uchicago.edu> wrote:
Hi Torre
Unfortunately I’m on ‘staycation’ until Aug 22. I keep missing these
important meetings! I did want to mention potentially relevant work we did
supporting this year’s CoDaS-HEP training event at Princeton
(http://codas-hep.org). We (Ilija, Benedikt, Lincoln) built a portal and
backend to scale out to CHASE-CI (GPU resources on the Pacific Research
Platform): http://codas.slateci.net/. There were some lessons learned there
about sign-ups (using institutional identity management) and Kubernetes
scheduling (we had 60 JupyterLab instances running, each attached to its own GPU). We
plan to create a version of this to support the new ATLAS analytics platform
(for ML to Elasticsearch for ADC analytics), in time for the Oct S&C week.
It looks like this might be a little outside the scope here but perhaps parts
could be re-purposed for potential user frontends.
Cheers,
- Rob
On Aug 7, 2018, at 9:42 AM, Torre Wenaus
<wenaus AT gmail.com> wrote:
Hi,
We are planning a first meeting of the distributed training working group,
one of the working groups defined at last month’s US ATLAS / CSI workshop at
BNL. If you’re interested in attending please fill in the doodle:
https://doodle.com/poll/iayykxwd94isqdf4
The distributed training WG is to examine the scaling out of ML training
across distributed/parallel resources in order to minimize the turnaround
time on network tuning and ML studies. The technical approaches to be looked
at include those discussed in the workshop; cf. the talks of Abid (Horovod),
Alexei (PanDA), and Amir. It was agreed at the workshop to define concrete
objectives for the WG by the end of September, so that will be the main topic
of this meeting.
Abid & Torre
_______________________________________________
Usatlas-hllhc-computing-l mailing list
Usatlas-hllhc-computing-l AT lists.bnl.gov
https://lists.bnl.gov/mailman/listinfo/usatlas-hllhc-computing-l
- [Usatlas-hllhc-computing-l] First meeting of the distributed training working group, Torre Wenaus, 08/07/2018
- Re: [Usatlas-hllhc-computing-l] First meeting of the distributed training working group, Robert Gardner, 08/10/2018
- Re: [Usatlas-hllhc-computing-l] First meeting of the distributed training working group, Torre Wenaus, 08/10/2018
- Re: [Usatlas-hllhc-computing-l] First meeting of the distributed training working group, Torre Wenaus, 08/10/2018
- Re: [Usatlas-hllhc-computing-l] First meeting of the distributed training working group, Torre Wenaus, 08/10/2018
- Re: [Usatlas-hllhc-computing-l] First meeting of the distributed training working group, Malik,Abid, 08/13/2018
- Re: [Usatlas-hllhc-computing-l] First meeting of the distributed training working group, Torre Wenaus, 08/14/2018
- Re: [Usatlas-hllhc-computing-l] First meeting of the distributed training working group, Malik,Abid, 08/14/2018
- Re: [Usatlas-hllhc-computing-l] First meeting of the distributed training working group, Torre Wenaus, 08/14/2018
- Re: [Usatlas-hllhc-computing-l] First meeting of the distributed training working group, Malik,Abid, 08/21/2018
- Re: [Usatlas-hllhc-computing-l] First meeting of the distributed training working group, Torre Wenaus, 08/21/2018
- Re: [Usatlas-hllhc-computing-l] First meeting of the distributed training working group, Malik,Abid, 08/21/2018
- Re: [Usatlas-hllhc-computing-l] First meeting of the distributed training working group, Torre Wenaus, 08/21/2018
- Re: [Usatlas-hllhc-computing-l] First meeting of the distributed training working group, Malik,Abid, 08/21/2018
- Re: [Usatlas-hllhc-computing-l] First meeting of the distributed training working group, Torre Wenaus, 08/28/2018
- Re: [Usatlas-hllhc-computing-l] First meeting of the distributed training working group, Torre Wenaus, 08/28/2018