phys-npps-mgmt-l AT lists.bnl.gov
Subject: NPPS Leadership Team
Re: [Phys-npps-mgmt-l] Discussion on Friday on the strategy input, and AI/ML LDRD prospects
- From: pinkenburg <pinkenburg AT bnl.gov>
- To: phys-npps-mgmt-l AT lists.bnl.gov
- Subject: Re: [Phys-npps-mgmt-l] Discussion on Friday on the strategy input, and AI/ML LDRD prospects
- Date: Fri, 19 Mar 2021 14:26:35 -0400
I think we should hear from the ECCE computing convener - I'd like to meet him. I can ping John Lajoie about it.
Chris
On 3/19/2021 2:07 PM, Torre Wenaus via Phys-npps-mgmt-l wrote:
I talked to Jamie about NPPS EIC
work. He agreed that considerations of sPHENIX synergy are a
significant factor, but he also emphasised that NPPS can't be
seen as favoring one proto-collaboration over another (which
of course works both ways). That makes sense, and I thought
we were on solid ground with the common project. So
conversations with both ECCE and IP6 on their software plans
will be important. It's good that Alexander is an IP6 leader
and it sounds like he's already working hard to turn people
towards a sensible plan. I'll ask him to give an IP6
overview in our simu tech meeting next week if possible. Who
would be a good ECCE person to hear from? I don't know their
designated computing co-convener Cristiano Fanelli, and
last I heard the other co-convener slot is unfilled as yet.
I also talked to Jamie about the
Makoto common project. He was interested and supportive, but
not optimistic about BNL being able to support a second
phase because of an $8M reduction in his budget next year.
We agreed to discuss it a couple of months into the project
(i.e. ~3 months from now).
Torre
On Thu, Mar 18, 2021 at 8:59 AM Torre Wenaus <wenaus AT gmail.com> wrote:
Hi,
I don't expect any super interesting results from the PPS activity in GPU batching either, because I think they are taking too narrow a view of the problem: they are not treating it as a workflow problem broader than the framework, which I think it is. A "GPU service" together with data marshalling, dispatch, merge, etc. services is how I think it should be approached. We are well suited for that with event streaming and iDDS. I raised the workflow element in PPS and it was squashed without a trace, because anything workload/workflow always is. So I'm all for pulling this in house and keeping it separate from PPS. I think we have capabilities that can enable us to do something more general than FNAL's SONIC. And as Brett says, the fact that SONIC is there is all the more reason to keep our distance from PPS on this. In that orbit we'll be told what we can and cannot do, and whatever we do, the credit will go elsewhere. So let's do it. It's in the same analysis services territory where Tadashi et al. are finding a new use case every month. Even without an LDRD it may be interesting to look at GPUaaS in the Wire-Cell context as another PanDA/iDDS use case.
Torre
On Wed, Mar 17, 2021 at 2:45 PM Brett Viren <bv AT bnl.gov> wrote:
"Laycock, Paul via Phys-npps-mgmt-l" <phys-npps-mgmt-l AT lists.bnl.gov> writes:
> LDRD Type A - DUNE part 2 could be a good match, but reading this now I
> don’t know how much overlap there is with the HEP-CCE work. Brett?
The overlap would be of our own making or breaking. We can keep the
work away from HEP-CCE/PPS. Or, with not much of a nudge, it might be
taken up by HEP-CCE/PPS.
Maybe too long to read, but more info on what that means:
The entry point here is that HEP-CCE/PPS folks are recently interested
in "GPU batch processing optimization" (note: not panda/condor's
definition of "batch").
This "batch" is mostly about sending ever larger contiguous chunks of
data to ever more sophisticated GPU algorithms in order to fight
throughput losses due to the latency involved in doing the same overall
algorithm but with smaller chunks sent more frequently between CPU and
GPU and back. There is one way to use WC sim to do "A/B" tests of the
two approaches and that is being considered in PPS. Frankly I don't
expect any super interesting results. But, the work would likely have a
side effect of getting more WC algs working on GPU which would be
welcome.
But then, "GPU batch optimization" can also be interpreted as how to not
stall-out the GPU for lack of input. This interpretation also came up
in the PPS discussion and it leads right to Wire-Cell's "GPU service"
idea. So far I've held back in the PPS discussions from suggesting
the WC GPU service as a topic of work, in case EDG, NPPS or others
"closer to home" could work on it.
There is some special concern here. FNAL has already done some work
using some proprietary NVIDIA GPU-as-a-service package. OTOH, WC's
solution uses open protocols. WC started first but has had a tiny
fraction of the effort that FNAL has put in. So, they already have a note
out and we have prototype software. Also, FNAL has representatives on
HEP-CCE/PPS. So, I worry that pushing the "GPU service" idea in PPS
turns into "apply FNAL's idea". I'm not at all interested in that.
In any case, I think DUNE and other HEP experiments need *something* in
this area to efficiently use CPUs+GPUs in a production setting. I think
an LDRD case could be made. Maybe Type B. But, I do have doubts if
Physics would want to support one. Getting a feel from the department
on their will in this direction would be a very useful outcome of this
strategy request.
-Brett.
-- Torre Wenaus, BNL NPPS Group, ATLAS Experiment
-- BNL 510A 1-222 | 631-681-7892 | wenaus AT gmail.com | npps.bnl.gov | wenaus.com
-- NPPS Mattermost room: https://chat.sdcc.bnl.gov/npps/channels/town-square
--
Christopher H. Pinkenburg ; pinkenburg AT bnl.gov ; http://www.phenix.bnl.gov/~pinkenbu
Brookhaven National Laboratory ; phone: (631) 344-5692
Physics Department Bldg 510 C ; fax: (631) 344-3253
Upton, NY 11973-5000