
phys-npps-mgmt-l - Re: [Phys-npps-mgmt-l] Discussion on Friday on the strategy input, and AI/ML LDRD prospects

  • From: Torre Wenaus <wenaus AT gmail.com>
  • To: pinkenburg <pinkenburg AT bnl.gov>
  • Cc: NPPS leadership team <phys-npps-mgmt-l AT lists.bnl.gov>
  • Subject: Re: [Phys-npps-mgmt-l] Discussion on Friday on the strategy input, and AI/ML LDRD prospects
  • Date: Mon, 22 Mar 2021 11:17:49 -0400

Alexander thinks it's premature for him to talk about IP6 software plans, but he will attend on Friday.
Hong emailed me and Eric over the weekend saying, basically, that members of our groups shouldn't be joining the proto-collaborations at this point and that our focus should be on contributing to the common good, which is consistent with our thinking.
  Torre

On Fri, Mar 19, 2021 at 2:36 PM Torre Wenaus <wenaus AT gmail.com> wrote:
Sounds good

On Fri, Mar 19, 2021 at 2:26 PM pinkenburg via Phys-npps-mgmt-l <phys-npps-mgmt-l AT lists.bnl.gov> wrote:
I think we should hear from the ECCE computing convener - I'd like to meet him. I can ping John Lajoie about it.

Chris


On 3/19/2021 2:07 PM, Torre Wenaus via Phys-npps-mgmt-l wrote:
I talked to Jamie about NPPS EIC work. He agreed that considerations of sPHENIX synergy are a significant factor, but he also emphasised that NPPS can't be seen as favoring one proto-collaboration over another (which of course works both ways). Which makes sense, and I thought we were on solid ground with the common project. So conversations with both ECCE and IP6 on their software plans will be important. It's good that Alexander is an IP6 leader, and it sounds like he's already working hard to turn people towards a sensible plan. I'll ask him to give an IP6 overview in our simu tech meeting next week if possible. Who would be a good ECCE person to hear from? I don't know their designated computing co-convener, Cristiano Fanelli, and last I heard the other co-convener slot is unfilled as yet.

I also talked to Jamie about the Makoto common project. He was interested and supportive, though not optimistic about BNL being able to support a second phase because of an $8M reduction in his budget next year, but we agreed to discuss it a couple of months into the project (i.e. ~3 months from now).

  Torre

On Thu, Mar 18, 2021 at 8:59 AM Torre Wenaus <wenaus AT gmail.com> wrote:
Hi,
I don't expect any super interesting results from the PPS activity in GPU batching either, because I think they are taking too narrow a view of the problem: they are not treating it as a workflow problem broader than the framework, which I think it is. A "GPU service" together with data marshalling, dispatch, merge etc. services is how I think it should be approached, and for that we are well suited with event streaming and iDDS. I raised the workflow element in PPS and it was squashed without a trace, because anything workload/workflow related always is. So I'm all for pulling this in house and separate from PPS.

I think we have capabilities that can enable us to do something more general than FNAL's SONIC. And as Brett says, the fact that SONIC is there is all the more reason to keep our distance from PPS on this. In that orbit we'll be told what we can and cannot do, and whatever we do the credit will go elsewhere. So let's do it. It's in the same analysis services territory where Tadashi et al are finding a new use case every month. Even without LDRD it may be interesting to look at GPUaaS in the Wire-Cell context as another PanDA/iDDS use case.
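
To make that shape concrete, here's a rough Python sketch of the marshal / dispatch-to-GPU-service / merge view I mean. The names are purely illustrative, not existing PanDA, iDDS or Wire-Cell interfaces:

# Rough sketch of the workflow-level view: marshal event chunks from a
# stream, hand them to a "GPU service", and merge the outputs back into
# one ordered result stream. All names are illustrative; this is not
# PanDA, iDDS, or Wire-Cell API.

def event_stream(n_events):
    """Stand-in for an event-streaming source."""
    for i in range(n_events):
        yield {"event": i, "payload": [i, i + 1]}

def marshal(stream, chunk_size):
    """Group events into GPU-sized chunks (the data marshalling step)."""
    chunk = []
    for ev in stream:
        chunk.append(ev)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk

def gpu_service(chunk):
    """Placeholder for the remote GPU call; here it just sums payloads."""
    return [{"event": ev["event"], "result": sum(ev["payload"])} for ev in chunk]

def merge(per_chunk_results):
    """Reassemble per-chunk outputs into one event-ordered list."""
    flat = [r for chunk in per_chunk_results for r in chunk]
    return sorted(flat, key=lambda r: r["event"])

if __name__ == "__main__":
    dispatched = (gpu_service(c) for c in marshal(event_stream(10), chunk_size=4))
    for rec in merge(dispatched):
        print(rec)

In the real thing the gpu_service call would be a remote request and event streaming / iDDS would feed the marshalling step, but the division of labour is the point.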
  Torre

On Wed, Mar 17, 2021 at 2:45 PM Brett Viren <bv AT bnl.gov> wrote:
"Laycock, Paul via Phys-npps-mgmt-l" <phys-npps-mgmt-l AT lists.bnl.gov>
writes:

> LDRD Type A - DUNE part 2 could be a good match, but reading this now I
> don’t know how much overlap there is with the HEP-CCE work... Brett?

The overlap would be one of our own making or breaking.  We can keep the
work away from HEP-CCE/PPS.  Or, without much of a nudge, it might be taken
up by HEP-CCE/PPS.


Maybe too long to read, but more info on what that means:

The entry point here is that HEP-CCE/PPS folks are recently interested
in "GPU batch processing optimization" (note: not PanDA/Condor's
definition of "batch").

This "batch" is mostly about sending ever larger contiguous chunks of
data to ever more sophisticated GPU algorithms in order to fight
throughput losses due to the latency involved in doing the same overall
algorithm but with smaller chunks sent more frequently between CPU and
GPU and back.  There is one way to use WC sim to do "A/B" tests of the
two approaches and that is being considered in PPS.  Frankly I don't
expect any super interesting results.  But, the work would likely have a
side effect of getting more WC algs working on GPU which would be
welcome.
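
To put a number on the latency argument, here is a toy Python model; the latency and per-item costs are made-up stand-ins, not measurements from WC or any real GPU:

# Toy model of GPU "batching": amortizing a fixed per-transfer latency
# over the chunk size. The numbers are illustrative placeholders, not
# measurements.

def throughput(n_items, chunk_size, latency_s=1e-3, per_item_s=1e-6):
    """Items/sec when n_items are processed in chunks of chunk_size.

    Each chunk pays a fixed CPU<->GPU transfer latency plus a per-item
    compute cost.
    """
    n_chunks = -(-n_items // chunk_size)   # ceiling division
    total_s = n_chunks * latency_s + n_items * per_item_s
    return n_items / total_s

if __name__ == "__main__":
    n = 100_000
    for chunk in (1, 10, 100, 1_000, 10_000):
        print(f"chunk={chunk:>6}: {throughput(n, chunk):12.0f} items/s")

With these toy numbers, throughput climbs steeply until the fixed latency is amortized and then flattens toward the per-item limit, which is the effect the PPS batching work is chasing.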

But then, "GPU batch optimization" can also be interpreted as how to not
stall-out the GPU for lack of input.  This interpretation also came up
in the PPS discussion and it leads right to Wire-Cell's "GPU service"
idea.  So far I've tried to hold my tongue in the PPS discussions to
suggest WC GPU service as a topic of work in case EDG, NPPS or others
"closer to home" could work on it.

There is some special concern here.  FNAL has already done some work
using some proprietary nVidia GPU-as-a-service package.  OTOH, WC's
solution uses open protocols.  WC's started first but has a tiny
fraction of the effort that FNAL's put up.  So, they already have a note
out and we have prototype software.  Then, FNAL has representatives on
HEP-CCE/PPS.  So, I worry that pushing the "GPU service" idea in PPS
turns into "apply FNAL's idea".  I'm not at all interested in that.


In any case, I think DUNE and other HEP experiments need *something* in
this area to efficiently use CPUs+GPUs in a production setting.  I think
an LDRD case could be made, maybe Type B, but I do have doubts about
whether Physics would want to support one.  Getting a feel for the
department's will in this direction would be a very useful outcome of
this strategy request.

-Brett.


--
-- Torre Wenaus, BNL NPPS Group, ATLAS Experiment
-- BNL 510A 1-222 | 631-681-7892 |  wenaus AT gmail.com | npps.bnl.gov | wenaus.com
-- NPPS Mattermost room: https://chat.sdcc.bnl.gov/npps/channels/town-square



_______________________________________________
Phys-npps-mgmt-l mailing list
Phys-npps-mgmt-l AT lists.bnl.gov
https://lists.bnl.gov/mailman/listinfo/phys-npps-mgmt-l

-- 
*************************************************************

Christopher H. Pinkenburg	;    pinkenburg AT bnl.gov
				;    http://www.phenix.bnl.gov/~pinkenbu

Brookhaven National Laboratory	;    phone: (631) 344-5692
Physics Department Bldg 510 C	;    fax:   (631) 344-3253
Upton, NY 11973-5000

*************************************************************




