
star-fwd-software-l - Re: [[Star-fwd-software-l] ] Job submission issue (and solution)

star-fwd-software-l AT lists.bnl.gov

Subject: FWD Software

  • From: "Kapukchyan, David" <david.kapukchyan AT cvut.cz>
  • To: "Van Buren, Gene" <gene AT bnl.gov>, "Jindal, Nicholas" <jindal.78 AT buckeyemail.osu.edu>
  • Cc: "star-fwd-software-l AT lists.bnl.gov" <star-fwd-software-l AT lists.bnl.gov>
  • Subject: Re: [[Star-fwd-software-l] ] Job submission issue (and solution)
  • Date: Fri, 24 Apr 2026 14:44:54 +0000

Hello Gene,

The issue was that when storage was set to local, the “storage” location returned by the get_file_list.pl command was some qgp node, as can be seen in the attachment. As a result, the condor job file gained an additional requirement that the job node also match this qgp node, which doesn’t exist; the jobs therefore stayed idle and were kicked out after the time limit. It is unclear whether this is a problem of the STAR scheduler seeing this node and adding it to the job requirements for some reason, or an error in what get_file_list.pl returns for “storage”. Switching to “NFS” for the storage requirement was only a test to confirm that this was indeed what was happening, and I agree it is a temporary fix.

Best,
David

PS: Nick was using star-submit-beta.
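
For anyone who wants to check which nodes the catalog reports before submitting, here is a minimal Python sketch that tallies (storage, node) pairs from saved get_file_list.pl output. It assumes one record per line, with fields in the same order as the -keys list from Nick's command and separated by "::"; that delimiter is an assumption, so adjust it to whatever your output actually uses. The sample records and node names are made up for illustration and are not real catalog output.

```python
from collections import Counter

# Keys in the order passed to get_file_list.pl -keys in this thread.
KEYS = ["fdid", "storage", "site", "node", "path", "filename", "events"]


def nodes_by_storage(lines, delim="::"):
    """Count (storage, node) pairs in file-list output.

    Assumes one record per line, fields separated by `delim` and
    ordered as in KEYS. The delimiter is an assumption about the format.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = line.split(delim)
        if len(fields) != len(KEYS):
            continue  # skip lines that do not look like records
        record = dict(zip(KEYS, fields))
        counts[(record["storage"], record["node"])] += 1
    return counts


# Made-up sample records (hypothetical nodes and paths, not real output).
sample = [
    "1::local::bnl::hypothetical-node01::/some/path::a.picoDst.root::100",
    "2::local::bnl::hypothetical-node01::/some/path::b.picoDst.root::200",
    "3::NFS::bnl::hypothetical-nfs::/some/nfs/path::c.picoDst.root::300",
]

summary = nodes_by_storage(sample)
for (storage, node), n in sorted(summary.items()):
    print(f"{storage:6s} {node:24s} {n} file(s)")
```

Any node listed under storage=local that is not an existing worker node would be a candidate for the unmatched job requirement described above.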

From: star-fwd-software-l-request AT lists.bnl.gov <star-fwd-software-l-request AT lists.bnl.gov> on behalf of Van Buren, Gene <gene AT bnl.gov>
Date: Friday, April 24, 2026 at 04:08
To: Jindal, Nicholas <jindal.78 AT buckeyemail.osu.edu>
Cc: star-fwd-software-l AT lists.bnl.gov <star-fwd-software-l AT lists.bnl.gov>
Subject: Re: [[Star-fwd-software-l] ] Job submission issue (and solution)

Hi, Nick

Thanks for the explanation.... Some comments....

1) storage=nfs is not a reliable choice in future analysis passes, as I do not have enough room to hold everything on NFS, and will begin deleting there in a month or so (the production will take many more months). But xrootd will house the entire dataset.

2) I believe that storage=local should work, as other people running other analyses are doing. It would be helpful to me if you could give a specific example of a file which succeeded using nfs but failed using xrootd.

-Gene


> On Apr 23, 2026, at 12:09 PM, Jindal, Nicholas <star-fwd-software-l AT lists.bnl.gov> wrote:
>
> Hi All,
>
> David helped me solve this job submission issue I was having on the new nodes so I thought I’d pass along the fix. Before switching over to the alma9 nodes, getting a file list with something like
>
> 'get_file_list.pl -keys fdid,storage,site,node,path,filename,events -cond production=P25ib,trgsetupname=production_pp500_2022,filetype=daq_reco_picodst,storage=local'
>
> worked just fine when used in an xml submission script like this:
> '<input URL="catalog:star.bnl.gov?production=P25ib,trgsetupname=production_pp500_2022,filetype=daq_reco_picodst,storage=local" nFiles="1" />'.
>
> Now, using 'storage=local' seems to impose a requirement on the node the jobs run on, which doesn’t match any of the alma9 nodes, so the jobs fail. Changing 'storage=local' to 'storage=nfs' fixes it. I’ve attached a screenshot of what the get_file_list.pl command returns when asking for 'storage=local'.
>
> I am curious if anyone, particularly Gene, knows why this failed without this change?
>
> Best,
> --
> Nicholas Jindal
> PhD Candidate
> The Ohio State University Department of Physics
> jindal.78 AT osu.edu
> <Screenshot 2026-04-23 at 12.01.09 PM.png>




