
star-fcv-l - Re: [Star-fcv-l] [Starsoft-l] Hard to access full statistics of datasets, any solution?

star-fcv-l AT lists.bnl.gov

Subject: STAR Flow, Chirality and Vorticity PWG

  • From: Ding Chen <dchen087 AT ucr.edu>
  • To: ChuanFu <fuchuan AT mails.ccnu.edu.cn>
  • Cc: jeromel <jeromel AT bnl.gov>, STAR Software issues of broad interest <starsoft-l AT lists.bnl.gov>, "STAR Flow, Chirality and Vorticity PWG" <star-fcv-l AT lists.bnl.gov>
  • Subject: Re: [Star-fcv-l] [Starsoft-l] Hard to access full statistics of datasets, any solution?
  • Date: Sat, 4 Dec 2021 22:36:33 -0500

Dear Chuan, all

Thank you for your package! I've tested it on production_26p5GeV_fixedTarget_2018 and found that:

3134/8439 (37%) of the picoDsts of this production are not accessible.

Here is an example: a readable file can simply be xrdcp'd to an RCF path, while an unreadable file will often return a 3011 error or get stuck for an indefinite time:
[rcas6006] ~/7p2_check/> xrdcp root://xrdstar.rcf.bnl.gov:1095//home/starlib/home/starreco/reco/production_26p5GeV_fixedTarget_2018/ReversedFullField/P19ie.SL20d/2018/158/19158057/st_physics_19158057_raw_1500003.picoDst.root .
[0B/0B][100%][==================================================][0B/s]  
Run: [ERROR] Server responded with an error: [3011] No servers are available to read the file.
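
For anyone who wants to run a similar check themselves, here is a minimal sketch of the idea (this is not Chuan's actual package; the list name, output files and the 300 s timeout are just placeholders):

#!/bin/bash
# Sketch: try to copy every picoDst in a list from Xrootd and record the failures.
# Assumes the list contains absolute /home/starlib/... paths, one per line.
INLIST=26p5_local.list      # placeholder name for the full file list
GOOD=readable.list
BAD=unreadable.list
: > "$GOOD"; : > "$BAD"
while read -r f; do
    # A copy either succeeds, fails (e.g. with the [3011] error above),
    # or hangs; the timeout catches the hanging case.
    if timeout 300 xrdcp -f "root://xrdstar.rcf.bnl.gov:1095/$f" /dev/null >/dev/null 2>&1; then
        echo "$f" >> "$GOOD"
    else
        echo "$f" >> "$BAD"
    fi
done < "$INLIST"
echo "unreadable: $(wc -l < "$BAD") of $(wc -l < "$INLIST")"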

 Thank you all for your input and discussion. I think it's a vital issue.

Since the S&C general meeting has already proposed a nice solution:

1) For the near term, we plan to ask the PWGC to identify a limited number of datasets (it cannot be all of them, and the set should preferably be small) that are widely used and have high priority. We can then make duplicate copies on other Xrootd servers to reduce the load on a single server.

With QM2022 approaching, I'd like to suggest that the PWGC select some BES-II and FXT datasets for duplication, including but not limited to production_19GeV_2019, production_3p85GeV_fixedTarget_2018, production_26p5GeV_fixedTarget_2018 ...


You can find the full file list in 
/star/data01/pwg/dchen/Ana/7p2GeV_FXT_2018_psi2/check/EpdAna/26p5_local.list
and the list of unreadable files in
/star/data01/pwg/dchen/Ana/7p2GeV_FXT_2018_psi2/check/EpdAna/ResubFinal.list

Thank you for your attention!
Best,
Ding


---------- Forwarded message ---------
From: Xin Dong via Starsoft-l <starsoft-l AT lists.bnl.gov>
Date: Wed, Dec 1, 2021 at 3:58 PM
Subject: Re: [Starsoft-l] [Star-fcv-l] Hard to access full statistics of datasets, any solution?
To: Ziyue Zhang <zzhan70 AT uic.edu>, STAR Software issues of broad interest <starsoft-l AT lists.bnl.gov>


Dear Colleagues,

We discussed this issue at today's S&C general meeting. Given the symptoms observed, it is likely that this specific dataset is not evenly distributed across the Xrootd servers, and we currently have only a limited number of servers, so too many jobs create a traffic problem on a single server. Therefore, Chuan's workaround of splitting the submission into several small batches seems to be a reasonable temporary solution. We discussed how to improve the situation in the near term and in the longer term.

1) For the near term, we plan to ask the PWGC to identify a limited number of datasets (it cannot be all of them, and the set should preferably be small) that are widely used and have high priority. We can then make duplicate copies on other Xrootd servers to reduce the load on a single server.

2) For the longer term, the SDCC is looking into the possibility of increasing the number of Xrootd servers.

Thank you all for reporting the findings. Please follow up with your relevant PWG(s) to discuss item 1) above regarding the high-priority datasets.

Best Regards

/xin

On Wed, Dec 1, 2021 at 12:32 AM ChuanFu <fuchuan AT mails.ccnu.edu.cn> wrote:
Dear Racz and Ding,
I also run into a similar issue when the input picoDsts are read from local storage (root://xrdstar.rcf.bnl.gov:1095//home/starlib/home/starreco/reco/....).
The number of lost events is reduced noticeably when I use the following method (for 3.85 GeV):
1) Get the full data list (~13000 picoDsts) using the following command:
get_file_list.pl -keys path,filename -cond production=P19ie,library=SL20d,trgsetupname=production_3p85GeV_fixedTarget_2018,filetype=daq_reco_picoDst,filename~st_physics,storage=LOCAL -limit 0 -delim "/" > 3p85_local.list
2) Divide the full data list into 4 sublists (sublist1, sublist2, sublist3, sublist4); a sketch of this and step 3) is given below the list.
3) Submit jobs using an '.xml' template (<job fileListSyntax="xrootd" maxFilesPerProcess="10" simulateSubmission="false">) with sublist1 as input. After 1~2 hours (depending on how many of your jobs are actually running; if your jobs have not started running yet, wait longer), submit jobs with sublist2 as input, then after another 1~2 hours submit jobs with sublist3, and so on.
The purpose of this is to avoid having too many picoDsts (e.g. more than 500 jobs) read from local storage at the same time.
Here are my submission scripts: /star/u/fuchuan/3_85FXT/Analysis/v0Tree_Proton_Lm/submitAll.sh (and submit.xml)
4) After all jobs are finished (about half a day), you can find in your log files the input picoDsts that were not read, and resubmit a list of those picoDsts (a second sketch of this step is also given below).
Here is my script for finding the unreadable picoDsts: /star/u/fuchuan/3_85FXT/Analysis/v0Tree_Proton_Lm/Find3011Err.sh
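
For reference, a minimal sketch of steps 2) and 3) could look like the following (the file names, the 2-hour pause, and the per-sublist XML copies are assumptions; my real scripts are the submitAll.sh and submit.xml above):

#!/bin/bash
# Sketch of steps 2) and 3): split the full list into 4 sublists and
# submit them one batch at a time with a pause in between.
split -n l/4 -d 3p85_local.list sublist     # creates sublist00 ... sublist03
for sub in sublist0*; do
    # submit_${sub}.xml is assumed to be a copy of submit.xml whose
    # <input URL="filelist:..."/> points at the corresponding sublist.
    star-submit submit_${sub}.xml
    sleep 7200                              # wait ~2 hours before the next batch
done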
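
And a rough sketch of step 4) (the log directory, log naming, and the exact error string are assumptions; the actual logic is in Find3011Err.sh above):

#!/bin/bash
# Sketch of step 4): collect the picoDst URLs from any log that hit a 3011
# error, so they can be resubmitted. This gathers all picoDsts of the failed
# jobs (a superset of the truly unreadable files), which is enough for resubmission.
LOGDIR=./log                 # assumed location of the job log files
OUT=resubmit.list
for log in "$LOGDIR"/*.log; do
    if grep -q "\[3011\]" "$log"; then
        grep -o "root://[^ ]*picoDst\.root" "$log"
    fi
done | sort -u > "$OUT"
echo "collected $(wc -l < "$OUT") picoDst files for resubmission"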

I am not sure whether the above method is useful for you, but you could try it if you do not have a better one.

Best regards,
Chuan
 
------------------ Original ------------------
From: "Cameron Racz via Star-fcv-l" <star-fcv-l AT lists.bnl.gov>
Date: Wed, Dec 1, 2021 11:12 AM
To: "Ding Chen" <dchen087 AT ucr.edu>; "STAR Software issues of broad interest" <starsoft-l AT lists.bnl.gov>
Cc: "jeromel" <jeromel AT bnl.gov>; "STAR Flow, Chirality and Vorticity PWG" <star-fcv-l AT lists.bnl.gov>
Subject: Re: [Star-fcv-l] [Starsoft-l] Hard to access full statistics of datasets, any solution?
 
To add some more data to this: for my analysis of production_3p85GeV_fixedTarget_2018 (library SL20d), the number of picoDsts I can access fluctuates wildly between attempts to analyze the dataset. I should see around 275M good events, but my most recent attempt accessed fewer than 25M successfully.

Since my flow analysis requires multiple iterations over the same data, it's becoming difficult to get any meaningful results. I will also need reliable access to the 7.2 GeV data that Ding mentions in order to fully prepare for the Quark Matter conference, and this data problem is really slowing that progress down.

Cameron Racz
Graduate Student
Dept. of Physics & Astronomy
University of California, Riverside




On Nov 30, 2021, at 9:32 PM, Ding Chen via Starsoft-l <starsoft-l AT lists.bnl.gov> wrote:

Dear FCV and experts,

I want to complain that it is hard to access the full statistics of many datasets, and it's not just me.

For Run19 19.6 GeV, analyzers find themselves needing to re-submit more than 8 times to get more than 80% of the statistics.

For Run18 FXT 3 (3.85) GeV data, analyzers find the statistics are 20% lower and need to re-submit multiple times to reach 90% of the statistics.

For Run18 FXT 7.2 (26.5) GeV data, I can only get less than 50% of the full statistics.

Adding to that, many of us are plagued by the notorious "3011" error, which will kill the whole job if even one file hits it.

When the 7.2 GeV data was stored on NFS, I had no issue accessing the full statistics. I suspect it's due to some issue with the distributed disk (DD), or the communication with it, but I'm no expert on that.

Since this impacts many analyses, I'd like to know why this problem occurs and, more importantly, whether there is any solution.

Best regards,
Ding
--
Ding Chen
Graduate student - University of California, Riverside

_______________________________________________
Starsoft-l mailing list
Starsoft-l AT lists.bnl.gov
https://lists.bnl.gov/mailman/listinfo/starsoft-l



