star-hp-l - Re: [Star-hp-l] STAR presentation by Youqi Song for Hot Quarks 2022 submitted for review

star-hp-l AT lists.bnl.gov

Subject: STAR HardProbes PWG

  • From: Youqi Song <youqi.song AT yale.edu>
  • To: Sooraj Radhakrishnan <skradhakrishnan AT lbl.gov>
  • Cc: STAR HardProbes PWG <star-hp-l AT lists.bnl.gov>
  • Subject: Re: [Star-hp-l] STAR presentation by Youqi Song for Hot Quarks 2022 submitted for review
  • Date: Thu, 6 Oct 2022 14:53:07 -0400

Hi Sooraj,

Thanks for the questions. Please find my responses below.

S16: What are these plots comparing? Unfolded distributions to detector level ones? What is the conclusion you want to draw here?
Yes, they show the detector level distributions compared to the unfolded ones. The point is to show that MultiFold is unfolding 6 observables simultaneously and is unbinned. The difference between the red and the black is to show the need for unfolding and the effect of unfolding.

S18: what are these weights used for? It's not discussed in the slides
These are the weights such that when data is weighted by w(x), it is statistically identical to sim. The final output of MultiFold, obtained after several iterations of 2 steps each, is a generalized version of these weights; it looks like the "weight" column in the table on slide 15.

S21: in the analysis where do we want to distinguish between jets from data and simulation? Isn't that something we know? Maybe the discussion on these parts can be improved
Yes, we know the difference between data and simulation, but the point here is to let the neural network learn this difference through training. After training, we can use the neural network output f(x) to approximate w(x). If there were more time, I would add more slides to clarify the details, but with the constraints we have, I think I'll just explain this verbally and avoid having too many slides that look technical in terms of the machine learning algorithm.

S22: the lower box, is this an explanation of where machine learning comes into play?
Yes, this is to summarize that we've turned the traditional iterative Bayesian unfolding problem into an exercise in reweighting histograms, and that we can train neural networks to estimate these weights, where the training is done on a classification problem.
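
To make the answers to S18 and S21 concrete, here is a minimal sketch of classifier-based reweighting. It is not the MultiFold or OmniFold implementation: the Gaussian toy samples, the hand-written logistic regression standing in for the neural network, and all names (train_classifier, weight, etc.) are assumptions made for illustration. It uses the standard result that, for a classifier f(x) trained to separate two equally sized samples, f(x) / (1 - f(x)) approximates the ratio of their densities, and it follows the convention above where data weighted by w(x) should look like sim.

```python
import math
import random

random.seed(0)

# Toy 1D samples (assumed for illustration only).
n = 4000
data = [random.gauss(0.5, 1.0) for _ in range(n)]  # stand-in for "data"
sim = [random.gauss(0.0, 1.0) for _ in range(n)]   # stand-in for "sim"


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_classifier(data, sim, steps=300, lr=0.5):
    """Fit f(x) = sigmoid(a*x + b) with sim labelled 1 and data labelled 0."""
    a, b = 0.0, 0.0
    samples = [(x, 0.0) for x in data] + [(x, 1.0) for x in sim]
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y in samples:
            err = sigmoid(a * x + b) - y  # gradient of the cross-entropy loss
            grad_a += err * x
            grad_b += err
        a -= lr * grad_a / len(samples)
        b -= lr * grad_b / len(samples)
    return a, b


a, b = train_classifier(data, sim)


def weight(x):
    """w(x) = f(x) / (1 - f(x)), an estimate of p_sim(x) / p_data(x)."""
    f = sigmoid(a * x + b)
    return f / (1.0 - f)


weights = [weight(x) for x in data]
data_mean = sum(data) / n
sim_mean = sum(sim) / n
weighted_mean = sum(w * x for w, x in zip(weights, data)) / sum(weights)

print(f"unweighted data mean: {data_mean:.3f}")
print(f"weighted data mean:   {weighted_mean:.3f}")
print(f"sim mean:             {sim_mean:.3f}")
```

After reweighting, the mean of the toy "data" should move close to the mean of the toy "sim", which is the one-dimensional analogue of the two samples becoming statistically identical.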

Best,
Youqi



On Thu, Oct 6, 2022 at 7:26 AM Sooraj Radhakrishnan <skradhakrishnan AT lbl.gov> wrote:
Hi Youqi,
   Please find some comments from me on your nicely prepared slides and this nice set of results.

S16: What are these plots comparing? Unfolded distributions to detector level ones? What is the conclusion you want to draw here?
S18: what are these weights used for? It's not discussed in the slides
S21: in the analysis where do we want to distinguish between jets from data and simulation? Isn't that something we know? Maybe the discussion on these parts can be improved
S22: the lower box, is this an explanation of where machine learning comes into play?

thanks
Sooraj



On Thu, Oct 6, 2022 at 11:04 AM Nihar Sahoo via Star-hp-l <star-hp-l AT lists.bnl.gov> wrote:
Hello Youqi,

Thank you for answering my queries.
I don't have any further comments. It is good to see your analysis
results with STAR preliminary.

I sign off.


Cheers
Nihar


On 2022-10-06 02:05, Youqi Song wrote:
> Hi Nihar, Yi and Barbara,
>
> I implemented your suggestions in the updated slides:
> https://drupal.star.bnl.gov/STAR/system/files/hq_101522_v4.pdf
> I also moved several slides to backup in the interest of time.
>
> Response to Nihar's comments:
>
>> I recall we had a discussion on this topic: how to present these small
>> differences with sys. and stat. uncertainties such that, within total
>> uncertainties, the closure will validate this method. Overall it looks
>> good. Maybe we can discuss this topic later.
>
> Just to clarify, for systematic uncertainty of the closure test, it
> would just be the difference in closure between the nominal procedure
> and for example unfolding with a Herwig prior, correct? I included in
> slide 42 (in backup) the closure when unfolding with Herwig mass and
> charge weights. I chose to look at the Herwig weight variation for
> closure since it is for most cases the largest source of systematic
> uncertainty in data. In any case, I moved all the closure test slides
> to backup to fit in the 15 min of time, and I think the agreement with
> the RooUnfold result is also proof that MultiFold works.
>
>> Can you please inform me why we need these two samples?
>
> For the closure test, one sample serves as the prior, and the other
> serves as pseudodata and as the truth-level information to be compared
> with the unfolded result. I think this is the same procedure used in
> other people's analyses.
>
> Best,
> Youqi
>
> On Tue, Oct 4, 2022 at 7:03 AM Nihar Sahoo <nihar AT rcf.rhic.bnl.gov>
> wrote:
>
>> Hello Youqi,
>>
>> Thank you for implementing my suggestion.
>> Please find my replies inline.
>>
>> On 2022-10-03 23:08, Youqi Song wrote:
>>> Hi Barbara and Nihar,
>>>
>>> Please find my updated slides here
>>> https://drupal.star.bnl.gov/STAR/system/files/hq_1022_v2.pdf
>>>
>>> Response to comments (unmentioned ones are already implemented in
>> the
>>> slides):
>>>
>>>> - Make sure you are fine in terms of time.
>>>
>>> I'm planning to practice these two days, and if I run out of time,
>> I
>>> will move slides 13-22 to backup. And if that's not enough, I will
>>> also move slide 31 to backup.
>>>
>>>> _"Data" -> can you please mention what data that is? Do you mean
>>>> this is
>>>> pp200 Gev run12 data?
>>>> _ Please give some information about this "data"?
>>>
>>> (Previously on slide 9, now on slide 12). This "data" is
>>> PYTHIA6+GEANT simulation, so I put it in quotation marks. I could show
>>> these distributions for the actual data, but I assume that would
>>> require me to put in systematic and statistical uncertainties for all
>>> these observables, which might not be necessary for the goal of
>>> this slide, which is simply to show that MultiFold is unfolding 6
>>> observables simultaneously and is unbinned. The difference between the
>>> red and the black is to show the need for unfolding and the effect of
>>> unfolding.
>>>
>> Then I would not label it "Data". Just say "PYTHIA6+Geant" (it is
>> understood that is why you need Multifold)
>> Besides, can you please mention "p+p sqrt(s) = 200 GeV" outside or
>> inside the figures to indicate the collision energy?
>>
>>>> Can you say something about these weights? Like where and how do you
>>>> get them?
>>>
>>> (Previously on slide 12, now still on slide 12). These weights are
>>> exactly the output of MultiFold. (Would you like me to elaborate more
>>> on this?)
>> It would be good to put a few words there, although you can say it in
>> the presentation.
>>
>>>
>>>> There are two small plots, not visible at all.
>>>> Can you please make it bigger and clear, and mention how it is
>>>> related
>>>> to your neural network technique?
>>>
>>> (Previously on slides 18-19, now still on 18-19). I removed one of
>>> the plots and made the other one bigger. The choice of neural
>>> network activation functions is the default from the original OmniFold
>>> paper.
>>
>> Good.
>>>
>>>> _ M>1 GeV/c^2 -> Do you use the same cut for unfolding while
>>>> training
>>>> simulated from the real data? Or while making response matrix.
>>>
>>> (Now on slide 26). I used the same cuts for PYTHIA6+GEANT
>> simulation.
>> thanks for clarification.
>>>
>>>> _ I recall, we had a discussion earlier that we need systematic
>>>> uncertainty along with your statistical uncertainty for these
>> plots
>>>> in
>>>> order to validate this closure. Any progress in that direction.
>>>
>>>> _ For your jet pT case, there is a difference at some bins, I
>> think
>>>> if
>>>
>>>> you use your systematic uncertainties then it would be
>> consistent.
>>>> Any
>>>
>>>> comment?
>>>
>>> (Previously on slide 27, now on slide 29). I don't think this was
>>> brought up before for my analysis. Maybe it was for Monika's? The
>>> difference in pT here is mostly because the normalization is done per
>>> jet, not per event as is usually done for pT, so a small deviation
>>> at the low pT bin will cause a large deviation in the opposite
>>> direction at high pT.
>>
>> I recall we had a discussion on this topic: how to present these small
>> differences with sys. and stat. uncertainties such that, within total
>> uncertainties, the closure will validate this method. Overall it looks
>> good. Maybe we can discuss this topic later.
>>>
>>
>>>> _"embedding jets into 2 statistically independent samples " ->
>> what
>>>> are
>>>> those 2 statistical ind. samples? Need some explanation.
>>>
>>> I added slides 27-28 to clarify this. The statistically
>> independent
>>> samples are drawn randomly from matched jet pairs from PYTHIA and
>>> embedding.
>>>
>> Can you please inform me why we need these two samples?
>>
>>>> _Be prepared for it if somebody ask any comment on systematic
>>>> uncertainty comparison between two unfolding methods. Can you
>> please
>>>> mention here what would be your answer?
>>>
>>> I would say that the systematic uncertainty is roughly the same
>>> between RooUnfold and MultiFold, just by eyeballing the error band
>>> sizes on slide 30.
>> Good.
>>>
>>>> _I think it is important to show right plot with "STAR
>> preliminary".
>>>
>>> (Now on slide 31). I also put the figure here:
>>>
>>
> https://drupal.star.bnl.gov/STAR/system/files/20_25_all_err2_0925.pdf
>>>
>>>> _"Wider jets tend to have lower |Q|" -> how do you get this
>>>> conclusion?
>>>
>>> Since I used a track-pT-weighted definition of jet charge, a high-pT
>>> track will tend to make jet |Q| larger. And if a track has a high pT
>>> within a jet, it is likely to be in roughly the same direction as the
>>> jet, so the jet is more likely to be collimated. So collimated jets
>>> tend to have large |Q| and wider jets tend to have lower |Q|.
>>>
>> OK, then you need to use the following.
>> In this slide:
>> "... increasing pT" -> "increasing jet pT"
>>
>> Here you need to mention "jet pT" and "constituent pT of a jet" in
>> this slide.
>>
>>>> _ "Different fragmentation patter" -> do you mean it is because
>> of
>>>> their
>>>> different jet Mass?
>>>
>>> (Now on slide 33). I meant that it's because of both their jet
>> mass
>>> and charge. I think jet charge also relates to fragmentation since
>> it
>>> contains information about the track pT's.
>>>
>> OK.
>>
>> Thank you
>> Nihar
>>
>>> Best,
>>> Youqi
>>>
>>> On Sun, Oct 2, 2022 at 2:01 PM Youqi Song <youqi.song AT yale.edu>
>> wrote:
>>>
>>>> Hi Barbara and Nihar,
>>>>
>>>> Thanks for the suggestions! I will respond to the comments and
>>>> update a new version of slides by tomorrow. Nihar, since you
>> suggest
>>>> that I show the uncertainty plot as a preliminary figure, I
>> remade
>>>> it and attached it to this email. Please let me know if you have
>> any
>>>> comments for this figure.
>>>>
>>>> Best,
>>>> Youqi
>>>>
>>>> On Sun, Oct 2, 2022 at 10:17 AM Nihar Sahoo via Star-hp-l
>>>> <star-hp-l AT lists.bnl.gov> wrote:
>>>>
>>>>> Hello Youqi,
>>>>>
>>>>> Please find my comments on your nice presentation slides!
>>>>>
>>>>> Slide4-5:
>>>>> "Jet substructure measurements tell us …" -> "Jet substructure
>>>>> measurements can tell us …"
>>>>> (It can tell us something related to frag. and hadronization,
>> but
>>>>> not
>>>>> definitely)
>>>>>
>>>>> Slide8:
>>>>> _Iterative Bayesian Unfolding (please give reference)
>>>>> _"but this is this is the …" -> "but this is the …"
>>>>>
>>>>> Slide9:
>>>>> _"Data" -> can you please mention what data that is? Do you mean
>>>>> this is
>>>>> pp200 Gev run12 data?
>>>>> _ Please give some information about this "data"?
>>>>> _This slide appears abruptly after slide8, can you please
>>>>> introduce some
>>>>> information here?
>>>>> Slide10-11:
>>>>> All these expressions for Qj, M, zg, Rg, need to one slide
>>>>> discussion
>>>>> before showing the results. (Expressions are in small text size,
>>>>> will
>>>>> not be visible for audiences)
>>>>> Can you please add one slide before slide9-10?
>>>>>
>>>>> Slide12:
>>>>> Can you say something about this weights? Like where and how do
>>>>> you get
>>>>> this?
>>>>>
>>>>> Slide18-19:
>>>>> There are two small plots, not visible at all.
>>>>> Can you please make it bigger and clear, and mention how it is
>>>>> related
>>>>> to your neural network technique?
>>>>>
>>>>> Slide24,25,26:
>>>>> _mention which year pp data?
>>>>> _R=0.4 -> jet resolution parameter (R)=0.4
>>>>> _There are three different eta, (TPC, BEMC, and jet eta); make
>> it
>>>>> clear
>>>>> _ M>1 GeV/c^2 -> Do you use the same cut for unfolding while
>>>>> training
>>>>> simulated from the real data? Or while making response matrix.
>>>>>
>>>>> Slide27,
>>>>> _ I recall, we had a discussion earlier that we need systematic
>>>>> uncertainty along with your statistical uncertainty for these
>>>>> plots in
>>>>> order to validate this closure. Any progress in that direction.
>>>>> _"…centered at the value for 3 iterations " -> Not clear, can
>>>>> you please
>>>>> rephrase this and explain a bit more? I think you have put the
>>>>> statistical bar only in the case of 3rd iteration results. Is
>> that
>>>>>
>>>>> correct? If yes, then mention that stat. Error for other
>>>>> iterations are
>>>>> the same.
>>>>> _"embedding jets into 2 statistically independent samples " ->
>>>>> what are
>>>>> those 2 statistical ind. samples? Need some explanation.
>>>>> _ For your jet pT case, there is a difference at some bins, I
>>>>> think if
>>>>> you use your systematic uncertainties then it would be
>> consistent.
>>>>> Any
>>>>> comment?
>>>>>
>>>>> Slide28:
>>>>> _This slide needs to come after Slide30-31
>>>>> _"Tracking uncertainty " -> "Uncertainty in tracking efficiency"
>>>>> (people may ask you why only -4% not +4%)
>>>>> _I think it is important to show right plot with "STAR
>>>>> preliminary".
>>>>>
>>>>> Slide29:
>>>>> _Remove "Preliminary results:" ; "Fully corrected jet M" make it
>>>>> bigger.
>>>>> _"... but MultiFold also gives us something else!" I think you
>> can
>>>>> drop
>>>>> this and clearly mention what is that "something else"
>>>>> _Be prepared for it if somebody ask any comment on systematic
>>>>> uncertainty comparison between two unfolding methods. Can you
>>>>> please
>>>>> mention here what would be your answer?
>>>>> _ Jet M _expression_ make it bigger; Or just remove it if you add
>>>>> one
>>>>> slide as I commented before.
>>>>> _ I like this plot now.
>>>>>
>>>>> Slide30-31:
>>>>> _Remove "Preliminary results:" ;
>>>>> _You could move these slides before slide29 where you can
>> discuss
>>>>> one
>>>>> projection results of jet M.
>>>>> _"Wider jets tend to have lower |Q|" -> how do you get this
>>>>> conclusion?
>>>>> _ "Different fragmentation patter" -> do you mean it is because
>> of
>>>>> their
>>>>> different jet Mass?
>>>>>
>>>>> Slide33:
>>>>> _"apply boosted decision trees on fully corrected data... " ->
>>>>> what is
>>>>> "boosted decision tree"?
>>>>> _ remove "…" at the end. Or say something about what your plan is.
>>>>>
>>>>> Cheers
>>>>> Nihar
>>>>>
>>>>> On 2022-09-30 01:11, webmaster--- via Star-hp-l wrote:
>>>>>> Dear Star-hp-l AT lists.bnl.gov members,
>>>>>>
>>>>>> Youqi Song (youqi.song AT yale.edu) has submitted a material for a
>>>>> review,
>>>>>> please have a look:
>>>>>> https://drupal.star.bnl.gov/STAR/node/61209
>>>>>>
>>>>>> Deadline: 2022-10-11
>>>>>> ---
>>>>>> If you have any problems with the review process, please
>> contact
>>>>>> webmaster AT www.star.bnl.gov
>>>>>> _______________________________________________
>>>>>> Star-hp-l mailing list
>>>>>> Star-hp-l AT lists.bnl.gov
>>>>>> https://lists.bnl.gov/mailman/listinfo/star-hp-l


--
Sooraj Radhakrishnan
Research Scientist,
Department of Physics
Kent State University
Kent, OH 44243

Physicist Postdoctoral Affiliate
Nuclear Science Division
Lawrence Berkeley National Lab
MS70R0319, One Cyclotron Road
Berkeley, CA 94720
Ph: 510-495-2473


