
sphenix-magnet-l - Re: [Sphenix-magnet-l] It's the "watch dog timer" (which open the dump resistor contactor) !!!

sphenix-magnet-l AT lists.bnl.gov

Subject: sPHENIX discussion of the superconducting solenoid

  • From: "Yip, Kin" <kinyip AT bnl.gov>
  • To: "sphenix-magnet-l AT lists.bnl.gov" <sphenix-magnet-l AT lists.bnl.gov>
  • Cc: "Sandberg, Jon N" <jsandberg AT bnl.gov>, "Schoenfeld, Ralph" <ralphs AT bnl.gov>, "Morris, John" <jtm AT bnl.gov>, "Costanzo, Michael" <mcostanzo AT bnl.gov>
  • Subject: Re: [Sphenix-magnet-l] It's the "watch dog timer" (which open the dump resistor contactor) !!!
  • Date: Tue, 20 Jun 2023 02:18:21 +0000

Hello,

 

I communicated with Carl, Chung and Ralph after my last two emails in the last few days. 

On Tuesday (June 20), we’ll start to

 

  1. replace the bad cable; Carl prefers the replacements to be of the solder type rather than the IDE type, if possible.
  2. fix/improve the Labview program for the PLL-loss matter.  Hopefully, Chris & Chung can help us with this.

 

Our sPHENIX colleagues are hoping that we can turn the magnet back to full field after Wednesday’s Maintenance Day (12 hours).   I am not sure whether this can be achieved, but it may be a good goal.

 

Kin

 

From: Yip, Kin
Sent: Sunday, June 18, 2023 10:45 AM
To: Ho, Chung <chungh AT bnl.gov>; Degen, Christopher <degen AT bnl.gov>; Costanzo, Michael <mcostanzo AT bnl.gov>; Bachek, Paul <pbachek AT bnl.gov>
Cc: Schultheiss, Carl <carls AT bnl.gov>; Joshi, Piyush <joshi AT bnl.gov>; Schultheiss, Carl <carls AT bnl.gov>; Than, Yatming (Roberto) <ythan AT bnl.gov>; Mills, James A <mills AT bnl.gov>; Rosas, Pablo J <rosas AT bnl.gov>; Sandberg, Jon N <jsandberg AT bnl.gov>; Tallerico, Thomas N <tallerico AT bnl.gov>; Haggerty, John <haggerty AT bnl.gov>; Feder, Russell <rfeder AT bnl.gov>; O'Brien, Edward <eobrien AT bnl.gov>; Morris, John <jtm AT bnl.gov>; Pekrul, Winston <wpekrul AT bnl.gov>
Subject: It's the "watch dog timer" (which open the dump resistor contactor) !!!

 

Hello, Carl and all,

 

Up to last Friday, I’ve always mistakenly treated your “Watch Dog Timer” as some (unnecessary) companion to the quench-interlock signal.

 

I’ve been thinking since Friday … Yesterday late afternoon or evening, I started to remember seeing the “watch dog timer” being >1 second earlier than “quench detector” (interlock) signal on the last 4600 A crash on June 11. Then, I realized that last Fri., I/we didn’t check the PLC when we observed the delay of the Labview stopping after we heard the “loud” sound of dump-resistor contactor closing.  I realized that it could be ~15 s late (as we were “playing” with 15 s Timeout/delay accidentally).  

 

This morning, I couldn’t wait any more and I just drove to 1008B to look at the PLC history (in front of the Magnet Power Supply).

 

I include a screenshot of the PLC history on June 16 (last Friday) when we were experiencing the 15 s delay.

You can see that there were 2 such instances.  Please ignore the lines with the status of “Fault Cleared”, which were logged when I cleared the interlocks on June 16.  ( Just look at the lines of “Fault”. )

 

But for both instances, you can see the “Quench Watch Dog Timer” is ~14 or 15 s earlier than “Quench Detector” !!

The “Dump Resistor Contactor Opened” happened at the same second as the “Watch Dog Timer” (though the screen is not big enough to capture the earlier “Dump … Opened”).

 

I also attach the screenshot of June 11 (last 4600 A crash).  Then, at 9:57 am, the “Watch Dog Timer” actually appeared at 9:57:28, before “Dump … Opened” at 9:57:29 and “Quench Detector” at 9:57:30.   ( On June 7, we changed the code to prevent the PLL-LOSS from stopping the program but there was still the 1 or 0.9xx second delay. )
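The June 11 ordering quoted above can be checked with a few lines of arithmetic. This is only a sketch: the event labels are copied from the PLC history, the times have the one-second resolution shown on the PLC screen, and the variable names are made up for illustration.

```python
from datetime import datetime

# PLC history entries for the June 11 crash, as quoted from the screenshot
# (one-second resolution; labels copied from the PLC display).
events = {
    "Quench Watch Dog Timer": "2023-06-11 09:57:28",
    "Dump Resistor Contactor Opened": "2023-06-11 09:57:29",
    "Quench Detector": "2023-06-11 09:57:30",
}

times = {
    name: datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    for name, stamp in events.items()
}
first = min(times.values())

# Lag of each event relative to the earliest one, in seconds.
lags = {name: (t - first).total_seconds() for name, t in times.items()}

for name, lag in sorted(lags.items(), key=lambda kv: kv[1]):
    print(f"{name}: +{lag:.0f} s")
```

With one-second timestamps the true gaps could differ from these values by up to about a second either way, so only the ordering is firm.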

 

It seems clear to me that the “quench-detector (interlock) signal” really didn’t get to the PLC/Power-Supply to initiate the fast discharge but it’s your “Watch Dog Timer” (that I have mentally ignored until now) that had opened the Dump Resistor Contactor.   The quench-interlock delay was not just on the logged data but real.  It’s the “Watch Dog Timer” that has “saved the day”.

 

Your “Watch Dog Timer” is probably a bit similar to the PLL-loss check in the Labview but it doesn’t have the “Timeout”/delay.   We don’t have the “Watch Dog Timer” in the logged data, but maybe we should add it ?

 

Speculation/wishful thinking:  maybe, the “Watch Dog Timer” could be the signal that went down to open the Dump Resistor Contactor before everything else ?  In some instances, the delay due to the PLL-loss check (when the Labview couldn’t read the clock signal) was long enough after the “Watch Dog Timer” had already opened the Dump Resistor Contactor that “neg_90_deg_offset_jnt” changed enough (>50 mV) to cause the Labview to stop (and finally pull the “interlock” signal).   This wouldn’t happen for the 100-150 A test as there wasn’t enough change, but it would for 4600 A.   {  Maybe, Chung and Chris can help us figure it out. }

 

 

Kin

 

 

From: Yip, Kin
Sent: Friday, June 16, 2023 8:10 PM
To: Ho, Chung <chungh AT bnl.gov>; Degen, Christopher <degen AT bnl.gov>; Costanzo, Michael <mcostanzo AT bnl.gov>; Bachek, Paul <pbachek AT bnl.gov>
Cc: Schultheiss, Carl <carls AT bnl.gov>; Joshi, Piyush <joshi AT bnl.gov>; Schultheiss, Carl <carls AT bnl.gov>; Than, Yatming (Roberto) <ythan AT bnl.gov>; Mills, James A <mills AT bnl.gov>; Rosas, Pablo J <rosas AT bnl.gov>; Sandberg, Jon N <jsandberg AT bnl.gov>; Tallerico, Thomas N <tallerico AT bnl.gov>; Haggerty, John <haggerty AT bnl.gov>; Feder, Russell <rfeder AT bnl.gov>; Ho, Chung <chungh AT bnl.gov>; O'Brien, Edward <eobrien AT bnl.gov>; Morris, John <jtm AT bnl.gov>; Pekrul, Winston <wpekrul AT bnl.gov>
Subject: bad ribbon cable ... and quench-interlock-signal in Logged data is delayed by the "Timeout" but not to power supply

 

Hello, Chung and Degen (and all),

 

This email has two subjects and the first one may be the answer to the question Chung, Degen, Carl and I have been wondering about in the last couple of days. 

 

  1. Though we put in a longer delay, the failures seem to be more frequent.   This afternoon, Carl started to suspect the ribbon cables (after talking with Chris) from the PIXe to the chassis above.  His first effort to re-seat the cable didn’t last very long before Labview “stopped” due to PLL-LOSS again.   After Carl came with his technician, they tried to manipulate/touch the cables.  Long story short, at one point, the Labview stop/crash was immediate and after they touched or bent the cable a bit, the immediate Labview stop didn’t happen again.


I was a bit scared by their manipulation with the cable and at one point, Labview stopped (not due to PLL-LOSS but  due to open-lead for a signal belonging to a voltage tap) after Carl shook the cable quite forcibly.


But at the end, it seems Carl and his tech have shown that it’s the intermittent cable which has caused the recent problems.  So, the Labview stopped because the intermittent wire was suddenly opened (or otherwise bad) and Labview just couldn’t read the data any more at that time !   Only one cable contains the “clock” signal and we swapped the bad one with the other one (as the other one doesn’t carry or need the clock signal).   Labview has been running OK.  Let’s wait to see …

Ralph Schoenfeld told us that he didn’t seem to have a spare set though he’s got the cables and connectors.  Carl plans to change all 3 cables/connectors next week.

  2. In the exercise above, we have observed another important “feature”.   We put in 5 s (Timeout) for all 3 boards today but when I came to check after Labview stopped (and magnet current dropped),  the “loop” time was ~15 seconds, which you can’t see at the top-left of the Labview picture “20230616_155020.jpg”.  The total Timeout is 3*5 = 15 s 😊

    Carl didn’t quite believe it at first.  That’s why I took a picture to show.   And more importantly, when we looked at the LogView data in the attached “gif” file, I saw that the quench-interlock signal was 14.5 seconds behind the time that the magnet current started to drop and other voltage tap signals started to change !   We knew that the quench-interlock for our tests the last few days had to come first.

    It took me a minute to convince Carl.   And then, by chance, we experienced it in real time.   While we were talking (in the 1008B mezzanine) and the Labview program was running, we heard a loud “boom” sound from the contactors in the Dump Resistor => quench-interlock happened.  But when we looked at the Labview program instantly, Labview didn’t stop (red) immediately but stayed “green” for a bit… and stopped only ~15 seconds after the loud “boom” sound !!!    So, we observed that the quench-interlock happened whereas Labview didn’t record the “quench-interlock” signal until the Timeout.

 

As I said, this is clearly shown in the “gif” file of the LogView !   We noticed this comparatively easily this afternoon because the Timeout was 15 s !!!


This indicates that the quench-interlock signal depends on the Timeout in the Labview.  The original Timeout in Labview was calculated in Zeynap’s program (that’s the case during the crash on June 2) and it’s difficult to tell how long the “Timeout” would be.   ( We forced the PLL-LOSS not to stop the program on June 7. ) Since June 12, we changed the Timeout to a constant and we tried 0.5, 1, 5 s 😊
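The 3*5 = 15 s arithmetic can be written out as a small sketch. It assumes, as the observation suggests, that the three boards are read one after another and that each read blocks for its full Timeout when no samples arrive. The function names are hypothetical and this is Python for illustration, not the Labview code.

```python
def worst_case_sequential(timeouts_s):
    """Boards read one after another: stalled reads add up."""
    return sum(timeouts_s)


def worst_case_concurrent(timeouts_s):
    """Hypothetical alternative: boards read at the same time, so the
    longest single Timeout bounds the wait."""
    return max(timeouts_s)


# Timeout values tried in these tests, applied to all 3 boards.
for timeout in (0.5, 1.0, 5.0):
    boards = [timeout] * 3
    print(
        f"Timeout {timeout} s: "
        f"sequential {worst_case_sequential(boards)} s, "
        f"concurrent {worst_case_concurrent(boards)} s"
    )
```

Under these assumptions the logged quench-interlock time can lag the real event by up to the sum of the three Timeouts, which matches the ~15 s seen with 5 s per board.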

 

Therefore, our observation that the voltage-tap signal changed before the quench-interlock might not be true ?!  🤔  This may have important implications ?!

( At the moment, we set it back to 1 s. )


Kin

 

 

From: Ho, Chung <chungh AT bnl.gov>
Sent: Friday, June 16, 2023 1:01 PM
To: Degen, Christopher <degen AT bnl.gov>; Costanzo, Michael <mcostanzo AT bnl.gov>; Bachek, Paul <pbachek AT bnl.gov>
Cc: Yip, Kin <kinyip AT bnl.gov>; Schultheiss, Carl <carls AT bnl.gov>
Subject: RE: sPhenix quench detection LabVIEW

 

 

 

From: Degen, Christopher <degen AT bnl.gov>
Sent: Friday, June 16, 2023 12:58 PM
To: Ho, Chung <chungh AT bnl.gov>; Costanzo, Michael <mcostanzo AT bnl.gov>; Bachek, Paul <pbachek AT bnl.gov>
Cc: Degen, Christopher <degen AT bnl.gov>
Subject: Re: sPhenix quench detection LabVIEW

 

 

How could we have under-run our sample buffer with a 5.0 second timeout???

It’s almost as if our clock stopped.

 

That is the question we need answered.

 

 

Should the sample clock source be set to Slot2/PFI1 for all three boards? I’m not sure.

I believe all you need is to connect it to one of the boards.

-Chris

 

From: Ho, Chung <chungh AT bnl.gov>
Date: Friday, June 16, 2023 at 12:40 PM
To: Degen, Christopher <degen AT bnl.gov>, Costanzo, Michael <mcostanzo AT bnl.gov>, Bachek, Paul <pbachek AT bnl.gov>
Subject: RE: sPhenix quench detection LabVIEW

 

 

From: Degen, Christopher <degen AT bnl.gov>
Sent: Friday, June 16, 2023 12:30 PM
To: Ho, Chung <chungh AT bnl.gov>; Costanzo, Michael <mcostanzo AT bnl.gov>; Bachek, Paul <pbachek AT bnl.gov>
Cc: Degen, Christopher <degen AT bnl.gov>
Subject: sPhenix quench detection LabVIEW

 

I have the following observations/questions (incoherent ramblings?) about 24-Ch Loop in Loop v14.vi. The software is complicated, and I don’t have a good understanding of how it works yet.

 

There appear to be different sample clock configurations for the three analog input cards. Is this intentional?

 

 

 

Carl forgot to do all 3. I will fix it.

 

 

 

Do these values ever get out of sync?

Don’t know. Charlie told me to just watch that the Avg loop time is < 17 ms.

 

 

 

This morning’s error seems to indicate that all 3 boards maintained sync, yet we under-ran the buffer. I’m interested in the history of these values prior to a failure. They could be plotted in a waveform chart, to provide a simple post-mortem of their values.
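One way to keep such a history is a fixed-length ring buffer that is dumped when a failure is detected. The sketch below is in Python rather than Labview, and the class name, method names, and buffer length are all hypothetical; it only illustrates the idea.

```python
from collections import deque


class SyncHistory:
    """Keep the most recent per-board status values for post-mortem review."""

    def __init__(self, max_samples=1000):
        # A deque with maxlen silently discards the oldest entry when full.
        self._buf = deque(maxlen=max_samples)

    def record(self, timestamp_s, board_values):
        """Store one loop iteration: a timestamp and one value per board."""
        self._buf.append((timestamp_s, tuple(board_values)))

    def dump(self):
        """Return the retained history, oldest first."""
        return list(self._buf)


history = SyncHistory(max_samples=3)
for i in range(5):
    history.record(i * 0.017, (i, i, i))  # made-up values for 3 boards

# Only the last 3 iterations survive; these would be plotted after a failure.
print(history.dump())
```

Because the buffer length is fixed, memory use stays constant however long the program runs between failures.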

 

 

 

Why do we read the three analog input boards sequentially, as opposed to simultaneously?

Don’t know. Designed by Charlie and Zenyep.

 

Also, the timeout for the read of board 1 is 5.0 seconds, and board 2 & 3 are 0.1 seconds. This probably doesn’t matter due to the sequential board reads.

 

Yet we appear to have under-run the sample buffer, even with a 5.0 second timeout! What happened to our sample clock??

It remained the same, 730 Hz.
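As a rough check on why a 5.0 second timeout should have been ample, the number of samples a running 730 Hz clock would deliver can be worked out directly. This is only arithmetic on the figures quoted in this thread; how many samples each read requests is not stated here, so the sketch makes no comparison with the buffer size.

```python
SAMPLE_RATE_HZ = 730.0     # sample clock rate quoted in the reply above
TIMEOUT_S = 5.0            # timeout on the board 1 read
LOOP_TIME_LIMIT_S = 0.017  # "Avg loop time < 17 ms" rule of thumb


def samples_in(window_s, rate_hz=SAMPLE_RATE_HZ):
    """Number of sample clock ticks expected in a time window."""
    return window_s * rate_hz


print(f"Samples expected during one timeout: {samples_in(TIMEOUT_S):.0f}")
print(f"Samples expected per 17 ms loop:     {samples_in(LOOP_TIME_LIMIT_S):.2f}")
```

A healthy clock would deliver about 3650 samples in 5 s against roughly 12 per 17 ms loop, so a read timing out after 5 s is consistent with the clock signal not reaching the board rather than with a slow loop.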

 

How long did this run until it failed?

Sometimes days, sometimes hours.

 

Regards,

Chris

 

Christopher M. Degen

Brookhaven National Laboratory

Building 924

Upton NY 11973 USA

mailto:degen AT bnl.gov

Voice: (631) 344-2492

 

 




