sdcc_users-l - [Sdcc_users-l] Portions of the Shared Condor pool are down

sdcc_users-l AT lists.bnl.gov

Subject: Scientific Data & Computing Center

  • From: SDCC Announcements <announce AT rcf.rhic.bnl.gov>
  • To: rcfstaff AT bnl.gov, sdcc_users-l AT lists.bnl.gov
  • Subject: [Sdcc_users-l] Portions of the Shared Condor pool are down
  • Date: Sat, 22 Mar 2025 15:44:40 -0400



Summary:
Contact was lost with approximately half of the SL7 hosts in the shared
HTCondor pool at 11:25am.

Effective Time(s):
3/22/2025 11:25 am - 3/22/2025 10:00 pm

Group Responsible:
IT Fabric

Affected Area(s):
HTCondor Shared Pool

Expected User Impact:
A portion of the compute farm is unavailable

Maintenance Type:
Unplanned/Outage



Description:
About half of the SL7 hosts in the shared HTCondor pool (~7K job slots)
experienced an outage at approximately 11:25 am today. The Alma 9 hosts appear
to be unaffected.
Experts are on site investigating and will provide updates as the situation
evolves. Jobs submitted to SL7 hosts may be delayed due to limited resources
until service is fully restored, and some jobs may need to be restarted.

SDCC Announcements page:
https://www.sdcc.bnl.gov/news-events/sdcc-announcements
Downloadable calendar invite of this event (.ics format):
https://www.sdcc.bnl.gov/announcements/make_ics.php?evt=1742672680

This item has been posted to RCF/USAtlas Staff, SDCC Users

--
This message has been forwarded from the SDCC announcements page.
Recent messages are available at:
https://www.sdcc.bnl.gov/news-events/sdcc-announcements


