sphenix-software-l AT lists.bnl.gov
Subject: sPHENIX discussion of software
[Sphenix-software-l] Fun4All file lookup switched to read only replicas
- From: pinkenburg <pinkenburg AT bnl.gov>
- To: "sphenix-software-l AT lists.bnl.gov" <sphenix-software-l AT lists.bnl.gov>
- Subject: [Sphenix-software-l] Fun4All file lookup switched to read only replicas
- Date: Wed, 9 Oct 2024 10:13:33 -0400
Hi folks,
This is a heads-up about a change in how Fun4All looks up file locations when you give it just a filename. Until now it queried the master DB even though it only reads from it. That causes unnecessary load, and we have read-only replicas which, unlike servers with write capability, we can easily scale up (scaling DB writes is a real can of worms and we want to avoid it). The lookup now goes to the read-only replicas. This should be transparent: I have been testing it for a while with our simulation production and haven't seen any issues.
But we do have cases with other DBs where a request to the replica times out (or collides with an update; I am not sure what the exact problem is) and fails.
Here is an example from Perl. Since it also uses ODBC as the backend, I assume problems in Fun4All will produce similar messages:
DBD::odbc::st execute failed: ERROR: canceling statement due to conflict with recovery User query might have needed to see row versions that must be removed.; Error while executing the query (SQL-40001) at CreateRawfileList.pl line 64.
DBD::odbc::st execute failed: ERROR: canceling statement due to conflict with recovery User query might have needed to see row versions that must be removed.; Error while executing the query (SQL-40001) at CreateRawfileList.pl line 64.
DBD::odbc::st execute failed: ERROR: canceling statement due to conflict with recovery User query might have needed to see row versions that must be removed.; Error while executing the query (SQL-40001) at CreateRawfileList.pl line 64.
DBD::odbc::st execute failed: FATAL: terminating connection due to conflict with recovery User query might have needed to see row versions that must be removed.; Error while executing the query (SQL-40001) at CreateRawfileList.pl line 64.
DBD::odbc::st execute failed: Could not send Query(connection dead); Could not send Query(connection dead) (SQL-42703) at CreateRawfileList.pl line 64.
If you see any of these, please let me know. As you may know, we will get another 140k cores for the next run, and we have to make sure that what we are doing scales. If we run into problems with our DB usage with "only" 66k cores, we had better learn about it now.
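For scripts that run into the messages quoted above, one possible stopgap is to retry the lookup after a short pause. The following is only a minimal sketch in Python, not Fun4All code and not a recommended fix: `run_query` and `reconnect` are hypothetical callables that the caller would supply, the attempt count and delays are arbitrary, and the error matching relies solely on the message text shown above. It reconnects before every retry for simplicity, although only the FATAL and "connection dead" cases appear to lose the connection.

```python
import time

# Substrings taken from the error messages quoted above (assumption:
# Fun4All failures would carry the same text, as guessed in this mail).
TRANSIENT_MARKERS = (
    "conflict with recovery",  # query cancelled or connection terminated on the replica
    "connection dead",         # follow-up failure once the connection is gone
)


def is_transient_replica_error(message):
    """Return True if the message looks like one of the replica errors above."""
    text = message.lower()
    return any(marker in text for marker in TRANSIENT_MARKERS)


def query_with_retry(run_query, reconnect, max_attempts=4, base_delay=0.5,
                     sleep=time.sleep):
    """Call run_query(); on a transient replica error, wait, reconnect, retry.

    run_query and reconnect are hypothetical callables provided by the caller.
    Any other error, or a transient error on the last attempt, is re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return run_query()
        except Exception as err:
            if attempt == max_attempts or not is_transient_replica_error(str(err)):
                raise
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
            reconnect()
```

Even with a wrapper like this in place, the failures are still worth reporting, since retries only hide the symptom and add load of their own.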
Thanks
Chris
--
*************************************************************
Christopher H. Pinkenburg ; pinkenburg AT bnl.gov
; http://www.phenix.bnl.gov/~pinkenbu
Brookhaven National Laboratory ; phone: (631) 344-5692
Physics Department Bldg 510 C ; fax: (631) 344-3253
Upton, NY 11973-5000
*************************************************************