Arjuna
Alexander Bills
abills at andrew.cmu.edu
Sun May 9 16:13:01 EDT 2021
Hi All,
As many of you have undoubtedly noticed, Arjuna is experiencing some issues
this weekend. Currently we are working to return it to full service as
quickly as possible, and we appreciate your patience while the issues
persist. Unfortunately, all running jobs were lost and the queue was
cleared. Many nodes have returned to service, but others remain down. If
the problem recurs, the queue will likely be cleared and all jobs lost
again.
A few reminders on using Arjuna:
(1) Please do not run any computationally expensive programs on the
headnode.
(2) Please be careful with scripts that submit jobs, and with testing
scripts that submit jobs. Submitting too many jobs was likely a
contributing factor in this incident.
(3) If you have any jobs that are being held in the queue due to errors
(e.g., JobHoldMaxRequeue), please cancel those jobs within a reasonable time
frame.
Thank you,
Alec Bills