data center cooling issue

Gregory Houchins ghouchin at andrew.cmu.edu
Wed Apr 22 22:02:49 EDT 2020


Cluster has returned to service.

-Greg

On Wed, Apr 22, 2020 at 7:38 PM Gregory Houchins <ghouchin at andrew.cmu.edu>
wrote:

> All jobs have been cancelled and all compute nodes have been shut down.
> Login to the head node will be restricted to prevent hardware damage.
>
> --
> Gregory Houchins | WH3402 | 412-268-2486
> Arjuna System Administrator
> PhD Candidate, Physics
> Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh, PA 15213
>
>
> ---------- Forwarded message ---------
> From: Bryan Webb <webb at psc.edu>
> Date: Wed, Apr 22, 2020 at 7:07 PM
> Subject: data center cooling issue
> To: CMU-CoE GPU temp email <coep80 at psc.edu>
> Cc: Clint Perrone <perrone at psc.edu>, Ed Hanna <ehanna at psc.edu>
>
>
> Folks,
>
> We appear to be experiencing cooling problems in the data center at the
> moment.  We are awaiting more details from a facility engineer, but nodes
> on Bridges are already overheating.
>
> It may be best for you to shut down your clusters as much as possible, as
> soon as possible, using your remote management capabilities.
>
> ..Bryan
>
> --
> Bryan R. Webb, Systems and Facilities Administrator
> Pittsburgh Supercomputing Center (Carnegie Mellon University)
> office: 412-268-5134
>
>

