gilgamesh: gilgamesh update

John Kitchin jkitchin at andrew.cmu.edu
Sun Nov 10 16:52:07 EST 2013


Hi everyone,

I wanted to give you an update on gilgamesh. It seems that last week
another drive failed in the home directory RAID array, putting the
array in a degraded state and causing some issues.

Today I was able to figure out which drive caused that and replace it, and
the array is presently rebuilding. That will probably take overnight, and
if all goes well I plan to re-enable logins to gilgamesh tomorrow.
If the system is stable for another day, I will turn the queue back on, and
slowly start turning on nodes.

I do not believe any data has been lost, but I will take this opportunity
to remind you that the home directories are not backed up.

I also do not know if there is any reason for two drives to have failed
close together. The disks are a little over 2 years old, which seems young
to die, but they also have not been used much in the past 2 years until we
made them the new home directories.

John

-----------------------------------
John Kitchin
Associate Professor
Doherty Hall A207F
Department of Chemical Engineering
Carnegie Mellon University
Pittsburgh, PA 15213
412-268-7803
http://kitchingroup.cheme.cmu.edu