Arjuna Update

Alexander Bills abills at andrew.cmu.edu
Tue Sep 7 17:14:13 EDT 2021


Hello,

We have not been able to recover data from the RAID storage and have
exhausted all recommended solutions from PSC, CMU ECE, and others. At this
time we do not believe that the data is recoverable.

After an extensive investigation, we determined that the cause of the data
loss was a reformat of the /home filesystem. Previously, the OS and all
files on the worker nodes were stored in RAM and the nodes' drives were
unused. We changed the worker configuration to mount those drives in an
attempt to free RAM and to alleviate crashes caused by users filling the
/tmp filesystem. c003 had a connection to the RAID system, which we
believed was inactive, but which became active upon this change, and c003
reformatted the RAID storage rather than its internal disk. During a power
fluctuation at PSC on Tuesday, c001 briefly lost connectivity to the RAID,
and upon remount, was unable to reconstruct the reformatted filesystem.

We attempted to reconstruct the filesystem using all tools recommended to
us, ranging from simply rebuilding the partition table to “forensics” tools
typically used by law enforcement to reconstruct destroyed evidence.
However, we were unable to recover any missing data. The connection from
c003 to the RAID has since been removed, so this incident cannot recur.

While we do not anticipate any more major changes of this type to the
Arjuna configuration, we again strongly urge all users to back up all
important data on all machines.

We will return Arjuna to service this week, and at that time will provide
instructions on how to request a new account.

Best Regards,

the Arjuna Admin Team