DR Plan Question

Michael F. Sprague msprague at dcu.org
Mon Dec 13 09:05:10 EST 2004


Jason Jacobsen said:
> I am working on a DR solution for our mail systems. We have two systems
> with virtually identical configurations and each will have a DR server.
> Our mail system runs on Red Hat EL 3.0 and uses the postfix mta,
> cyrus-imapd, and mysql for authentication.  We replicate the production
> server to our backup server nightly using rsync.  The basic plan I have
> for DR is to have each production mail server queue mail for the other in
> the event the server is not reachable.  We can then make an internal dns
> change to allow the mail to be redirected to the dr machine in the event
> of a failure.  Also at this point we will redirect the external ip of the
> production mail server to the internal ip of the dr machine on our
> firewall.  This will effectively roll us over to the dr box in the case
> that the production server fails without losing new messages sent after
> failure.  The problem I am facing is when the production server is
> repaired I would need to synchronize the mail from the dr machine to the
> production machine to ensure no mail is lost.  Some of our users use imap
> and some use pop3.  Thank you in advance for any suggestions and let me
> know if there is any other information I can provide.

Hi Jason,

We are working on a similar issue here.  Though I haven't written a
script to do it yet, I think I have the logic worked out.  Sharing that
will hopefully help you out. :)

Since syncs with the DR server are not instantaneous, the situation will
arise where there is email on both the production server and the DR server
that you want to keep intact.  However, a simple rsync from the DR server
to the production server won't work, because there is a good chance that
messages delivered on the DR server use message numbers that already exist
on the now-restored production server, and rsync would overwrite the
production copies.

I think the safest way to proceed would be to create a tarfile on the DR
server that consists of any messages still on the DR server that have been
delivered since the start of the disaster.  The tarfile can be copied over
SSH to the production server and restored to temporary space.  Then run a
script that will traverse the temporary directory tree.  For each mailbox,
examine the messages in it and compare to the same mailbox in the
production tree.  You want to find the highest-numbered message in each
folder of the production mailbox.  If the message numbers in the restored
mailbox are
all higher than this number, then you can just copy them to the production
mailbox and do a rebuild of that mailbox.
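
As a rough sketch of that comparison in Python: this assumes the usual
Cyrus spool layout, where each message is a file named with its number
followed by a dot (e.g. "123."), and that the script is given one restored
folder and the matching production folder.  The function names are made up
for illustration, and this has not been run against a live spool.

```python
import os
import re

# Assumption: message files are named "<number>." (e.g. "123.");
# anything else in the folder (index files, subfolders) is ignored.
MSG_RE = re.compile(r"^(\d+)\.$")


def message_numbers(folder):
    """Return the sorted message numbers found in one mailbox folder."""
    numbers = []
    for name in os.listdir(folder):
        match = MSG_RE.match(name)
        if match and os.path.isfile(os.path.join(folder, name)):
            numbers.append(int(match.group(1)))
    return sorted(numbers)


def can_copy_directly(restored_folder, production_folder):
    """True if every restored message number is above production's highest."""
    restored = message_numbers(restored_folder)
    production = message_numbers(production_folder)
    highest = production[-1] if production else 0
    return all(number > highest for number in restored)
```

A folder for which can_copy_directly() returns False is one where a plain
copy would overwrite a message that already exists on production.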

However, if any message numbers in the restored mailbox are not higher
than this number, then you'll need to either re-inject those messages into
the mailbox, probably by using 'deliver' or the Cyrus Perl modules, or
renumber the files, copy them to the mailbox, and do a rebuild.  I still
need to experiment and see which method is better, or whether either will
even work.
:)
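
For the renumbering option, a minimal sketch might look like the following.
It again assumes message files named "<number>.", and it simply gives each
restored message the next unused number above production's highest.  The
function names are hypothetical, and the mailbox would still need a rebuild
afterwards (for example with Cyrus's reconstruct utility), with file
ownership and permissions matching the rest of the spool.

```python
import os
import re
import shutil

# Assumption: message files are named "<number>." (e.g. "123.").
MSG_RE = re.compile(r"^(\d+)\.$")


def numbered_messages(folder):
    """Return (number, filename) pairs for message files, lowest first."""
    found = []
    for name in os.listdir(folder):
        match = MSG_RE.match(name)
        if match:
            found.append((int(match.group(1)), name))
    return sorted(found)


def copy_renumbered(restored_folder, production_folder):
    """Copy restored messages into production under new, unused numbers.

    Returns a list of (old_number, new_number) pairs for logging.
    """
    production = numbered_messages(production_folder)
    next_number = production[-1][0] + 1 if production else 1
    mapping = []
    for old_number, name in numbered_messages(restored_folder):
        src = os.path.join(restored_folder, name)
        dst = os.path.join(production_folder, "%d." % next_number)
        shutil.copy2(src, dst)  # keeps the original modification time
        mapping.append((old_number, next_number))
        next_number += 1
    return mapping
```

Nothing on production is overwritten this way, which is also where the
'duplicates' come from: a message that was already synced before the
failure gets copied in again under a new number.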

There will be some 'duplicates', but I think that is a small price to pay
for not losing any messages, at least from the user's point of view.

Well, that's the basic logic as I see it for this situation.  Sorry I
couldn't offer a working solution, but I hope this helps.

M

---
Cyrus Home Page: http://asg.web.cmu.edu/cyrus
Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
