Using xfer to migrate mailboxes to a new server

David Mayo D.J.Mayo at bath.ac.uk
Tue Feb 16 06:29:43 EST 2010


Hi all,

We are in the process of moving our mailboxes from a Cyrus 2.2 server to 
a Cyrus 2.3 server. I have several comments regarding this which I was 
planning to share with the list - so this seems an opportune moment.

Simon Fraser wrote:

> On Mon, 2010-02-15 at 15:25 +0200, Elver Loho wrote:
> 
>> localhost.localdomain> xfer user.elver 192.168.0.180
>> xfermailbox: Server(s) unavailable to complete operation
> 
> There should be a bit more information in the log, too. While debugging
> the authentication and permissions I found the error messages put into
> syslog were more helpful than the program output. 

Some additional things to look out for:

- You must run this command on the existing back-end
- The mailbox *must* have a quota set
- All your databases must be the same type between old and new server
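
The quota requirement is easy to check before you start. The following is only a minimal sketch, assuming you query the old back-end over IMAP with Python's imaplib and pass the QUOTA response lines to a helper; the helper name storage_limit and the host name in the usage comment are made up for illustration.

```python
import re

# Matches the parenthesised resource list at the end of one untagged QUOTA
# response, e.g. b'user.elver (STORAGE 1024 204800)'.
_QUOTA_RE = re.compile(rb'\((?P<resources>[^)]*)\)\s*$')


def storage_limit(quota_lines):
    """Return the STORAGE limit from QUOTA response lines, or None if unset."""
    for line in quota_lines or []:
        if line is None:  # imaplib returns [None] when there was no response
            continue
        match = _QUOTA_RE.search(line)
        if not match:
            continue
        fields = match.group('resources').split()
        # Resources come in triples: name, current usage, limit.
        for i in range(0, len(fields) - 2, 3):
            if fields[i].upper() == b'STORAGE':
                return int(fields[i + 2])
    return None


# Hypothetical usage (host name and credentials are placeholders):
#   import imaplib
#   conn = imaplib.IMAP4('old-backend.example.org')
#   conn.login('cyrus', 'secret')
#   typ, (roots, quotas) = conn.getquotaroot('user.elver')
#   if storage_limit(quotas) is None:
#       print('no quota set; set one before running xfer')
```

A result of None means no storage quota was found on the mailbox, so set one before attempting the xfer.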

You can also enable detailed telemetry logging by creating a directory 
"log/cyrus" on the destination server (replace "cyrus" with whatever 
principal you're using to authenticate to the destination server).

>> So I'm wondering if anyone here has any experience with getting xfer
>> to work in a migration scenario that they'd be willing to share.
>> Anyone?
> 
> Once the transfer of test mailboxes worked, I had exim set to defer
> messages if a particular flag was set, and had a perdition proxy
> checking to see if the user was on the old server, the new, or deferred,
> to prevent any chance of mailbox modification attempts during the
> transfer. 

We are using a Murder environment. The documentation says that the front 
ends know when a mailbox is being transferred and will delay delivery 
during the process, so we haven't set up any special Exim rules. I do 
have a "freeze list" for any accounts we have to copy manually, which 
sets "control = freeze" and holds the message on Exim's queue.

> Some other things I saw:  xfer will move the sieve script for you.

It also sets up the quota, and moves the user's subscriptions and seen 
state (don't underestimate the users' fondness for their seen flags!). 
It's brilliant! (when it works)

> If the transfer fails for whatever reason, the old server will have a
> record in mailboxes.db to say which host that mailbox is now on, but
> will be unable to perform any operation on it.  This is particularly
> annoying as xfer is recursive, so if a sub-folder fails, one can't
> restart the transfer for the inbox, only the other subfolders
> individually.

Our solution for recovering from these errors is:

- freeze incoming mail for the recipient (also remove ACLs if you don't 
want the user to delete or move emails around)
- rsync the emails, sieve scripts, subscriptions, seen state
- rename the mailbox on the old server and create the mailbox on the new 
server
- run ctl_mboxlist -ma on the destination server to activate the mailbox 
that was in transit
- reconstruct -r -f the mailbox on the new server to discover any sub 
folders

This is slightly more complicated if the top level mailbox fails because 
the new back-end thinks it has the mailbox and the MUPDATE master 
doesn't, which means the top level mailbox on the new back-end cannot be 
created as usual. To fix this, you need to run ctl_mboxlist -ma before 
creating the top level mailbox.
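
Since the only difference between the two cases is where ctl_mboxlist -ma goes, the ordering can be written down as a small checklist generator. This is just a sketch of the procedure described above, not a tool that runs anything; the function name recovery_steps is made up, and it assumes the remaining steps stay in the same order when the top level mailbox is the one that failed.

```python
def recovery_steps(top_level_failed=False):
    """Return the manual recovery steps in the order they should be run."""
    freeze = "freeze incoming mail for the recipient (optionally remove ACLs)"
    copy = "rsync the emails, sieve scripts, subscriptions and seen state"
    rename = "rename the mailbox on the old server"
    create = "create the mailbox on the new server"
    activate = "run ctl_mboxlist -ma on the destination server"
    rebuild = "run reconstruct -r -f on the mailbox on the new server"

    if top_level_failed:
        # The new back-end already thinks it has the mailbox, so it must be
        # synchronised with the MUPDATE master before the create can succeed.
        return [freeze, copy, rename, activate, create, rebuild]
    return [freeze, copy, rename, create, activate, rebuild]


if __name__ == "__main__":
    for number, step in enumerate(recovery_steps(top_level_failed=True), 1):
        print(f"{number}. {step}")
```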

Most of our failures happened because one of the emails could not be 
read by the cyrus user (we went through a period of virus scanning our 
mail store and retrospectively "quarantining" files as root). If xfer 
fails to copy one of these files it bombs out and you're left in limbo. 
Having said that, everything will continue to work fine as long as it's 
not the top level mailbox itself that failed: if only sub folders have 
been moved, they work fine across different servers.
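
Files like these can be found before running xfer by walking the mailbox's spool directory and checking the permission bits against the cyrus user. The following is a minimal sketch under a few assumptions: the function names are made up, the spool path in the usage comment is only an example, and it looks at plain Unix mode bits only (no ACLs, and no check that the parent directories are searchable).

```python
import os
import stat


def readable_by(st, uid, gids):
    """True if the mode bits in stat result `st` grant read access.

    Only the classic owner/group/other bits are considered; root's
    override and filesystem ACLs are deliberately ignored.
    """
    if st.st_uid == uid:
        return bool(st.st_mode & stat.S_IRUSR)
    if st.st_gid in gids:
        return bool(st.st_mode & stat.S_IRGRP)
    return bool(st.st_mode & stat.S_IROTH)


def find_unreadable(root, uid, gids):
    """Return a sorted list of files under `root` that uid/gids cannot read."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not readable_by(os.lstat(path), uid, gids):
                bad.append(path)
    return sorted(bad)


# Hypothetical usage (the spool path is an example, not our real layout):
#   import pwd
#   cyrus = pwd.getpwnam('cyrus')
#   for path in find_unreadable('/var/spool/imap/user/elver',
#                               cyrus.pw_uid, {cyrus.pw_gid}):
#       print(path)
```

Anything it reports needs its ownership or mode fixed (or the file removed) before the mailbox is transferred.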

> It's also worth remembering that you can run xfer jobs in parallel.

We tried doing this but our first big parallel transfer resulted in the 
destination IMAP server crashing! We saw lots of these lines:

Jan 23 04:06:47 sauber.bath.ac.uk imap[4434]: [ID 602473 mail.error] 
IOERROR: lock_shared /opt/etc/imapd/quotas.db: Bad file number

Eventually resulting in *lots* of these lines:

Jan 23 08:10:32 sauber.bath.ac.uk imap[4434]: [ID 362402 mail.error]
skiplist: version mismatch: /opt/etc/imapd/quotas.db has version
2.1264205870
Jan 23 08:10:32 sauber.bath.ac.uk imap[4434]: [ID 558109 mail.error]
skiplist: closed while still locked
Jan 23 08:10:32 sauber.bath.ac.uk imap[4434]: [ID 729713 mail.error]
DBERROR: opening /opt/etc/imapd/quotas.db: cyrusdb error
Jan 23 08:10:32 sauber.bath.ac.uk imap[4434]: [ID 637875 mail.error]
Fatal error: can't read quotas file

Users couldn't log in, mail wasn't being delivered and I couldn't even 
run "ctl_mboxlist -d". We had to regenerate the quotas DB from scratch 
and reconstruct some mailboxes that were suddenly using 3,000% of their 
quota! If anyone can suggest why the quotas DB got corrupted I'd be very 
interested in knowing. Running xfer one at a time doesn't produce these 
problems.
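
If you do try parallel transfers, it may be worth watching syslog on the destination for the first sign of trouble so you can stop before the quotas DB is beyond repair. Here is a rough sketch that picks out the quotas.db errors quoted above from a stream of log lines; the patterns are taken from our log excerpt only (with each syslog entry on a single line), and the function name and log path are made up.

```python
import re

# Patterns taken from the log excerpt above; other Cyrus versions may word
# these messages differently.
QUOTA_DB_ERRORS = re.compile(
    r"(IOERROR: lock_shared \S*quotas\.db"
    r"|skiplist: version mismatch: \S*quotas\.db"
    r"|DBERROR: opening \S*quotas\.db"
    r"|Fatal error: can't read quotas file)"
)


def quota_db_errors(lines):
    """Return the log lines that report a problem with quotas.db."""
    return [line.rstrip("\n") for line in lines if QUOTA_DB_ERRORS.search(line)]


# Hypothetical usage (the log path is an assumption):
#   with open('/var/log/syslog') as log:
#       for hit in quota_db_errors(log):
#           print(hit)
```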

>> I'm sure I'm doing something wrong here. I might even be doing
>> everything wrong here. I've googled and it seems like xfer is a common
>> headache for Cyrus admins with no easy solution. However, it also
>> seems to be a particularly bad headache -- most old e-mails I've found
>> about this topic seem to have gone unanswered.
> 
> I don't claim to be an expert at xfer, since I've only used it to
> migrate to new hardware once, but hopefully it's helpful anyway. I moved
> approximately 2000 mailboxes, scripting against the Cyrus::IMAP perl
> module.  About a dozen had some sort of transfer error, and these were
> fixable with no loss of email.

If you're moving from one Cyrus server to another I think it's 
definitely the best way forward - especially if you're in a Murder 
environment, as xfer handles all the mailbox reservations too. 
Unfortunately the xfer code doesn't seem as robust as the rest of the 
Cyrus distribution so we've had several false starts in transferring our 
mailboxes. The Cyrus 2.3 xfer code may be better but I haven't tried it.

Regards,


Dave.

David Mayo
Networks/Systems Administrator
University of Bath Computing Services, UK

