After MOVE - I propose DUMP/RESTORE format

Bron Gondwana brong at fastmail.fm
Sun Dec 2 18:03:07 EST 2012


Now that MOVE is settling down, I really want to explore a vendor-neutral
DUMP and RESTORE format for IMAP mail stores.

I haven't given up on the IMAP5 concept - though in discussion with Arnt
recently we have both come to the conclusion that another stateful long-lived
protocol is probably not right.  What would be more useful to those of us
trying to build large clustered solutions is a protocol which is more batch
based, so it could be used in a connectionless way, without needing a complex
stateful proxy in the middle.

I'll come back to that.


Anyway - DUMP/RESTORE and Replication.

David (cc'd) did the initial work for replication in Cyrus, and I've built
on that with a new protocol in Cyrus 2.4.  It has some warts though - it
exposes far too many Cyrus implementation specifics.

I want to use the same concepts for a backup format which is durable,
incremental, and can be restored in such a way that a client which
connects again later cannot tell that it's connecting to a restored
copy of the mailbox rather than the original mailbox.

I would also change our replication protocol for Cyrus to be based on
making synthetic incremental backups and applying them to the replica -
so that in theory any server which supported the DUMP/RESTORE format
would be able to be a replication source or destination.

Why do I think this is useful?
==============================

For users:
* ability to move between mail providers more easily (just import and
  export mailboxes)

For providers:
* ability to switch server software.
* ability to take efficient incremental backups which can be restored
  exactly as taken (at least in the case of Cyrus, this doesn't currently
  exist - you can't backup an entire mailbox at a point in time without
  having to reconstruct afterwards to clean up the mess - it doesn't do
  snapshots)

For us at FastMail/Opera:
* replace our custom-built Cyrus backup tool which does under-the-hood
  locking magic with something standard.


How Cyrus Replication works:
============================

We take advantage of the MODSEQ values from CONDSTORE/QRESYNC to work
out what needs to be sent to bring the replica up to date.  At FastMail
we test our replicas once per week by running a set of IMAP queries
against both the master and replica (repeating multiple times in case
of mismatch so we don't get too many false positives).  These queries
are designed to exercise all the interesting fields, so that we know
the replication protocol is working correctly.

To ensure integrity, we also calculate a "SYNC_CRC" value.  This is the
XOR of a CRC32 for each UID in the mailbox.  The CRC32 for the individual
UIDs is calculated on a string formatted from all the mutable fields of
the message.  This allows efficient updates, because you can just calculate
the old and new values on each change, and XOR them together to get the
new SYNC_CRC value.
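
As a rough sketch of the XOR trick described above (not the actual Cyrus
code): the string that gets hashed per UID is invented here, and the choice
of uid, modseq, flags and internaldate as "the mutable fields" is an
assumption made only for illustration.

```python
import zlib


def record_crc(uid, modseq, flags, internaldate):
    """CRC32 over a string built from one message's mutable fields.

    The field list and the string layout are assumptions for this sketch,
    not the layout Cyrus really uses.
    """
    line = "%d %d (%s) %d" % (uid, modseq, " ".join(sorted(flags)), internaldate)
    return zlib.crc32(line.encode("utf-8")) & 0xFFFFFFFF


def mailbox_crc(records):
    """Full recalculation: XOR of the per-record CRCs."""
    crc = 0
    for rec in records:
        crc ^= record_crc(**rec)
    return crc


def update_crc(sync_crc, old, new):
    """Incremental update: XOR out the old record, XOR in the new one.

    Pass old=None for an append and new=None for a removal.
    """
    if old is not None:
        sync_crc ^= record_crc(**old)
    if new is not None:
        sync_crc ^= record_crc(**new)
    return sync_crc
```

Because XOR is its own inverse, update_crc() gives the same answer as
running mailbox_crc() over the whole mailbox again, without touching the
records that did not change.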

In the case of a SYNC_CRC mismatch, we download all the metadata (similar
to FETCH 1:* (UID FLAGS INTERNALDATE MODSEQ)) for every record and
calculate what changes need to be made on the replica to bring it back into
sync.  This is where UID promotion happens if the content of the message
is different.  We do mismatch detection with a GUID field (actually
DIGEST.SHA1 on the RFC822 body).
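
A minimal sketch of that comparison step, assuming both sides have already
been fetched into dictionaries keyed by UID.  The function name, the record
layout and the four result buckets are hypothetical; what to do with each
bucket (including how UID promotion is carried out) is deliberately left
out, since that is policy rather than comparison.

```python
def plan_resync(master, replica):
    """Compare per-UID metadata from both ends.

    master and replica map UID -> {"guid": str, "flags": set, "modseq": int}.
    Returns four sorted UID lists:
      missing       - on the master only, needs uploading to the replica
      replica_only  - on the replica only
      guid_mismatch - same UID on both ends but different message content
      stale         - same content, but flags or modseq differ
    """
    missing, replica_only, guid_mismatch, stale = [], [], [], []
    for uid in sorted(set(master) | set(replica)):
        m, r = master.get(uid), replica.get(uid)
        if r is None:
            missing.append(uid)
        elif m is None:
            replica_only.append(uid)
        elif m["guid"] != r["guid"]:
            guid_mismatch.append(uid)
        elif (set(m["flags"]), m["modseq"]) != (set(r["flags"]), r["modseq"]):
            stale.append(uid)
    return missing, replica_only, guid_mismatch, stale
```

The guid_mismatch bucket is the case where the content of the message is
different, so those are the UIDs that would be candidates for promotion.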

Here's an example from the wire:

<1354488670<COMPRESS DEFLATE
>1354488670>OK DEFLATE active
<1354488670<SET OPTIONS %(CRC_VERSION 2)
>1354488670>OK success

Creating two mailboxes:

<1354488685<GET MAILBOXES (user.foo user.foo.subdir)
>1354488685>OK success
<1354488685<APPLY MAILBOX %(UNIQUEID fa8673d7-471e-4d5b-9c7f-f3ee64106447 MBOXNAME user.foo LAST_UID 0 HIGHESTMODSEQ 2 RECENTUID 0 RECENTTIME 0 LAST_APPENDDATE 0 POP3_LAST_LOGIN 0 POP3_SHOW_AFTER 0 UIDVALIDITY 1354488684 PARTITION default ACL "foo lrswipkxtecdan  admin   lrswipkxtecd    " OPTIONS P SYNC_CRC 0 RECORD ())
>1354488685>OK success
<1354488685<APPLY MAILBOX %(UNIQUEID b2381db2-2e89-4a35-92d5-5ce55bf9fc4d MBOXNAME user.foo.subdir LAST_UID 0 HIGHESTMODSEQ 1 RECENTUID 0 RECENTTIME 0 LAST_APPENDDATE 0 POP3_LAST_LOGIN 0 POP3_SHOW_AFTER 0 UIDVALIDITY 1354488684 PARTITION default ACL "foo  lrswipkxtecdan  " OPTIONS P SYNC_CRC 0 RECORD ())
>1354488685>OK success

Appending two messages.  First we query the remote end to
see what's there now:

<1354488687<GET MAILBOXES (user.foo)
>1354488687>* MAILBOX %(UNIQUEID fa8673d7-471e-4d5b-9c7f-f3ee64106447 MBOXNAME user.foo LAST_UID 0 HIGHESTMODSEQ 2 RECENTUID 0 RECENTTIME 0 LAST_APPENDDATE 0 POP3_LAST_LOGIN 0 POP3_SHOW_AFTER 0 UIDVALIDITY 1354488684 PARTITION default ACL "foo     lrswipkxtecdan  admin   lrswipkxtecd    " OPTIONS P SYNC_CRC 0)
OK success

Then we try to reserve those messages (this is a waste of
time really, it shouldn't bother trying, that's a bug):

<1354488687<APPLY RESERVE %(PARTITION default MBOXNAME (user.foo) GUID (196922b6d822b618c665874fb523b9058a0adb56 ec4a76ae5e5f772dee837494134c75069286623a))
>1354488687>* MISSING (196922b6d822b618c665874fb523b9058a0adb56 ec4a76ae5e5f772dee837494134c75069286623a)
OK success
<1354488687<APPLY MESSAGE (%{default 196922b6d822b618c665874fb523b9058a0adb56 92}
From: test <test at example.com>
To: test <test at example.com>

Some stuff in the body...
.
 %{default ec4a76ae5e5f772dee837494134c75069286623a 372}
Return-Path: <brong at brong.net>
Received: from local (slot2 [127.0.0.52])
         by test_slot2_4092 (Cyrus git2.5+0) with LMTPA;
         Sun, 02 Dec 2012 23:51:26 +0100
X-Sieve: CMU Sieve 2.4
From: test <test at example.com>
To: test <test at example.com>
Message-ID: <cmu-lmtpd-4251-1354488686-0 at test_slot2_4092>
Date: Sun, 02 Dec 2012 23:51:26 +0100

Some stuff in the body...
)
>1354488687>OK success

And finally, now that the messages are spooled, we update the
mailbox view.

<1354488687<APPLY MAILBOX %(UNIQUEID fa8673d7-471e-4d5b-9c7f-f3ee64106447 MBOXNAME user.foo LAST_UID 2 HIGHESTMODSEQ 5 RECENTUID 0 RECENTTIME 0 LAST_APPENDDATE 1354488687 POP3_LAST_LOGIN 0 POP3_SHOW_AFTER 0 UIDVALIDITY 1354488684 PARTITION default ACL "foo        lrswipkxtecdan  admin   lrswipkxtecd    hello   lrswipkxtecd    " OPTIONS P SYNC_CRC 31431471 RECORD (%(UID 1 MODSEQ 4 LAST_UPDATED 1354488686 FLAGS (\Flagged) INTERNALDATE 1268029091 SIZE 92 GUID 196922b6d822b618c665874fb523b9058a0adb56) %(UID 2 MODSEQ 5 LAST_UPDATED 1354488687 FLAGS () INTERNALDATE 1354488686 SIZE 372 GUID ec4a76ae5e5f772dee837494134c75069286623a)))
>1354488687>OK success

This is somewhat wasteful - it should be able to cache the remote
state and speculatively calculate a diff against what it expects to
be there and upload that as a single incremental dump.   In the usual
case that the other end hasn't changed, it could just apply that dump
and return the successful result.

NOTE though: SYNC_CRC 31431471.  The remote end calculates that after
applying these changes, and returns "OK" because it matched.

You can also see plenty of the warts there - both the %() syntax and things
like POP3_LAST_LOGIN and OPTIONS which are horribly vendor-specific.

Interesting bits to consider:
=============================

When backing up an entire user, you definitely want message de-duplication.
For FastMail, we delay EXPUNGE for a week, and also back up the EXPUNGEd
messages so that our "restore from backup" feature can always find all
messages EXPUNGEd in the last week, even if we lose a server.

So this means there needs to be a way to cross-reference a message in
another mailbox.  Cyrus uses SHA1 as the GUID, and uploads messages with
that identifier first before applying the rest of the changes.  For a
standard I believe we don't want to use a hash that's already on its way out
as the default, so we should look at an alternative to this.
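
To make the de-duplication idea concrete, here is a small sketch of a
writer that stores each message body once and records a cross-reference
for every (mailbox, UID) that points at it.  The class name and layout are
hypothetical, and SHA-256 is used purely as a stand-in for "some newer
hash"; the choice of hash for the format is still open.

```python
import hashlib


class DedupWriter:
    """Collects message bodies once per digest, plus per-mailbox XREFs."""

    def __init__(self):
        self.bodies = {}   # digest -> raw message bytes, stored once
        self.records = []  # (mailbox, uid, digest) for every message seen

    def add(self, mailbox, uid, raw):
        """Record one message; return True if its body was new."""
        # SHA-256 is an assumption of this sketch, not a decided default.
        digest = hashlib.sha256(raw).hexdigest()
        is_new = digest not in self.bodies
        if is_new:
            self.bodies[digest] = raw
        self.records.append((mailbox, uid, digest))
        return is_new
```

A message copied from user.foo into user.foo.subdir then costs one body and
two small records, which is the same shape as uploading by GUID first and
applying the mailbox changes afterwards.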

There needs to be a way to read just the essential metadata from a
backup file quickly (that is uidvalidity,lastuid,highestmodseq) to
calculate which UIDs need their data included in the new backup, and
which also need their message bodies or XREFs included - similar to
the "GET MAILBOXES ()" query in the example above.
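
A sketch of how those three values could drive an incremental backup,
assuming the server has CONDSTORE-style MODSEQs.  The function and the
dictionary keys are made up for the example.

```python
def plan_incremental(prev, uidvalidity, records):
    """Decide what goes into the next backup of one mailbox.

    prev is the header read from the previous backup, a dict with
    "uidvalidity", "lastuid" and "highestmodseq", or None if there is no
    previous backup.  records is a list of {"uid": int, "modseq": int}
    describing the mailbox as it is now.

    Returns (full, metadata_uids, body_uids).
    """
    if prev is None or prev["uidvalidity"] != uidvalidity:
        # No usable base to diff against: everything must be included.
        uids = sorted(r["uid"] for r in records)
        return True, uids, uids
    # Anything touched since the last backup needs its metadata included.
    metadata_uids = sorted(
        r["uid"] for r in records if r["modseq"] > prev["highestmodseq"]
    )
    # Only UIDs that did not exist last time need a body or an XREF.
    body_uids = sorted(r["uid"] for r in records if r["uid"] > prev["lastuid"])
    return False, metadata_uids, body_uids
```

Nothing beyond the header of the old backup has to be read to make this
decision, which is the reason for wanting those fields quickly accessible.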

It also makes sense to have the format allow including an entire user's
mailboxes rather than doing each mailbox individually, since the XREFs
would otherwise be across backup files.

The format needs to both support every piece of data needed for every
current extension, and be extensible enough that new extensions' data
can be added.  Things I can think of immediately are METADATA/ANNOTATION
information, and what we in Cyrus call DELETEDMODSEQ - strictly, the
MODSEQ of the last EXPUNGEd message for which you have forgotten the
metadata.  Without this, you can't efficiently reply to QRESYNC queries,
because you need to tell the client about every gap in the UID sequence
in case a message in there went away.
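
A simplified sketch of why DELETEDMODSEQ matters for QRESYNC.  It assumes
the server keeps a record (UID plus MODSEQ) for recently EXPUNGEd messages
and has forgotten older ones; the function name and arguments are
hypothetical, and real servers have more cases to deal with.

```python
def vanished_uids(client_modseq, deletedmodseq, expunged, live_uids, last_uid):
    """UIDs to report as vanished to a client that last synced at client_modseq.

    expunged maps UID -> MODSEQ for expunges the server still remembers.
    deletedmodseq is the MODSEQ of the newest expunge that was forgotten.
    """
    if client_modseq >= deletedmodseq:
        # Every expunge since the client's last sync is still on record,
        # so the answer can be exact.
        return sorted(u for u, m in expunged.items() if m > client_modseq)
    # Some expunge the client has not seen was forgotten: fall back to
    # reporting every gap in the UID sequence.
    live = set(live_uids)
    return [u for u in range(1, last_uid + 1) if u not in live]
```

A restored mailbox that came back without this value would have to take the
expensive fallback path for every client, which is why it belongs in the
format.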

Thank you:
==========

If you've read this far, thank you!  My goal is a format which captures
every piece of data which is required for any client connecting to the
server to be unable to see that it's a different server than previously
after a DUMP/RESTORE (assuming the server supports the same extensions
of course).  Anything which can be re-parsed from the message RFC822
doesn't belong in this format, only fields which are mutable (like FLAGS),
set externally (like INTERNALDATE), or necessary metadata about the past
(like MODSEQ and friends).

The biggest question facing me up front - what does it look like on the
disk/wire?  The Cyrus protocol at the moment looks almost like IMAP, and
parses almost like IMAP - with the added warts that it uses %() to
designate a list with key/value pairs rather than a list of items, and
it uses %{partition sha1 size} rather than {size+} to designate rfc822
messages.  That is clearly bogus for a generally applicable protocol.
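
For anyone who wants to poke at the APPLY MAILBOX lines earlier in this
mail, here is a small sketch of a reader for the %() syntax.  It treats
%( ... ) as key/value pairs and ( ... ) as a plain list, keeps every atom
as a string, and does not attempt the %{partition sha1 size} literals.
The function names are made up, and this is not how Cyrus itself parses
the protocol.

```python
import re

# One token: "%(", "(", ")", a quoted string, or a bare atom.
TOKEN = re.compile(r'\s*(%\(|\(|\)|"(?:[^"\\]|\\.)*"|[^\s()"]+)')


def tokenize(text):
    """Split a line into tokens; raises ValueError on unparseable input."""
    text = text.rstrip()
    pos, tokens = 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError("cannot tokenize at offset %d" % pos)
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse(tokens, i=0):
    """Parse one value starting at tokens[i]; return (value, next_index)."""
    tok = tokens[i]
    if tok in ("(", "%("):
        items, i = [], i + 1
        while tokens[i] != ")":
            value, i = parse(tokens, i)
            items.append(value)
        i += 1  # step over the closing ")"
        if tok == "(":
            return items, i
        if len(items) % 2:
            raise ValueError("odd number of items in a %() list")
        return dict(zip(items[0::2], items[1::2])), i
    if tok == ")":
        raise ValueError("unexpected )")
    if tok.startswith('"'):
        # Drop the quotes and undo backslash escapes.
        return re.sub(r"\\(.)", r"\1", tok[1:-1]), i + 1
    return tok, i + 1


def parse_kv(text):
    """Parse a single complete value and insist nothing is left over."""
    tokens = tokenize(text)
    value, i = parse(tokens)
    if i != len(tokens):
        raise ValueError("trailing tokens after the first value")
    return value
```

Feeding it the part of an APPLY MAILBOX line that follows the command name
gives a dictionary whose RECORD entry is a list of per-UID dictionaries.
That it takes a custom reader at all is rather the point about warts.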

The list of fields to include is quite clear - the only real consideration
is whether to support backups without MODSEQ information in them.  They
make incremental backups a lot harder, since you have to read all the
UID records from the old backup and compare them to the current values to
determine if anything has changed (like naive clients doing FETCH 1:*).
I would like a backup format that can support ANY server though, and be
built over regular IMAP by a standalone tool.

Regards,

Bron.
-- 
  Bron Gondwana
  brong at fastmail.fm


