Cyrus Replication (example) [was Re: restore from cyrdump]

Nic Bernstein nic at onlight.com
Fri Dec 19 10:31:00 EST 2014


On 12/19/2014 06:17 AM, Patrick Goetz wrote:
> Nic,
>
> Thanks for that detailed explanation.  I still feel myself somewhat
> stymied by either the documentation (or lack thereof) or perhaps an
> unfortunate case of being somewhat feeble-minded.  Here are some follow
> up comments/questions:
>
>
> On 12/18/2014 9:59 AM, Nic Bernstein wrote:
>> I will say that the ability to quiesce the application without halting
>> it would be most desirable.  Most databases have supported this sort of
>> thing for ages, and it would be great if one could send a signal to
>> Cyrus to achieve the same result.
> I wonder what would happen if you just stopped lmtp while making a
> snapshot?  Would postfix choke on this and start kicking messages back
> to the sender, or would they get queued for later delivery?
> Alternatively, maybe lmtp could temporarily divert new messages to a
> dummy spool so that postfix/sendmail wouldn't have to know anything
> about this.  This might be the least painful way to implement quiescence
> in cyrus.

But LMTP is only one method affecting the mail store; IMAP and sieve can
as well.  Granted, one can brute-force this by shutting down network
ports and the like, but at that point why not just stop cyrus?

>   > His initial suggestion -- stop cyrus, snapshot, restart cyrus -- is
>   > reasonable, but we feel that the later suggestion -- stop cyrus, tar
>   > up data, start cyrus -- is not.  It takes data offline for too long.
>   > That's why the snapshot capability is necessary in any truly suitable
>   > server.
>
> I agree.  Here is a substitute proposal (and I'll come back to why I'm
> pushing this point).  Serially
>
>     1. rsync user mail files
>     2. rsync configdirectory db files
>     3. rsync user mail files again
>
> That should get you reasonably close to what you get with snapshots.

No, not in the least is this close to a snapshot.  Snapshots are 
instantaneous, or near to it.  The time an rsync takes, even a catch-up, 
grows with the size of the mail store and the deltas between attempts.  
Also, rsync is not well suited to the file-per-message, 
directory-per-mailbox storage scheme of cyrus, as lots of fstats() 
result, and this just adds to the time.

I don't understand why one wouldn't use snapshots.  Every modern OS and
distro includes filesystems or volume managers which support
snapshotting, and several, such as Ubuntu, even recommend 
snapshot-capable partitioning schemes out of the box.  It's just not 
that hard, and it's exactly the right way to handle this sort of staged 
backup.

  * Halt cyrus
  * snapshot critical filesystems
      o spool data (/var/spool/imap)
      o config data (/var/lib/imap or /var/imap)
      o metadata (e.g. /var/run/cyrus)
  * start cyrus
  * mount snapshot
  * rsync or otherwise backup from snapshot
  * unmount snapshot
  * (optionally) destroy snapshot

This is so easy to handle via a cron or at job.  Why wouldn't one do
this?  If the answer is "legacy system," then fine, but legacies can be
upgraded or replaced.
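
For illustration, here is a minimal sketch of that cycle using LVM.  It
is not our production script.  The volume group, logical volume, mount
point, rsync destination, snapshot size and the "cyrus-imapd" service
name are all assumptions, and it assumes the spool, config and metadata
directories live on a single logical volume (repeat the snapshot steps
per volume if they don't).  By default it only prints the commands; set
DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch of the halt / snapshot / restart / back up cycle using LVM.
# Every name below is an assumption; substitute your own.
DRY_RUN="${DRY_RUN:-1}"          # 1 = only print the commands
VG="vg0"                         # hypothetical volume group
LV="imapspool"                   # hypothetical LV holding the cyrus data
SNAP="imapspool-snap"            # name given to the snapshot
MNT="/mnt/imap-snap"             # where the snapshot gets mounted
DEST="backuphost:/backup/imap/"  # hypothetical rsync target

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

snapshot_backup() {
    run service cyrus-imapd stop       # halt cyrus (service name varies)
    run lvcreate --snapshot --size 5G --name "$SNAP" "/dev/$VG/$LV"
    rc=$?
    run service cyrus-imapd start      # restart even if the snapshot failed
    [ "$rc" -eq 0 ] || return 1        # no snapshot, nothing to back up
    run mkdir -p "$MNT"
    run mount -o ro "/dev/$VG/$SNAP" "$MNT"
    run rsync -a --delete "$MNT/" "$DEST"
    run umount "$MNT"
    run lvremove -f "/dev/$VG/$SNAP"   # optional: destroy the snapshot
}

snapshot_backup
```

Cyrus is only down between the first and third commands; the slow part,
the rsync, runs against the snapshot while mail is flowing again.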

> If you follow the prescribed cyrus directory structure, then this can be
> simplified (Arch Linux example):
>
>     1. rsync -a --delete /var/imap/user [removable disk/other server]
>     2. rsync -a --delete /var/imap   [removable disk/other server]
>
> Once you've rsynced the mail files once, rsyncing them again a short
> time later should be pretty fast.  There does need to be a backup
> solution for people who only have one server, hence can't use
> replication or imapsync to do backups.

There is: snapshots, or hosted mail services (like Fastmail :).

>> Lastly, as to the use of imapsync to achieve user, mailbox or server
>> replication,...
>>
>> So your command line is much like Patrick's example, but with '--user1
>> <user> --authuser1 <proxyuser> --user2 <user>...'
>> Of course you must create a proxy user, and Cyrus supports this with the
>> 'proxyservers' directive in imapd.conf (man imapd.conf for details),
>> i.e.: 'proxyservers:    proxyuser'.
> Here is the imapd.conf man page entry for proxyservers:
>
>     proxyservers: <none>
>       A list of users and groups that are allowed to proxy for other
>       users, separated by spaces. Any user listed in this will be
>       allowed to login for any other user: use with caution. In a
>       standard murder this option should ONLY be set on backends.
>       DO NOT SET on frontends or things won't work properly.
>
> That capitalized "DO NOT SET on frontends" would seem to be cause for
> concern, especially since I don't understand how this works.

Well then, get thee to a website or man page. :-)
     http://cyrusimap.web.cmu.edu/docs/cyrus-imapd/2.4.17/ag.php

No, seriously, this isn't an issue if you're not using a murder.  A 
"frontend" is the part of a murder aggregation cluster which proxies for 
the backend servers which actually hold the mail store.  A murder 
consists of one or more frontends, one or more backends and a single 
"mupdate" master, which controls the canonical copy of the mailboxes 
database.  In a murder, if one wants to set the proxyservers option, one 
sets it only on the backend machines.

The proxyservers option is exactly the right way to do this.
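
As a rough sketch of what such an imapsync invocation could look like
once a proxy user is listed in proxyservers on both servers: the host
names, the "proxyuser" account, the example mailbox owner and the
password-file paths below are all made up for illustration.  By default
the script only prints the command; set DRY_RUN=0 to run it.

```shell
#!/bin/sh
# Sketch: copy one user's mail with imapsync, authenticating as a proxy
# user.  Hosts, proxy user and password-file paths are assumptions.
DRY_RUN="${DRY_RUN:-1}"               # 1 = only print the command
HOST1="old.example.com"               # hypothetical source server
HOST2="new.example.com"               # hypothetical destination server
PROXYUSER="proxyuser"                 # must appear in proxyservers: on both
PASSFILE1="/root/.imapsync-pass1"     # file holding the proxy password (host1)
PASSFILE2="/root/.imapsync-pass2"     # file holding the proxy password (host2)

# sync_user USER: build the imapsync command line for one mailbox owner.
sync_user() {
    user="$1"
    set -- imapsync \
        --host1 "$HOST1" --user1 "$user" \
        --authuser1 "$PROXYUSER" --passfile1 "$PASSFILE1" \
        --host2 "$HOST2" --user2 "$user" \
        --authuser2 "$PROXYUSER" --passfile2 "$PASSFILE2"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

sync_user "${1:-jane}"                # "jane" is a placeholder user
```

The point is that only the proxy user's password is ever needed, so no
PAM changes or firewall tricks are required.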

> For people who are
>    1. imapsync'ing between machines both behind a firewall
>    2. using saslauthd with pam
>
> I thought of this solution:  Temporarily block port 143 traffic on the
> outward facing port of your firewall, and then add the line
>
>     auth  sufficient  pam_permit.so
>
> to the top of /etc/pam.d/imap files on both the sending and receiving
> imap servers.  This should allow you to imapsync the mail stores for
> every user without having to provide passwords.  Once you're done,
> simply remove these lines from the PAM configuration files and unblock
> the port on the firewall.  Yes, this will mean that users won't be able
> to access their mail from outside the firewall while the imapsync is in
> operation, and this is probably only workable for smaller organizations
> where people are not concerned about their coworkers temporarily being
> able to access their mail.  There could probably be a desktop policy to
> handle this as well.

Ouch, that seems a lot harder to me than setting proxyservers.

> However, you are 100% correct that replication would appear to be a far
> less complex solution.  After reading through the available
> documentation, it wasn't clear to me that it was possible to do
> replication without setting up a murder, a complexity I was hoping to avoid.
>
> So, here's the feeble-mindedness component:  I didn't completely follow
> your explanation for setting up a replication server.  It would be
> awesome to have a howto for doing this -- is anyone aware of anything
> like this; i.e. howto set up a replication server outside the murder
> context.

Then please take a look at the replication page on the Project Cyrus 
website:
http://cyrusimap.org/docs/cyrus-imapd/2.4.17/install-replication.php

Here's my earlier example with the murder components stripped out, and 
some commenting added:

Both servers (note last entry):

/etc/services

    lmtp		24/tcp
    imap2		143/tcp
    imap2		143/udp
    imaps		993/tcp
    imaps		993/udp
    sieve		4190/tcp
    csync		2005/tcp

Master server:

/etc/imapd.conf

    ...
    ##
    # These configuration parameters are for the master server
    # in a replication set

    # The list of userids with administrative rights
    admins: cyrus

    ##
    # Replication support
    # This is how the replica for this host is defined
    sync_host: replica.example.com
    sync_authname: mailproxy
    sync_password: <password>
    sync_realm: <if required for your auth scheme>

    # Whether to compress the replication stream, important if using WAN links
    sync_compress: true

    # To enable "rolling" replication, set this to TRUE
    # This causes all data altering daemons, such as imapd, lmtpd, etc. to log their
    # actions for replication.
    sync_log: true

    # Minimum interval (in seconds) between replication runs in rolling replication mode.
    sync_repeat_interval: 5

    # A file whose existence will cause the sync_client to stop at its next opportunity
    sync_shutdown_file: /var/run/cyrus/sync_stop
    ...

/etc/cyrus.conf

    ...
    START {
    	...
    	syncclient		cmd="/usr/lib/cyrus/bin/sync_client -r"
    	...

Replica server:

/etc/imapd.conf

    ...
    ##
    # These configuration parameters are for the replica server in a
    # replication cluster

    # The list of userids with administrative rights
    # For a replica, this must include the user with which the master
    # will authenticate
    admins: cyrus mailproxy

    ##
    # Unless you're using TLS between master and replica, add this
    force_sasl_client_mech: PLAIN
    master_mechs: PLAIN

/etc/cyrus.conf

    ...
    SERVICES {
    	...
    	syncserver       cmd="/usr/lib/cyrus/bin/sync_server" listen="csync"
    	...

Here are some extra notes:

  * The webpage listed above on replication explains rolling replication
    (think "log shipping" from the DB world) as well as manual
    replication.  Check that out.
  * We find that it doesn't hurt to use both rolling and periodic
    replication, and have cron handle the latter.
  * If the master stops listening for csync traffic, when halted for a
    snapshot, for example, then the sync_server process on the replica
    will die.  So, we use a nanny cronjob to make sure that one gets
    started if none are running.

Here are our crontabs for master and replica:

Master:

    ### Ensure replication is up to date
    30 5 * * * /usr/local/sbin/cyrus_user_sync.pl >/dev/null 2>&1
    ##
    ### Run quota check script
    30 6 * * * /usr/local/sbin/quota-report >/dev/null 2>&1
    ##
    ### Update mailbox annotations
    45 6 * * * /usr/local/sbin/set_cyrus_annotations.sh >/dev/null 2>&1
    ##
    ### Update quotas
    */5 * * * * /usr/local/sbin/cyrus_ldap_quota.pl >/dev/null 2>&1

Replica:

    ##
    # ensure that the sync_client keeps running.  Comment this out
    # following promotion from replica to master.
    @hourly	/usr/local/sbin/sync_nanny.sh >/dev/null

We'll be happy to share these scripts with anyone who'd care to have a 
copy, but they might be specific to our use of LDAP to manage account 
details.  The idea of each, however, is to leverage the account DB, 
which in our case is almost always LDAP, to maintain, update or alter 
the cyrus account information.
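
To give a flavor of the nanny idea without the LDAP baggage, here is a
generic sketch.  It is not our actual sync_nanny.sh; the function name
is invented for illustration, and the binary path in the usage comment
is simply the one from the cyrus.conf example above.  It relies on
pgrep being available.

```shell
#!/bin/sh
# Sketch of a nanny: start a command only if no process of that name
# exists.  Not the real sync_nanny.sh; names here are assumptions.

# ensure_running NAME CMD [ARGS...]
# Runs CMD with ARGS when pgrep finds no process named exactly NAME.
ensure_running() {
    name="$1"
    shift
    if pgrep -x "$name" >/dev/null 2>&1; then
        return 0                # already running, nothing to do
    fi
    "$@"                        # not running: start it
}

# From a cron-driven script one might call, for example:
#   ensure_running sync_client /usr/lib/cyrus/bin/sync_client -r
```

Run it as the cyrus user, like everything else that touches the mail
store.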

>> However, I must be honest and point out that if you're going to go to
>> the trouble of figuring out how to use imapsync (and possibly pay for
>> it, to boot) you may as well just set up a replica.  As I've shown,
>> above, it's just not that hard.
> Imapsync is still useful for migrating individual users from one imap
> server to another.  In my case, I'm migrating from a cyrus 2.3.x server
> using Berkeley db metadata files to a cyrus 2.4.x server which will be
> entirely skiplist based.  Understood that you can convert db files to
> skiplists, but I feel most comfortable using imapsync for this.  In this
> use case there are only a handful of users, but they all have extremely
> complex and massive mail folders.
>
> My current plan is to use imapsync for the migration and then
> replication to another dummy server for backup, assuming I can figure
> out how to set up replication.

I strongly recommend against this course of action.  If you're migrating 
between two boxes, which it sounds like you are, then you're much better 
off rsyncing the spool data between them (once you've halted cyrus) and 
then allowing cyrus to perform the necessary DB updates.

Check the Install-Upgrades page for anything else which changes between 
your versions of cyrus.  Since you didn't specify which 2.3.x or 2.4.x 
you're using, I can't tell you what you'll need, but you'll find that 
info in doc/install-upgrade.html of your version. If you're installing 
from packages this may not be included, so do yourself a favor and 
download a copy for reference.

As the upgrade guide states (emphasis added):

    The default type for all databases is now skiplist which is very
    reliable now, all the bugs are ironed out! *Because ctl_cyrusdb -r
    automatically converts databases between known types, you shouldn't
    need to do anything*, but if you want to keep the old defaults,
    you'll need to make them explicit in your imapd.conf as follows:

    duplicate_db: berkeley-nosync
    ptscache_db: berkeley
    statuscache_db: berkeley-nosync
    tlscache_db: berkeley-nosync

You have said you want skiplist, so you needn't add those settings, just 
make sure you remove any that exist if you copy your old imapd.conf file 
over.

If you prefer to manually convert the DB files, you can do this with the 
supplied cvt_cyrusdb tool:

    $ /usr/lib/cyrus/bin/cvt_cyrusdb /tmp/annotations.db berkeley /var/lib/imap/annotations.db skiplist

    or for Ubuntu
    $ cyrus cvt_cyrusdb /tmp/annotations.db berkeley /var/lib/imap/annotations.db skiplist

Note that in this case, you should NOT rsync the DB files into the new 
server's /var/lib/imap (or whatever your config directory is) but rather 
into a holding area, like /tmp, from which you can read them for the DB 
conversion.

Also, make sure you do all of this as the cyrus user, or you'll end up 
with permissions problems.
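
If you have several files to convert, a small loop saves typing.  This
is only a sketch built on the command shown above: the holding
directory, the config directory and the list of file names are
assumptions, so check which of your databases were actually Berkeley
before running it.  By default it only prints the commands; set
DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch: convert Berkeley DB files from a holding area to skiplist.
# Directories and the file list are assumptions; adjust to your setup.
DRY_RUN="${DRY_RUN:-1}"               # 1 = only print the commands
CVT="/usr/lib/cyrus/bin/cvt_cyrusdb"  # path as used in the example above
HOLDING="/tmp"                        # where the old DB files were copied
CONFDIR="/var/lib/imap"               # the new server's config directory

# convert_db FILE: convert HOLDING/FILE (berkeley) to CONFDIR/FILE (skiplist).
convert_db() {
    set -- "$CVT" "$HOLDING/$1" berkeley "$CONFDIR/$1" skiplist
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

for db in annotations.db; do          # add further file names as needed
    convert_db "$db"
done
```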

Good luck!
     -nic

>
> Thanks again for your helpful comments!

-- 
Nic Bernstein                             nic at onlight.com
Onlight, Inc.                             www.onlight.com
219 N. Milwaukee St., Suite 2a            v. 414.272.4477
Milwaukee, Wisconsin  53202
