Cyrus Replication (example) [was Re: restore from cyrdump]
Nic Bernstein
nic at onlight.com
Fri Dec 19 10:31:00 EST 2014
On 12/19/2014 06:17 AM, Patrick Goetz wrote:
> Nic,
>
> Thanks for that detailed explanation. I still feel myself somewhat
> stymied by either the documentation (or lack thereof) or perhaps an
> unfortunate case of being somewhat feeble-minded. Here are some follow
> up comments/questions:
>
>
> On 12/18/2014 9:59 AM, Nic Bernstein wrote:
>> I will say that the ability to quiesce the application without halting
>> it would be most desirable. Most databases have supported this sort of
>> thing for ages, and it would be great if one could send a signal to
>> Cyrus to achieve the same result.
> I wonder what would happen if you just stopped lmtp while making a
> snapshot? Would postfix choke on this and start kicking messages back
> to the sender, or would they get queued for later delivery?
> Alternatively, maybe lmtp could temporarily divert new messages to a
> dummy spool so that postfix/sendmail wouldn't have to know anything
> about this. This might be the least painful way to implement quiescence
> in cyrus.
But LMTP is only one method affecting the mail store; IMAP and sieve can
as well. Granted, one can brute-force this by shutting down network
ports and the like, but at that point why not just stop cyrus?
> > His initial suggestion -- stop cyrus, snapshot, restart cyrus -- is
> > reasonable, but we feel that the later suggestion -- stop cyrus, tar
> > up data, start cyrus -- is not. It takes data offline for too long.
> > That's why the snapshot capability is necessary in any truly suitable
> > server.
>
> I agree. Here is a substitute proposal (and I'll come back to why I'm
> pushing this point). Serially
>
> 1. rsync user mail files
> 2. rsync configdirectory db files
> 3. rsync user mail files again
>
> That should get you reasonably close to what you get with snapshots.
No, not in the least is this close to a snapshot. Snapshots are
instantaneous, or near to it. The time an rsync takes, even a catch-up,
grows with the size of the mail store and the deltas between attempts.
Also, rsync is not well suited to the file-per-message,
directory-per-mailbox storage scheme of cyrus, as it results in a great
many fstat() calls, and this just adds to the time.
I don't understand why one wouldn't use snapshots. Every modern OS and
distro includes filesystems or volume managers which support
snapshotting, and several, such as Ubuntu, even recommend
snapshot-capable partitioning schemes out of the box. It's just not
that hard, and it's exactly the right way to handle this sort of staged
backup.
* Halt cyrus
* snapshot critical filesystems
    o spool data (/var/spool/imap)
    o config data (/var/lib/imap or /var/imap)
    o metadata (i.e. /var/run/cyrus)
* start cyrus
* mount snapshot
* rsync or otherwise backup from snapshot
* unmount snapshot
* (optionally) destroy snapshot
This is so easy to handle via a cron or at job. Why wouldn't one do this?
If the answer is "legacy system," then fine, but legacies can be
upgraded or replaced.
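The steps listed above can be sketched as a small shell script. This is a minimal sketch assuming LVM: the volume group (vg0), logical volume (imap), snapshot size, mount point, backup destination and init service name (cyrus-imapd) are all placeholders, not values from this thread. It handles a single volume; the config and metadata filesystems would need the same treatment. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: every name below is a hypothetical placeholder.
# Commands are printed, not executed, unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
SERVICE="${SERVICE:-cyrus-imapd}"   # init service name (assumption)
VG="${VG:-vg0}"                     # volume group (assumption)
LV="${LV:-imap}"                    # LV holding the mail spool (assumption)
SNAP="${SNAP:-imap-snap}"
SNAP_SIZE="${SNAP_SIZE:-5G}"
MNT="${MNT:-/mnt/imap-snap}"
DEST="${DEST:-backup.example.com:/backups/imap/}"

# Execute the command, or just print it in dry-run mode.
run() {
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

snapshot_backup() {
    run service "$SERVICE" stop || return 1
    run lvcreate --snapshot --size "$SNAP_SIZE" --name "$SNAP" "/dev/$VG/$LV"
    snap_rc=$?
    # Restart cyrus whether or not the snapshot succeeded.
    run service "$SERVICE" start
    [ "$snap_rc" -eq 0 ] || return 1
    # If the mount fails, the snapshot is left in place for inspection.
    run mkdir -p "$MNT" &&
    run mount -o ro "/dev/$VG/$SNAP" "$MNT" || return 1
    run rsync -a --delete "$MNT/" "$DEST"
    rsync_rc=$?
    run umount "$MNT"
    run lvremove -f "/dev/$VG/$SNAP"
    return "$rsync_rc"
}

snapshot_backup
```

Cyrus is only down for the time between the stop and start lines; the rsync runs against the snapshot afterwards. Once the printed commands look right for your system, run it with DRY_RUN=0 from cron or at.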
> If you follow the prescribed cyrus directory structure, then this can be
> simplified (Arch linux example):
>
> 1. rsync -a --delete /var/imap/user [removable disk/other server]
> 2. rsync -a --delete /var/imap [removable disk/other server]
>
> Once you've rsynced the mail files once, rsyncing them again a short
> time later should be pretty fast. There does need to be a backup
> solution for people who only have one server, hence can't use
> replication or imapsync to do backups.
There is: snapshots, or hosted mail services (like Fastmail :).
>> Lastly, as to the use of imapsync to achieve user, mailbox or server
>> replication,...
>>
>> So your command line is much like Patrick's example, but with '--user1
>> <user> --authuser1 <proxyuser> --user2 <user>...'
>> Of course you must create a proxy user, and Cyrus supports this with the
>> 'proxyservers' directive in imapd.conf (man imapd.conf for details),
>> i.e.: 'proxyservers: proxyuser'.
> Here is the imapd.conf man page entry for proxyservers:
>
> proxyservers: <none>
> A list of users and groups that are allowed to proxy for other
> users, separated by spaces. Any user listed in this will be
> allowed to login for any other user: use with caution. In a
> standard murder this option should ONLY be set on backends.
> DO NOT SET on frontends or things won't work properly.
>
> That capitalized "DO NOT SET on frontends" would seem to be cause for
> concern, especially since I don't understand how this works.
Well then, get thee to a website or man page. :-)
http://cyrusimap.web.cmu.edu/docs/cyrus-imapd/2.4.17/ag.php
No, seriously, this isn't an issue if you're not using a murder. A
"frontend" is the part of a murder aggregation cluster which proxies for
the backend servers which actually hold the mail store. A murder
consists of one or more frontends, one or more backends and a single
"mupdate" master, which controls the canonical copy of the mailboxes
database. In a murder, if one wants to set the proxyservers option, one
sets it only on the backend machines.
The proxyservers option is exactly the right way to do this.
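For illustration, an imapsync run that authenticates as the proxy user on behalf of each mailbox owner might be driven like this. It is a sketch under assumptions: the host names, the password-file paths and the sample user names are hypothetical, and the proxy account is assumed to be listed in 'proxyservers:' on both servers. It prints the commands unless DRY_RUN=0.

```shell
#!/bin/sh
# Sketch only: hosts, password files and user names are hypothetical.
DRY_RUN="${DRY_RUN:-1}"
HOST1="${HOST1:-old.example.com}"
HOST2="${HOST2:-new.example.com}"
PROXY="${PROXY:-proxyuser}"         # must be in 'proxyservers:' on both ends
PASSFILE1="${PASSFILE1:-/etc/imapsync/host1.pass}"
PASSFILE2="${PASSFILE2:-/etc/imapsync/host2.pass}"

# Sync one user's mailboxes, logging in as the proxy user on both sides.
sync_user() {
    user=$1
    set -- imapsync \
        --host1 "$HOST1" --user1 "$user" --authuser1 "$PROXY" --passfile1 "$PASSFILE1" \
        --host2 "$HOST2" --user2 "$user" --authuser2 "$PROXY" --passfile2 "$PASSFILE2"
    if [ "$DRY_RUN" = "0" ]; then
        # Keep imapsync from consuming the user list on stdin.
        "$@" </dev/null
    else
        echo "+ $*"
    fi
}

# Read user names, one per line, from stdin; blank lines are skipped.
sync_all() {
    while IFS= read -r u; do
        if [ -n "$u" ]; then
            sync_user "$u"
        fi
    done
}

printf '%s\n' alice bob | sync_all
```

No per-user passwords are needed; only the proxy user's password, kept in the two password files, is ever presented to the servers.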
> For people who are
> 1. imapsync'ing between machines both behind a firewall
> 2. using saslauthd with pam
>
> I thought of this solution: Temporarily block port 143 traffic on the
> outward facing port of your firewall, and then add the line
>
> auth sufficient pam_permit.so
>
> to the top of /etc/pam.d/imap files on both the sending and receiving
> imap servers. This should allow you to imapsync the mail stores for
> every user without having to provide passwords. Once you're done,
> simply remove these lines from the PAM configuration files and unblock
> the port on the firewall. Yes, this will mean that users won't be able
> to access their mail from outside the firewall while the imapsync is in
> operation, and this is probably only workable for smaller organizations
> where people are not concerned about their coworkers temporarily being
> able to access their mail. There could probably be a desktop policy to
> handle this as well.
Ouch, that seems a lot harder to me than setting proxyservers.
> However, you are 100% correct that replication would appear to be a far
> less complex solution. After reading through the available
> documentation, it wasn't clear to me that it was possible to do
> replication without setting up a murder, a complexity I was hoping to avoid.
>
> So, here's the feeble-mindedness component: I didn't completely follow
> your explanation for setting up a replication server. It would be
> awesome to have a howto for doing this -- is anyone aware of anything
> like this; i.e. howto set up a replication server outside the murder
> context.
Then please take a look at the replication page on the Project Cyrus
website:
http://cyrusimap.org/docs/cyrus-imapd/2.4.17/install-replication.php
Here's my earlier example with the murder components stripped out, and
some commenting added:
Both servers (note last entry):
/etc/services
lmtp 24/tcp
imap2 143/tcp
imap2 143/udp
imaps 993/tcp
imaps 993/udp
sieve 4190/tcp
csync 2005/tcp
Master server:
/etc/imapd.conf
...
##
# These configuration parameters are for the master server
# in a replication set
# The list of userids with administrative rights
admins: cyrus
##
# Replication support
# This is how the BACKEND for this host is defined
sync_host: replica.example.com
sync_authname: mailproxy
sync_password: <password>
sync_realm: <if required for your auth scheme>
# Whether to compress the replication stream, important if using WAN links
sync_compress: true
# To enable "rolling" replication, set this to TRUE
# This causes all data altering daemons, such as imapd, lmtpd, etc. to log their
# actions for replication.
sync_log: true
# Minimum interval (in seconds) between replication runs in rolling replication mode.
sync_repeat_interval: 5
# A file whose existence will cause the sync_client to stop at its next opportunity
sync_shutdown_file: /var/run/cyrus/sync_stop
...
/etc/cyrus.conf
...
SERVICES {
...
syncclient cmd="/usr/lib/cyrus/bin/sync_client -r"
...
Replica server:
## /etc/imapd.conf
...
##
# These configuration parameters are for the replica server in a
# replication cluster
# The list of userids with administrative rights
# For a replica, this must include the user with which the master
# will authenticate
admins: cyrus mailproxy
##
# Unless you're using TLS between master and replica, add this
force_sasl_client_mech: PLAIN
master_mechs: PLAIN
## /etc/cyrus.conf
...
SERVICES {
...
syncserver cmd="/usr/lib/cyrus/bin/sync_server" listen="csync"
...
Here are some extra notes:
* The webpage listed above on replication explains rolling replication
  (think "log shipping" from the DB world) as well as manual
  replication. Check that out.
* We find that it doesn't hurt to use both rolling and periodic
  replication, and have cron handle the latter.
* If the master stops listening for csync traffic, when halted for a
  snapshot, for example, then the sync_server process on the replica
  will die. So, we use a nanny cronjob to make sure that one gets
  started if none are running.
Here are our crontabs for master and replica:
Master:
### Ensure replication is up to date
30 5 * * * /usr/local/sbin/cyrus_user_sync.pl >/dev/null 2>&1
##
### Run quota check script
30 6 * * * /usr/local/sbin/quota-report >/dev/null 2>&1
##
### Update mailbox annotations
45 6 * * * /usr/local/sbin/set_cyrus_annotations.sh >/dev/null 2>&1
##
### Update quotas
*/5 * * * * /usr/local/sbin/cyrus_ldap_quota.pl >/dev/null 2>&1
Replica:
##
# ensure that the sync_client keeps running. Comment this out
# following promotion from replica to master.
@hourly /usr/local/sbin/sync_nanny.sh >/dev/null
We'll be happy to share these scripts with anyone who'd care to have a
copy, but they might be specific to our use of LDAP to manage account
details. The idea of each, however, is to leverage the account DB,
which in our case is almost always LDAP, to maintain, update or alter
the cyrus account information.
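As a rough idea of what a nanny check of this kind can look like, here is a minimal sketch. It is not the sync_nanny.sh mentioned above. The process name and restart command are assumptions taken from the sync_client entry in the cyrus.conf example; which process to watch, and how to restart it, depends on the role of the host. It prints the restart command unless DRY_RUN=0.

```shell
#!/bin/sh
# Sketch only; not the sync_nanny.sh referred to in this message.
# PROC and START_CMD are assumptions; adjust them for your host's role.
DRY_RUN="${DRY_RUN:-1}"
PROC="${PROC:-sync_client}"
START_CMD="${START_CMD:-/usr/lib/cyrus/bin/sync_client -r}"

# True if a process with exactly this name exists.
is_running() {
    pgrep -x "$1" >/dev/null 2>&1
}

# Start the process only when none is running.
nanny() {
    if is_running "$PROC"; then
        return 0
    fi
    if [ "$DRY_RUN" = "0" ]; then
        # Deliberately unquoted so the command and its arguments split.
        $START_CMD
    else
        echo "+ $START_CMD"
    fi
}

nanny
```

Like the other cyrus tools, this would normally be run from the cyrus user's crontab rather than root's.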
>> However, I must be honest and point out that if you're going to go to
>> the trouble of figuring out how to use imapsync (and possibly pay for
>> it, to boot) you may as well just set up a replica. As I've shown,
>> above, it's just not that hard.
> Imapsync is still useful for migrating individual users from one imap
> server to another. In my case, I'm migrating from a cyrus 2.3.x server
> using Berkeley db metadata files to a cyrus 2.4.x server which will be
> entirely skiplist based. Understood that you can convert db files to
> skiplists, but I feel most comfortable using imapsync for this. In this
> use case there are only a handful of users, but they all have extremely
> complex and massive mail folders.
>
> My current plan is to use imapsync for the migration and then
> replication to another dummy server for backup, assuming I can figure
> out how to set up replication.
I strongly recommend against this course of action. If you're migrating
between two boxes, which it sounds like you are, then you're much better
off rsyncing the spool data between them (once you've halted cyrus) and
then allowing cyrus to perform the necessary DB updates.
Check the Install-Upgrades page for anything else which changes between
your versions of cyrus. Since you didn't specify which 2.3.x or 2.4.x
you're using, I can't tell you what you'll need, but you'll find that
info in doc/install-upgrade.html of your version. If you're installing
from packages this may not be included, so do yourself a favor and
download a copy for reference.
As the upgrade guide states (emphasis added):
The default type for all databases is now skiplist which is very
reliable now, all the bugs are ironed out! *Because ctl_cyrusdb -r
automatically converts databases between known types, you shouldn't
need to do anything*, but if you want to keep the old defaults,
you'll need to make them explicit in your imapd.conf as follows:
duplicate_db: berkeley-nosync
ptscache_db: berkeley
statuscache_db: berkeley-nosync
tlscache_db: berkeley-nosync
You have said you want skiplist, so you needn't add those settings; just
make sure you remove any that exist if you copy your old imapd.conf file
over.
If you prefer to manually convert the DB files, you can do this with the
supplied cvt_cyrusdb tool:
$ /usr/lib/cyrus/bin/cvt_cyrusdb /tmp/annotations.db berkeley /var/lib/imap/annotations.db skiplist
or for Ubuntu
$ cyrus cvt_cyrusdb /tmp/annotations.db berkeley /var/lib/imap/annotations.db skiplist
Note that in this case, you should NOT rsync the DB files into the new
server's /var/lib/imap (or whatever your config directory is) but rather
into a holding area, like /tmp, from which you can read them for the DB
conversion.
Also, make sure you do all of this as the cyrus user, or you'll end up
with permissions problems.
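If several database files need converting, the cvt_cyrusdb call shown above can be wrapped in a loop. This is a minimal sketch: the list of files is an assumption (use whichever files are actually Berkeley DB on the old server), as are the holding and config directories. It prints the commands unless DRY_RUN=0, and in that case refuses to run as anyone but the cyrus user.

```shell
#!/bin/sh
# Sketch only: the file list and directories are assumptions.
DRY_RUN="${DRY_RUN:-1}"
CVT="${CVT:-/usr/lib/cyrus/bin/cvt_cyrusdb}"
HOLD="${HOLD:-/tmp}"                 # holding area with the old Berkeley files
CONFDIR="${CONFDIR:-/var/lib/imap}"  # new server's config directory
DBS="${DBS:-annotations.db mailboxes.db}"

convert_all() {
    # Converting as another user leaves files cyrus cannot write to.
    if [ "$DRY_RUN" = "0" ] && [ "$(id -un)" != "cyrus" ]; then
        echo "run this as the cyrus user" >&2
        return 1
    fi
    for db in $DBS; do
        if [ "$DRY_RUN" = "0" ]; then
            # cvt_cyrusdb needs full paths for both files.
            "$CVT" "$HOLD/$db" berkeley "$CONFDIR/$db" skiplist || return 1
        else
            echo "+ $CVT $HOLD/$db berkeley $CONFDIR/$db skiplist"
        fi
    done
}

convert_all
```

Run this with cyrus halted on the new server, so nothing else has the target files open during the conversion.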
Good luck!
-nic
>
> Thanks again for your helpful comments!
--
Nic Bernstein nic at onlight.com
Onlight, Inc. www.onlight.com
219 N. Milwaukee St., Suite 2a v. 414.272.4477
Milwaukee, Wisconsin 53202