painful mupdate syncs between front-ends and database server

Mon Oct 19 19:03:52 EDT 2009

On Mon, 19 Oct 2009, Michael Bacon wrote:

> --On October 19, 2009 2:13:03 PM -0700 Andrew Morgan <morgan at orst.edu> wrote:
>
>> What is causing a (re)sync of the frontends?  Normally this should only
>> happen when you start Cyrus on a frontend, right?
>
> I am not entirely sure.  I think what may be happening is that the slave 
> mupdate requests get some kind of timeout, and end up disconnecting.  As soon 
> as they reconnect, they want to re-sync.  I've upped the 
> "mupdate_retry_timeout" to 10 minutes, so most of the time, they'll only 
> time out once, then the next retry will be successful.  This solved a constant 
> re-sync issue we had early on, but apparently hasn't solved the problem 
> entirely.

>>> During these sync periods, we see two negative impacts.  The first is
>>> lockup on the mailboxes.db on the front-end servers, which slows down
>>> both accepting new IMAP/POP connections and the reception of incoming
>>> messages. (The front-ends also accept LMTP connections from a separate
>>> pair of queueing hosts, then proxy those to the back-ends.)  The second
>>> is that, because the front-ends go into a
>> 
>> A part of this paragraph was chopped off.  What else did you have to say?
>
> Sorry, must have blanked on that.  The front-ends go into a sync cycle, which 
> ties up the MUPDATE server while they download the database (which can take 
> upwards of two minutes).  This causes a similar halt on anything that was 
> responding to a mupdate "kick" on the clients, which appears to stop up a 
> decent amount of inbound mail.

Yeah, normally I take a frontend out of rotation (hardware load balancer) 
before I restart cyrus, for this very reason.

> Interesting.  We're running skiplist everywhere, after some nasty experiences 
> I've had with bdb, but that's a pretty astonishing performance difference.

We went with skiplist to avoid the hassle of Berkeley DB upgrades.

> I'm pretty sure we can solve the problem by adding additional I/O capacity to 
> the mailboxes.db on the front-ends, but it's kind of frustrating that we have 
> to.  I've considered putting those in a swap-mounted file system, but that 
> makes me a bit nervous.

I think it would be more useful to understand why your frontends need to 
resync outside of a restart.  Anything else is just a work-around.
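
One way to start narrowing that down is to count the relevant log lines 
per hour on each frontend and see whether they cluster around anything.  
A rough Python sketch, not run against real Cyrus logs: the function name 
and the search pattern are placeholders of mine, not actual Cyrus log 
messages, so substitute whatever your frontends really log when the 
mupdate connection drops.  The only thing it assumes is the classic 
syslog timestamp prefix ("Oct 19 19:03:52 host ...").

```python
import re
from collections import Counter

# Classic syslog prefix, e.g. "Oct 19 19:03:52 host ...".
# Group 1 captures month, day and hour; minutes and seconds are dropped.
STAMP = re.compile(r"^([A-Z][a-z]{2}\s+\d{1,2} \d{2}):\d{2}:\d{2}\s")


def events_per_hour(lines, pattern):
    """Count lines matching `pattern`, grouped by 'Mon DD HH'.

    `pattern` is a regular expression supplied by the caller; nothing
    here knows what a real mupdate log message looks like.
    """
    wanted = re.compile(pattern)
    counts = Counter()
    for line in lines:
        m = STAMP.match(line)
        if m and wanted.search(line):
            # Collapse the double space syslog uses for single-digit days.
            counts[" ".join(m.group(1).split())] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_events.py 'your-pattern' < /path/to/logfile
    for hour, n in sorted(events_per_hour(sys.stdin, sys.argv[1]).items()):
        print(hour, n)
```

If the busy hours line up with mail delays rather than with Cyrus 
restarts, that would at least support the timeout theory.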

 	Andy