murder setup - mailboxes.db corruption - trouble recovering with ctl_mboxlist

Wesley Craig wes at umich.edu
Thu Nov 20 10:27:01 EST 2008


On 20 Nov 2008, at 07:39, Eric G. Wolfe wrote:
> Found this:
> http://cyrusimap.web.cmu.edu/twiki/bin/view/Cyrus/CyrusMurderFailureModes
> First, I followed the "Easy" instructions, which was a bust.

Did you see this:

	https://bugzilla.andrew.cmu.edu/show_bug.cgi?id=2819

If you're not running something pretty current, Easy doesn't work.

> Next, I tried the "Hard" instructions.  Four hours later the mupdate
> master finished synch'ing with the backends.  I started up the
> front-ends, per the instructions.  The front-ends failed to synch with
> the mupdate master.

So, after Hard, you have a good copy on the mupdate master.  In what  
way do the frontends fail to sync?

> So, in an effort to try something else, I figured that if the
> mailboxes.db on the front-ends and the master are the same format, I
> could just shut down the mupdate master, copy the mailboxes.db file
> over to the front-ends, and start everything up.  This was also a bust.

Presuming you're using skiplist, you can in fact just copy  
mailboxes.db between mupdate master and frontends.  However, when  
frontends start up, they insist on getting a full copy of the  
database from mupdate master.  This can take some time, and in older  
versions might error out in various ways.
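
One way to see how far a frontend's copy differs from the mupdate
master's is to dump both databases to text (for example with
`ctl_mboxlist -d`) and compare the dumps.  The following is only a
minimal sketch: it assumes each dump line begins with the mailbox name
followed by a tab, and the helper names `load_dump` and `compare_dumps`
are hypothetical, not part of Cyrus.

```python
def load_dump(path):
    """Map mailbox name -> rest of the line for one text dump.

    Assumption: every non-empty line starts with the mailbox name,
    followed by a tab and the remaining fields.
    """
    entries = {}
    with open(path, encoding="utf-8", errors="replace") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue
            name, _, rest = line.partition("\t")
            entries[name] = rest
    return entries


def compare_dumps(master, frontend):
    """Return (only on master, only on frontend, present on both but different)."""
    only_master = sorted(master.keys() - frontend.keys())
    only_frontend = sorted(frontend.keys() - master.keys())
    differing = sorted(
        name
        for name in master.keys() & frontend.keys()
        if master[name] != frontend[name]
    )
    return only_master, only_frontend, differing
```

Calling `compare_dumps(load_dump("master.txt"), load_dump("frontend.txt"))`
(both file names are placeholders) gives three sorted lists of mailbox
names, which may help show in what way a frontend is out of sync.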

> Oh, and I am getting these logs on the mupdate master.  However, the
> number in fs.file-nr is nowhere near approaching fs.file-max.  There are
> no ulimits on the 'cyrus' user.  There was a maxfds=1024 parameter in
> /etc/imapd.conf.  I tried restarting without this parameter, and it
> seemed I couldn't keep the master process running without it.  If I
> restart the service, it will run fine for a while, but it eventually
> starts complaining again.  So I tried quadrupling the maxfds value, and
> we'll see if that helps.
>
> imapd.conf (excerpt)
> mupdate       cmd="/usr/lib64/cyrus-imapd/mupdate -m" listen=3905 prefork=1 maxfds=1024
>
> maillog (excerpt)
> Nov 20 07:18:54 mumailmaster mupdate[27227]: refused connection from mumailstore01
> Nov 20 07:18:54 mumailmaster mupdate[27227]: warning: cannot open /etc/hosts.allow: Too many open files

The high connection rate is caused by mail delivery.  Stock lmtp  
proxy connects to the mupdate master to get backend information,  
instead of referring to the local mailboxes.db.  I have patches for
2.2.x cyrus; in 2.3.x cyrus, "unified" murder refers to mailboxes.db
instead of the mupdate master.  The fact that lmtp proxy refers to the
mupdate master in any configuration is probably a bug.

With a large mail backlog, plus new inbound mail, this bottleneck is
a big problem.  Couple that with trying to resync the frontends, and
the mupdate master becomes an even tighter bottleneck.
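
On the "Too many open files" message in the quoted log: that text is
the system error string for EMFILE, the per-process descriptor limit.
Exhausting the system-wide table (fs.file-max) produces ENFILE, "Too
many open files in system", instead.  That would be consistent with
fs.file-nr sitting well below fs.file-max while mupdate still runs out
of descriptors.  A minimal, Linux-only sketch for checking how close a
process is to its own limit follows; the helper names are hypothetical,
and it reads the standard /proc/<pid>/fd and /proc/<pid>/limits entries.

```python
import os


def parse_soft_nofile(limits_text):
    """Extract the soft 'Max open files' limit from /proc/<pid>/limits text.

    Returns None when the limit is reported as 'unlimited'.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Fields: Max, open, files, <soft>, <hard>, <units>
            soft = line.split()[3]
            return None if soft == "unlimited" else int(soft)
    raise ValueError("no 'Max open files' line found")


def fd_usage(pid="self"):
    """Return (open descriptor count, soft limit) for a Linux process."""
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    with open(f"/proc/{pid}/limits") as f:
        return open_fds, parse_soft_nofile(f.read())
```

Running `fd_usage(27227)` as root or as the process owner (27227 being
the mupdate pid from the quoted log) would show the descriptor count
next to the soft limit actually in effect for that process.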

> Additionally, I have double-checked all cyrus related service accounts
> and their associated passwords.  Our mupdate service account is
> successfully authenticating on the mupdate master.  I am getting a
> "imap: kick_mupdate: can't connect to target: Connection refused" on
> the front-ends.  However, I can connect to port 3905 on the mupdate
> master.

The kick_mupdate error is just a signal that the mupdate on the  
frontend is in the process of resyncing.  It can be ignored.

:wes

