murder setup - mailboxes.db corruption - trouble recovering with ctl_mboxlist
Wesley Craig
wes at umich.edu
Thu Nov 20 10:27:01 EST 2008
On 20 Nov 2008, at 07:39, Eric G. Wolfe wrote:
> Found this:
> http://cyrusimap.web.cmu.edu/twiki/bin/view/Cyrus/CyrusMurderFailureModes
> First, I followed the "Easy" instructions, which was a bust.
Did you see this:
https://bugzilla.andrew.cmu.edu/show_bug.cgi?id=2819
If you're not running something pretty current, Easy doesn't work.
> Next, I tried the "Hard" instructions. Four hours later the mupdate
> master finished synch'ing with the backends. I started up the
> front-ends, per the instructions. The front-ends failed to synch with
> the mupdate master.
So, after Hard, you have a good copy on the mupdate master. In what
way do the frontends fail to sync?
> So, in an effort to try something else, I figured that if the
> mailboxes.db on the front-ends and the master are the same format, I
> could just shut down the mupdate master, copy the mailboxes.db file
> over to the front-ends, and start everything up. This was also a bust.
Presuming you're using skiplist, you can in fact just copy
mailboxes.db between mupdate master and frontends. However, when
frontends start up, they insist on getting a full copy of the
database from mupdate master. This can take some time, and in older
versions might error out in various ways.
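One way to check that a copied mailboxes.db matches the master is to compare text dumps taken on both hosts. The sketch below assumes each database has already been dumped to a text file with ctl_mboxlist -d, and that every dump line is tab-separated with the mailbox name in the first field. That layout is an assumption and may differ between Cyrus versions. The function names and file arguments are hypothetical.

```python
import sys


def parse_dump(lines):
    """Map mailbox name -> rest of the dump line.

    Assumes (unverified) that the name is the first tab-separated field.
    """
    entries = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        name, _, rest = line.partition("\t")
        entries[name] = rest
    return entries


def compare_dumps(master_lines, frontend_lines):
    """Return (missing, extra, differing) mailbox names, each sorted."""
    master = parse_dump(master_lines)
    frontend = parse_dump(frontend_lines)
    missing = sorted(set(master) - set(frontend))    # on master only
    extra = sorted(set(frontend) - set(master))      # on frontend only
    differing = sorted(
        name for name in set(master) & set(frontend)
        if master[name] != frontend[name]
    )
    return missing, extra, differing


# Hypothetical usage: python compare.py master.dump frontend.dump
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as m, open(sys.argv[2]) as f:
        missing, extra, differing = compare_dumps(m, f)
    print("missing on frontend:", len(missing))
    print("extra on frontend:", len(extra))
    print("differing entries:", len(differing))
```

If all three lists come back empty, the copy is at least consistent at the dump level, although the frontend will still request a full copy from the mupdate master at startup.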
> Oh, and I am getting these logs on the mupdate master. However, the
> number in fs.file-nr is nowhere near approaching fs.file-max. There
> are no ulimits on the 'cyrus' user. There was a maxfds=1024 parameter
> in /etc/imapd.conf. I tried restarting without this parameter, and it
> seemed I couldn't keep the master process running without it. If I
> restart the service, it will run fine for a while, but it eventually
> starts complaining again. So I tried quadrupling the maxfds value, and
> we'll see if that helps.
>
> imapd.conf (excerpt)
> mupdate cmd="/usr/lib64/cyrus-imapd/mupdate -m" listen=3905 prefork=1 maxfds=1024
>
> maillog (excerpt)
> Nov 20 07:18:54 mumailmaster mupdate[27227]: refused connection from mumailstore01
> Nov 20 07:18:54 mumailmaster mupdate[27227]: warning: cannot open /etc/hosts.allow: Too many open files
The high connection rate is caused by mail delivery. The stock lmtp
proxy connects to the mupdate master to get backend information,
instead of referring to the local mailboxes.db. I have patches for
2.2.x cyrus; in 2.3.x cyrus, a "unified" murder refers to mailboxes.db
instead of the mupdate master. The fact that lmtp proxy refers to the
mupdate master in any configuration is probably a bug.
With a large mail backlog, plus new inbound mail, this bottleneck is
a big problem. Couple that with trying to resync the frontends, and
the mupdate master becomes an even tighter bottleneck.
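The "Too many open files" text in the quoted log is normally the message for a per-process descriptor limit (EMFILE), so the system-wide fs.file-nr count can look healthy while a single mupdate process is out of descriptors. A minimal sketch for checking how close one process is to its own limit, assuming a Linux /proc filesystem and enough privilege to read the target process's entries. The function names are hypothetical.

```python
import os


def parse_soft_nofile(limits_text):
    """Return the soft 'Max open files' limit from /proc/<pid>/limits text.

    Returns None if the limit is 'unlimited' or the row is absent.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns: "Max open files  <soft>  <hard>  files"
            soft = line.split()[3]
            return None if soft == "unlimited" else int(soft)
    return None


def fd_usage(pid, proc_root="/proc"):
    """Return (descriptors in use, soft limit) for one process."""
    base = os.path.join(proc_root, str(pid))
    in_use = len(os.listdir(os.path.join(base, "fd")))
    with open(os.path.join(base, "limits")) as f:
        limit = parse_soft_nofile(f.read())
    return in_use, limit


# Demonstrates the call on the current process; for mupdate, pass its pid
# (reading another user's /proc entries usually requires root).
if __name__ == "__main__" and os.path.isdir("/proc/self/fd"):
    in_use, limit = fd_usage("self")
    print("open descriptors:", in_use, "soft limit:", limit)
```

Watching the in-use count climb toward the soft limit during a delivery backlog would be consistent with the connection pressure described above.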
> Additionally, I have double-checked all cyrus related service accounts
> and their associated passwords. Our mupdate service account is
> successfully authenticating on the mupdate master. I am getting a
> "imap: kick_mupdate: can't connect to target: Connection refused" on
> the front-ends. However, I can connect to port 3905 on the mupdate
> master.
The kick_mupdate error is just a signal that the mupdate on the
frontend is in the process of resyncing. It can be ignored.
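To keep the ignorable kick_mupdate messages from hiding the ones that matter, the maillog can be triaged by pattern. A rough sketch that uses only the three message texts quoted in this thread. The category labels and the function name are hypothetical.

```python
import re
from collections import Counter

# Patterns are taken from the log lines quoted in this thread.
PATTERNS = [
    ("fd_exhaustion", re.compile(r"Too many open files")),
    ("refused", re.compile(r"mupdate\[\d+\]: refused connection from (\S+)")),
    ("resync_noise", re.compile(r"kick_mupdate: can't connect to target")),
]


def triage(lines):
    """Count log lines per category and refused connections per host."""
    counts = Counter()
    refused_hosts = Counter()
    for line in lines:
        for label, pattern in PATTERNS:
            match = pattern.search(line)
            if match:
                counts[label] += 1
                if label == "refused":
                    refused_hosts[match.group(1)] += 1
                break  # first matching category wins
    return counts, refused_hosts
```

Lines counted as resync_noise can be skipped while a frontend is resyncing. A rising fd_exhaustion or refused count on the mupdate master is the part worth acting on.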
:wes