problematic upgrade 2.3.16 -> 2.4.3

Paul Dekkers Paul.Dekkers at surfnet.nl
Wed Nov 10 15:17:02 EST 2010


Hi,

Here is a specific example of a situation I don't know how to solve.

The user "mm" is not logged in. The mailbox exists:

# sudo -u cyrus /usr/local/cyrus/bin/ctl_mboxlist -d | grep user.mm
user.mm 0 default mm    lrswipcda
user.mm.Drafts  0 default mm    lrswipcda
user.mm.Junk    0 default mm    lrswipcda
user.mm.Sent    0 default mm    lrswipcda
user.mm.Trash   0 default mm    lrswipcda
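
A minimal sketch, assuming the dump lines keep the shape shown above (mailbox name, type, partition, then identifier/rights pairs), of how the listing could be turned into records for scripting. The function name `parse_mboxlist_line` is hypothetical, and whitespace splitting is an assumption that would break on mailbox names containing spaces.

```python
def parse_mboxlist_line(line):
    """Split one dump line into name, type, partition and an ACL mapping."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"unexpected mailbox list line: {line!r}")
    name, mbtype, partition = fields[:3]
    # Assumption: the remaining fields alternate identifier, rights.
    acl = dict(zip(fields[3::2], fields[4::2]))
    return {"name": name, "type": int(mbtype), "partition": partition, "acl": acl}


# Lines copied from the ctl_mboxlist -d output above.
dump = """\
user.mm 0 default mm    lrswipcda
user.mm.Drafts  0 default mm    lrswipcda
user.mm.Junk    0 default mm    lrswipcda
user.mm.Sent    0 default mm    lrswipcda
user.mm.Trash   0 default mm    lrswipcda
"""

mailboxes = [parse_mboxlist_line(l) for l in dump.splitlines() if l.strip()]
for box in mailboxes:
    print(box["name"], box["partition"], box["acl"])
```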

The user fails to log in (and lmtp fails):

imap[95676]: Failed to append cache to user.mm for 3568
imap[95676]: Index upgrade failed: user.mm
imap[95676]: IOERROR: locking index user.mm: No such file or directory

And a reconstruct fails, with -r and with -rf or -G:

reconstruct[95674]: reconstructing user.mm
reconstruct[95674]: Failed to append cache to user.mm for 3568
reconstruct[95674]: Index upgrade failed: user.mm
reconstruct[95674]: IOERROR: locking index user.mm: No such file or directory
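
To see how many mailboxes are affected, the "Failed to append cache" lines can be collected from the logs. The following is a rough sketch, not a Cyrus tool: the pattern is taken only from the log excerpts in this thread, the name `failed_appends` is hypothetical, and treating the trailing number as a message UID is an assumption.

```python
import re
from collections import defaultdict

# Pattern derived from the excerpts in this thread. Real syslog lines also
# carry a timestamp and hostname, so the pattern is searched, not anchored.
APPEND_FAIL = re.compile(
    r"(?P<service>\w+)\[(?P<pid>\d+)\]: "
    r"Failed to append cache to (?P<mailbox>\S+) for (?P<uid>\d+)"
)


def failed_appends(lines):
    """Map each mailbox to the sorted numbers reported in append failures."""
    failures = defaultdict(set)
    for line in lines:
        match = APPEND_FAIL.search(line)
        if match:
            # Assumption: the trailing number is the message UID.
            failures[match.group("mailbox")].add(int(match.group("uid")))
    return {mailbox: sorted(uids) for mailbox, uids in failures.items()}


sample = [
    "imap[95676]: Failed to append cache to user.mm for 3568",
    "imap[95676]: Index upgrade failed: user.mm",
    "reconstruct[95674]: Failed to append cache to user.mm for 3568",
    "lmtp[94506]: Failed to append cache to user.bla for 747",
]

for mailbox, uids in failed_appends(sample).items():
    print(mailbox, uids)
```

Duplicate reports for the same mailbox and number (here from both imap and reconstruct) are collapsed into one entry.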

The file 3568 does exist on disk. Reconstruct dumps core.

Reconstruct tries to create the new index, but there's hardly anything
in it. Look at the size of cyrus.index.NEW:

-rw-------   1 cyrus  cyrus  3929540 Nov 10 21:03 cyrus.cache
-rw-------   1 cyrus  cyrus   175656 Nov 10 04:00 cyrus.expunge
-rw-------   1 cyrus  cyrus      165 Feb 10  2010 cyrus.header
-rw-------   1 cyrus  cyrus    45504 Nov 10 18:47 cyrus.index
-rw-------   1 cyrus  cyrus      896 Nov 10 21:03 cyrus.index.NEW
-rw-------   1 cyrus  cyrus  2365296 Nov 10 02:01 cyrus.squat
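
The same check could be scripted across many mailbox directories. This is only a sketch: the helper names and the 10% threshold are hypothetical, and since the index format changed between 2.3 and 2.4 a size difference alone proves nothing. It merely flags directories worth a closer look.

```python
from pathlib import Path


def index_sizes(mailbox_dir):
    """Return (old_size, new_size) in bytes for cyrus.index and cyrus.index.NEW.

    A file that does not exist is reported as None.
    """
    base = Path(mailbox_dir)

    def size(name):
        path = base / name
        return path.stat().st_size if path.is_file() else None

    return size("cyrus.index"), size("cyrus.index.NEW")


def looks_truncated(old_size, new_size, ratio=0.1):
    """Heuristic: True if the .NEW file is smaller than `ratio` of the old index.

    The ratio is an arbitrary illustration value, not a Cyrus constant.
    """
    if old_size is None or new_size is None:
        return False
    return new_size < old_size * ratio


# Sizes taken from the directory listing above.
print(looks_truncated(45504, 896))
```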

I can't really figure out from running truss on reconstruct why it fails; it
looks like it's just going through the messages with utimes and fstat.

Moving the metadata from ZFS to UFS didn't help, and neither did moving
the messages to UFS, so I suppose it's not ZFS-related. It keeps dumping
core.

Any idea?

(Can I downgrade? I just need to reconstruct everything, right, because
the index format changed? :-S)

Regards,
Paul


On 10-11-10 20:49, Paul Dekkers wrote:
> Hi,
> 
> I intentionally waited a few 2.4-releases to have the first dirty bugs
> smashed ;-)
> 
> In a small test-setup everything was fine. But on a box with actual
> users on it, I seem to have some more problems :-(
> 
> The machine runs FreeBSD 8.1 (64-bit) with Cyrus from the port. I'm
> using ZFS for the imap and metadata partitions, my /var/imap is on UFS.
> This was all fine with 2.3.16.
> 
> Initially I had problems with my own mail. A sync_client seemed to stall
> on my mailbox, and truss told me it was waiting for a lock:
> 
> open("/var/imap/lock/user/paul.lock",O_RDWR|O_CREAT|O_TRUNC,0666) = 4 (0x4)
> 
> I didn't really get why. Quitting my 2 Thunderbird sessions solved that.
> 
> Also, a pop3 session took forever while this lock was held. I had imap
> clients open to my INBOX; I noticed in the logs that the authentication
> succeeded, but after PASS it took forever (so I quit the telnet session).
> 
> I see a lot of other disturbing errors, like reconstruct dumping core,
> and all kinds of IOERRORs:
> 
> lmtp[94506]: Failed to append cache to user.bla for 747
> lmtp[94506]: Index upgrade failed: user.bla
> lmtp[94506]: IOERROR: locking index user.bla: No such file or directory
> master[93701]: process 94506 exited, signaled to death by 6
> 
> And another user now gives this error:
> 
> imap[95315]: Failed to append cache to user.astrid.Junk for 35
> imap[95315]: Index upgrade failed: user.astrid.Junk
> imap[95315]: IOERROR: locking index user.astrid.Junk: Bad file descriptor
> 
> Not sure what 35 means anyway; there's no file with that name.
> 
> Reconstruct dumped core for another user:
> 
> reconstruct[94584]: reconstructing user.bla
> reconstruct[94584]: Failed to append cache to user.bla for 747
> reconstruct[94584]: Index upgrade failed: user.bla
> reconstruct[94584]: IOERROR: locking index user.bla: No such file or directory
> kernel: pid 94584 (reconstruct), uid 60: exited on signal 6
> 
> ... another folder after reconstruction:
> 
> imaps[94936]: IOERROR: invalid cache record for user.paul.Sent uid 180 (System I/O error)
> 
> That's strange.
> 
> And now my sync_client is complaining with:
> 
> sync_client[94963]: Fatal error: waitpid failed
> 
> ARGH! Any advice? Could these problems be FreeBSD-related?
> 
> Sounds like a huge reconstruct might be worth it; I'm still trying to
> figure out if the reconstruct -G -r user.paul I did on my inbox solved
> most issues for my user.
> 
> Regards,
> Paul
> 
> P.S. I think there was one cosmetic issue. I reconstructed one user's
> mailbox, and after restarting the sync_client, on my replica, I noticed:
> syncserver[35643]: Deleted mailbox user.hmm
> ... while this mailbox was not deleted, and fortunately it was properly
> synced.


