upgraded to 2.4.0 today (production)

Bron Gondwana brong at fastmail.fm
Thu Oct 14 22:08:24 EDT 2010

On Thu, Oct 14, 2010 at 08:35:33PM -0400, Carsey, Robert wrote:
> Hello..
> One of my sysadmins decided to upgrade one of our Cyrus servers today.  We ran into an unexpected problem (though manageable).
> I'm now well aware of the index upgrade, and it appears that when a user logs into the now-upgraded 2.4.0 server, it upgrades the index for each folder the user accesses.  The process seems to be similar to (if not exactly) running 'reconstruct -s' on the mailbox which was accessed.  It's not instantaneous, and it does take some time to complete.

Yes, it certainly does.  The IO hit can be a bit high, but it's a one-off
and then your mailboxes are consistent.

> We failed to consider that the ~1200 concurrent connections we usually have to this server are mostly from BlackBerries, iPhones, etc.  When the 2.4.0 server came back online, we got clobbered by everyone at the same time.  The system load rose to ~900, as the disk utilization was pegged at 100%: all reads from the message storage partition, and writes to the metadata partition.

Hmm... perhaps we should have made that gotcha clearer.  It does hit pretty
hard and you can't do it bit by bit - it's a full server upgrade!

> At least, this is what I believe to be happening..  so currently I've turned off access to the server (firewall) and am running 'reconstruct -s' on the inboxes.  I'm hoping after that's finished in a few hours, I can let everyone back on and let the system index-upgrade the rest of the folders on-demand.

Yeah, that's fine - you don't even need to "reconstruct -s", you can just
select them all.
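
To illustrate "just select them all", here is a minimal sketch using
Python's standard imaplib.  It is an assumption-laden example, not a Cyrus
tool: the host name and credentials in the usage comment are hypothetical,
parse_list_line, quote_mailbox and select_all are names made up for this
example, and it assumes you log in as an account that can see every mailbox
you want touched.  The pause between SELECTs is only there so the walk
doesn't peg the disks the way the reconnecting clients did.

```python
import imaplib
import re
import time

# A LIST response line looks like: (\HasNoChildren) "." "INBOX.Sent Items"
_LIST_RE = re.compile(rb'^\((?P<flags>[^)]*)\) (?P<delim>"[^"]*"|NIL) (?P<name>.+)$')


def parse_list_line(line):
    """Return (flags, mailbox_name) for one LIST response line, or None."""
    m = _LIST_RE.match(line)
    if m is None:
        return None
    flags = m.group("flags").decode().split()
    name = m.group("name").decode()
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        # Strip the surrounding quotes and undo IMAP quoted-string escapes.
        name = name[1:-1].replace('\\"', '"').replace('\\\\', '\\')
    return flags, name


def quote_mailbox(name):
    """Quote a mailbox name so names containing spaces survive SELECT."""
    return '"' + name.replace('\\', '\\\\').replace('"', '\\"') + '"'


def select_all(conn, pause=0.1):
    """SELECT every selectable mailbox visible to conn; return the names opened."""
    typ, data = conn.list()
    touched = []
    for line in data:
        if not isinstance(line, bytes):
            continue  # skip None (no mailboxes) and literal tuples
        parsed = parse_list_line(line)
        if parsed is None:
            continue
        flags, name = parsed
        if any(f.lower() == "\\noselect" for f in flags):
            continue
        typ, _ = conn.select(quote_mailbox(name))
        if typ == "OK":
            touched.append(name)
        time.sleep(pause)  # crude throttle to spread out the IO
    return touched


# Usage (hypothetical host and credentials, not run here):
#   conn = imaplib.IMAP4_SSL("imap.example.com")
#   conn.login("admin", "secret")
#   print(len(select_all(conn, pause=0.2)), "mailboxes selected")
#   conn.logout()
```

Raising the pause trades a longer walk for a gentler load on the disks.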

> RH5.5 64-bit, 700GB mail, 8900 accounts, 55,000 mailboxes, 8GB mem, 2x 2-core Intel 2.8GHz, ~300 IO/s on disks.

What sort of disks are they?  That's going to be the killer.  Of course the
real question is how fragmented the data is.  Under the hood, that means "are
these lots of tiny emails or fewer big ones?"

But yeah, upgrading overnight or something is good!  You can then run the
"upgrade all mailboxes" task with users still able to log in as well.


