Another 2.4 upgrade horror story

Tue Sep 25 08:28:03 EDT 2012

On Tue, September 25, 2012 2:01 pm, Sebastian Hagedorn wrote:
> Hi,
>
> about three weeks ago we upgraded our Cyrus installation from 2.3.x to 2.4.16.
> We were aware of the reindexing issue, so we took precautionary
> measures, but they didn't help a lot. We've got about 7 TB of mail data for
> almost 200,000 mailboxes. We did the upgrade on a Sunday and had told our
> users that mail access wouldn't be possible for the whole day. After the
> actual software upgrade we ran distributed scripts that triggered the index
> upgrades. We started with the largest mailboxes. The idea was that after those
> that took the longest had been upgraded, the rest should be OK overnight and
> early Monday. However, even though our storage infrastructure was kept at 99%
> I/O saturation, progress was much slower than anticipated.
>
> Ultimately the server was virtually unusable for the whole Monday and
> parts of Tuesday. The last mailbox was finally upgraded on Thursday, although
> on Wednesday most things were already working normally.
>
> I realize that some of our problems were caused by infrastructure that's
> not up to current standards, but nonetheless I would really urge you to never
> again use an upgrade mechanism like that. Give admins a chance to upgrade
> indexes in the background and over time.

+1

Sebastian,

Thank you for sharing your experiences.

As a site that wants, and needs, to upgrade from 2.3.16 to 2.4.X this fall,
we are interested in learning about your storage backend characteristics.

What read and write IOPS rates did you observe before, during, and after
your upgrade process?

I would understand if you were reluctant to share this information in a
public forum. No offence taken whatsoever!

Kind regards,
Eric Luyten, Computing Centre VUB/ULB,     Eric.Luyten at vub.ac.be