Selection of most fitting partition/backend upon account creation

Bron Gondwana brong at fastmail.fm
Thu Dec 16 17:39:20 EST 2010


On Thu, Dec 16, 2010 at 07:20:09PM +0100, Julien Coloos wrote:
> On our platforms we also give more quota to all users than what we
> could physically store. Sometimes not all users have the same quota
> limit: either because some subscribed to a specific service, or
> because some backends are shared between clients.
> To precisely determine how much space would be left if all users would
> fill their mailbox, we would need to retrieve the quota usage of all
> those mailboxes. Unfortunately we still have legacy quota db (that is:
> one file per mailbox) on most platforms ... and migrating to other db
> formats is not always possible since some clients are very picky about
> the actions we do on 'their' platform ;)

To tell you the truth, I like the "legacy" quota DB.  It reduces database
contention for quota updates.  That said, it would be pretty trivial to
perform an in-place upgrade of quota data if we wanted to build something
good enough.

> Considering some backends do host hundreds of thousands of mailboxes,
> determining the quota usage of all users would be quite time consuming
> for us :)

Well, yeah - it's a bit of IO.  I don't know what sort of disk you have
those files on - we've got either SSD or fast RAID1 drives for our meta
partitions, so it's not too expensive to read a hundred thousand mailboxes
(on the order of a few minutes).

That's too expensive to run PER-MAILBOX.  But running it once every 4 hours
or something to update a stats DB wouldn't be out of the question.
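
A minimal sketch of such a periodic scan, in Python.  It is not Cyrus code.
It assumes one quota file per mailbox somewhere under a single root
directory, and it assumes each file holds two whitespace-separated numbers:
bytes used, then the limit in KiB.  That layout, and the names
`scan_quota_tree`, `read_quota_file` and `QuotaStats`, are assumptions made
for illustration; adjust the parser to whatever the files really contain.

```python
import os
from dataclasses import dataclass


@dataclass
class QuotaStats:
    """Totals for one scan of a quota directory tree."""
    mailboxes: int = 0
    used_bytes: int = 0
    limit_bytes: int = 0


def read_quota_file(path):
    """Return (used_bytes, limit_bytes) for one quota file.

    ASSUMPTION: the file holds two numbers, bytes used and then the
    limit in KiB.  This layout is illustrative, not verified.
    """
    with open(path) as fh:
        fields = fh.read().split()
    used = int(fields[0])
    limit_kib = int(fields[1])
    return used, limit_kib * 1024


def scan_quota_tree(root):
    """Walk every file under root and sum usage and limits."""
    stats = QuotaStats()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                used, limit = read_quota_file(os.path.join(dirpath, name))
            except (OSError, ValueError, IndexError):
                continue  # skip unreadable or malformed files
            stats.mailboxes += 1
            stats.used_bytes += used
            stats.limit_bytes += limit
    return stats
```

Run from cron every few hours, the returned totals could be written to
whatever stats DB the account-creation code consults, so the expensive walk
never happens on the creation path itself.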

> Thus we currently still rely on many tools to watch backends usage and
> perform account moving between backends.

Yeah - any big site is going to want its own monitoring toolkit anyway.

> For the time being we will focus on the code we currently have.
> But there could be another selection mode that may make sense: if we
> consider that backends tend to be homogeneous (as far as quota usage
> is concerned), then comparing the average available space per mailbox
> on each partition could also fit some needs. Though that would still
> require to determine the number of mailboxes on the concerned
> partition.

That's probably site specific to you - we mix and match all over the place.
We try to get a balance of different user types on each "store", as we call
them, to balance out the IO patterns.  (A store is a pair of "slots" in a
replication config - there might be up to 40 of them on a single machine,
replicating to different locations.)
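
For sites where backends really are homogeneous, the selection mode from the
quoted text (compare the average available space per mailbox on each
partition) could be sketched as follows.  This is an illustration in Python,
not the patch under discussion; `pick_partition` and its input shape are
hypothetical, and the free-space and mailbox counts are assumed to come from
a stats source that is refreshed elsewhere.

```python
def pick_partition(partitions):
    """Return the partition name with the most free bytes per mailbox.

    partitions maps a partition name to (free_bytes, mailbox_count).
    An empty partition is treated as having one mailbox so that the
    division is defined and the partition still ranks by its free space.
    """
    def avg_free(item):
        _name, (free_bytes, mailbox_count) = item
        return free_bytes / max(mailbox_count, 1)

    best_name, _best_stats = max(partitions.items(), key=avg_free)
    return best_name
```

With `{"p1": (1000, 10), "p2": (900, 3)}` this picks `p2`: it has less free
space in total, but more of it per existing mailbox.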
 
Bron.

