MESSAGE quota resource implementation

Fri Sep 2 06:03:49 EDT 2011

On Fri, Sep 02, 2011 at 07:36:20PM +1000, Greg Banks wrote:
> If the software was robust, underflow would not happen and we would not
> need to test for it and handle it.  Thus the log messages are not
> operational messages intended for the sysadmin, but warnings about
> internal Cyrus problems intended for Cyrus developers, and syslog is a
> suitable place for them.

In theory, yes - assuming you can keep blackbox control over all the
filesystems that Cyrus is operating over, and the sysadmin never restores
from backup or otherwise screws with any of the underlying files.

> How's about this for a strategy?
> 
> When a quota resource is first enabled, (i.e. the limit is changed from
> UNLIMITED to some finite value), the usage is stored as some special
> value which I'll call INDETERMINATE.

What about 'getquota'?  I don't support any solution which leaves getquota
returning bogus values or failing to respond.  That's just icky and
confusing.

I don't think you can avoid two passes, and I don't even think you can
avoid two values during the update if you really want to be good about it.

Anyway - moving a new folder into a quotaroot is NOT racy.  You just need
to read quota_mailbox_used on the mailbox, then lock the old quota root,
subtract N bytes and unlock it - then lock the NEW quota root, add N bytes
and unlock it again.  No problem.
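
A minimal sketch of that sequence, in Python rather than Cyrus's C.  The
QuotaRoot class and move_mailbox() are hypothetical names made up for
illustration; only quota_mailbox_used comes from the discussion above.

```python
import threading


class QuotaRoot:
    """Hypothetical stand-in for a quota root: a usage counter plus a lock."""

    def __init__(self, name, used=0):
        self.name = name
        self.used = used
        self.lock = threading.Lock()


def move_mailbox(quota_mailbox_used, old_root, new_root):
    """Move N bytes of usage from old_root to new_root.

    The two locks are never held at the same time, so there is no
    lock-ordering problem between the roots.
    """
    n = quota_mailbox_used            # read once from the mailbox
    with old_root.lock:               # lock old root, subtract, unlock
        old_root.used -= n
    with new_root.lock:               # lock new root, add, unlock
        new_root.used += n


old = QuotaRoot("user.fred", used=1000)
new = QuotaRoot("shared", used=200)
move_mailbox(300, old, new)
```

After the call, old.used is 700 and new.used is 500.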

The only issue is updating WITHOUT changing the quotaroot, which is an
issue because a particular mailbox doesn't know whether it's already been
counted in the new quota value or not, and so whether it should be updating
that value.

There are pure ways to do this, that guarantee consistency.  I think the best
way is probably some sort of A/B thing, where you label the quotaroot as A
or B in the mailbox - AND in the quota root.  So the initial state looks like
this:

ROOT: A
A: $usage
B: INVALID

When you want to run a quota -f you set 'B' to zero, and then run the update
logic over all mailboxes, adding each mailbox's usage to B as you go, and
setting the quotaroot in the mailbox to be in state 'B', so it also updates B.
Any mailbox in state 'B' will update both A AND B, because the root is still
in state A.  Mailboxes in state 'A' will only update 'A', because they match
the root.

When you have finished quota -f, both values are being updated simultaneously
by all mailboxes.  You also have two fields which you can compare, and a
guarantee that they were both atomically updated, so if they're not the same
then there was definitely corruption, not just a race condition.  So you can
report that.

Then you update the quota root to say 'ROOT: B', and mark 'A' INVALID.  Or
even just A: zero.  If anything continues to update the wrong field then you
also have corruption (probably a mailbox outside the quotaroot pointing to
it, which is pretty silly).
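
To make the A/B idea concrete, here is a rough single-process model in
Python.  Every name in it (QuotaRoot, Mailbox, begin_fix, scan_mailbox,
finish_fix) is hypothetical, and it assumes each step would run under the
quota root lock in a real server; it is a sketch of the scheme described
above, not Cyrus code.

```python
INVALID = None  # marker for a slot that nothing should be updating


class QuotaRoot:
    def __init__(self, usage):
        self.active = "A"                        # ROOT: A
        self.usage = {"A": usage, "B": INVALID}  # A: $usage, B: INVALID


class Mailbox:
    def __init__(self, root, used):
        self.root = root
        self.used = used
        self.state = root.active   # which slot this mailbox is counted in

    def change_usage(self, delta):
        # Update the root's active slot, plus our own slot if it differs.
        slots = {self.root.active, self.state}
        for slot in slots:
            if self.root.usage[slot] is INVALID:
                raise RuntimeError("mailbox updating INVALID slot " + slot)
        self.used += delta
        for slot in slots:
            self.root.usage[slot] += delta


def begin_fix(root):
    """Start a 'quota -f' style pass: zero the inactive slot."""
    new = "B" if root.active == "A" else "A"
    root.usage[new] = 0
    return new


def scan_mailbox(mbox, new):
    """Count one mailbox into the new slot and move it to that state."""
    mbox.root.usage[new] += mbox.used
    mbox.state = new


def finish_fix(root, new):
    """Compare both slots, then flip the root.  False means corruption."""
    old = root.active
    consistent = root.usage[old] == root.usage[new]
    root.active = new
    root.usage[old] = INVALID
    return consistent


root = QuotaRoot(30)
m1, m2 = Mailbox(root, 10), Mailbox(root, 20)
new = begin_fix(root)
scan_mailbox(m1, new)
m1.change_usage(5)   # already scanned: updates both A and B
m2.change_usage(1)   # not yet scanned: updates A only
scan_mailbox(m2, new)
ok = finish_fix(root, new)
```

In this run both slots end up at 36 before the flip, so ok is True even
though the mailboxes changed in the middle of the pass.  A mailbox left in
state 'A' after the flip raises on its next update, which is the "something
is still updating the wrong field" case.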

That's a real, robust solution.  But it's pretty heavy engineering.

> As far as I can see the only point of that first append_check() call is
> to fail early in the case of a permission fail.

It's nice to fail before the client starts uploading.  Failing any later
than that is kinda pointless, because you're not saving the client
bandwidth - which may matter.  The disk IO is unlikely to matter to the
server, since it's a rare case.

> convinced I got it very wrong, and should have left all the quota
> updating in mailbox_commit_quota().  I was trying hard to avoid adding a
> field to the index header to track the storage used by all the
> annotations for the mailbox and for messages in the mailbox; but I'm
> really not happy with the results :(

Well, we're not committed to keeping it that way - it's not as if it's
even "in the wild" except for some really early adopters of the master
branch, who deserve whatever pain they get (mostly, that's us - and we
know enough to be able to clean up any mess).

Bron.