quota_db recommendations
Raphael Jaffey
rjaffey at artic.edu
Wed Jan 7 18:12:00 EST 2009
I've encountered a bug in the 2.3.13 IMAP server running on RHEL AS
4.6, which appears to be related to the use of the default 'quota_db:
quotalegacy' and certain patterns within the user namespace.
When a branch of the quota hierarchy contains a large number of
entries whose usernames share a common root, and a new entry is
created whose username matches that common root exactly, the user
matching the common root cannot be replicated and that user's inbox
cannot be deleted.
Example:
There are 50 users ranging from aa0 to aa49, each with a quota on
their inbox (mailboxes user.aa0, user.aa1, etc., and
quota/a/user.aa0, quota/a/user.aa1, ...).
Replication is active.
Creating a user aa (mailbox user.aa) causes replication to fail
repeatedly until the "USER aa" record is removed from the replication
log.
Deleting the mailbox ("sam user.aa cyrus all", "dm user.aa") causes
either a segfault in glibc's _int_malloc(), a "*** glibc detected ***
double free or corruption (!prev):...", or "*** glibc detected ***
corrupted double-linked list:" error reported by the imap process.
There may be other errors as well.
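For anyone who wants to try reproducing this in a test environment, the steps in the example above can be sketched as a small generator of cyradm commands. This is only an illustration: the function name, the quota value of 10240, and the use of "cm" to create the mailboxes are my assumptions; the "sq", "sam" and "dm" forms follow the commands quoted in this message.

```python
def repro_commands(root="aa", count=50, quota_kb=10240, admin="cyrus"):
    """Build the cyradm command list for the scenario described above.

    quota_kb is a hypothetical value; any storage quota should do.
    """
    cmds = []
    # Create the sibling users aa0..aa49, each with a quota on the inbox.
    for i in range(count):
        mbox = f"user.{root}{i}"
        cmds.append(f"cm {mbox}")
        cmds.append(f"sq {mbox} storage {quota_kb}")
    # Create the user whose name matches the common root exactly
    # (the step after which replication reportedly fails).
    cmds.append(f"cm user.{root}")
    # Grant the admin full rights, then delete the mailbox
    # (the step that reportedly crashes the imap process).
    cmds.append(f"sam user.{root} {admin} all")
    cmds.append(f"dm user.{root}")
    return cmds
```

Printing "\n".join(repro_commands()) gives a list that could be pasted into a cyradm session on a test server, never on production.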
While I'm in the process of testing a number of system configurations
(RHEL AS 4 and 5, vanilla cyrus, invoca cyrus package) and will submit
details to bugzilla, I need to address this problem immediately as
this affects a production environment with close to 20K inboxes.
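To gauge how many accounts in a namespace of this size fit the pattern, a rough audit along the following lines might help. It is a sketch under my own assumptions: the function name and the threshold of 20 siblings are arbitrary, and it only looks at username prefixes, not at how the server actually lays out the quota hierarchy.

```python
def find_common_root_users(usernames, min_siblings=20):
    """Return {root: [siblings]} for every username that is an exact
    prefix of at least min_siblings other usernames.

    min_siblings is a hypothetical threshold; the real trigger
    count for the bug is not known.
    """
    names = sorted(set(usernames))
    result = {}
    for i, root in enumerate(names):
        siblings = []
        # In sorted order, all names starting with `root` follow it
        # contiguously, so we can stop at the first non-match.
        for other in names[i + 1:]:
            if not other.startswith(root):
                break
            siblings.append(other)
        if len(siblings) >= min_siblings:
            result[root] = siblings
    return result
```

Feeding it the list of existing usernames plus any username about to be created would flag candidates such as "aa" in the example above before the mailbox is created.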
So far, I've eliminated the aspect which affects replication by
switching the quota_db to something other than 'quotalegacy' on the
production replicant. I've also confirmed in a test environment that
switching quota_db to something other than 'quotalegacy' eliminates
the deletion problem and am considering doing this on the production
IMAP server. I've been successful with 'skiplist', but unsuccessful
with both 'berkeley' and 'berkeley-hash'. In the latter two cases,
while processing close to 40K 'sq user... storage ...' commands, the
associated IMAP process freezes at random points in the list of 'sq'
commands, and no further manipulation of quotas is possible via IMAP
or the 'quota' binary. There doesn't seem to be a way to recover when
using 'berkeley'; with 'berkeley-hash', however, restarting cyrus
(which runs 'ctl_cyrusdb -r') temporarily re-enables manipulation of
quotas by the aforementioned methods. Needless to
say, I'm not sure I trust either 'berkeley' or 'berkeley-hash' at this
point.
Does anyone have experience using "flat" or "skiplist" for quotas?
Is either of these suitable for a moderately high-volume production
environment of the scale I mentioned?