Changing LIST (again)

Bron Gondwana brong at fastmail.fm
Thu May 2 01:29:25 EDT 2013


One of my release goals for Cyrus 2.5 is to be correct in our
implementation of every standard that we claim to support.

This is why I emailed the list last week asking if anyone is using the
intermediate SPECIALUSE representation in git.  Since there was no
reply, I'm assuming it will be OK :)  I've just started testing it into
production at FastMail, and will roll it out to all our users later this
week. Assuming performance is acceptable, I'll push that upstream.

Meanwhile, there's one glaring problem remaining, as evidenced by Timo's
imaptest tool.  Our LIST-EXTENDED implementation is broken in some
places. I've already had reports of users of recent versions of some
clients having problems with it.  That sucks.

Rob M (CC'd) and I had a good talk about it yesterday.  We're looking at
switching the default at FastMail to use altnamespace - it works better
with Outlook 2013 and also with some phones.  One problem is that you
can't create subfolders of Inbox.

We've looked at the standard, and we don't see any reason why we can't
allow them, with only user.foo.[Ii][Nn][Bb][Oo][Xx] being special case
denied, but otherwise mapping the names so that if a user creates
folders they map like so:

Alt External       Non-Alt External       Internal
Inbox              INBOX                  user.foo
Inbox.sub          INBOX.Inbox.sub        user.foo.Inbox.sub
INBOX.sub          INBOX.INBOX.sub        user.foo.INBOX.sub
<illegal>          INBOX.INBOX            user.foo.INBOX
<illegal>          INBOX.Inbox            user.foo.Inbox
Drafts             INBOX.Drafts           user.foo.Drafts
Trash              INBOX.Trash            user.foo.Trash
Inbox-2010         INBOX.Inbox-2010       user.foo.Inbox-2010

We would normalise the _display_ output in altnamespace mode to be Title
Case, so "Inbox", and in non-altnamespace mode to be "INBOX", so the
full listing output for those folders would look like this:

alt:
Inbox
Inbox.sub
Drafts
INBOX.sub
Inbox-2010
Trash

non-alt:
INBOX
INBOX.Drafts
INBOX.INBOX    (note: only visible in non-alt)
INBOX.INBOX.sub
INBOX.Inbox    (note: only vibisle in non-alt)
INBOX.Inbox.sub
INBOX.Inbox-2010
INBOX.Trash


We're going to build a bunch of testcases to make sure we get all of
this right, and also test ACLs to make sure we only display the
right things...

------

Speaking of which... implementation details.  The LIST code is a complex
and fragile mess.  mboxlist_find* are not only big, but they make a PILE
of locks because they have to release the lock for every output line
just in case they are waiting on the network.  This makes mailboxes.db a
high contention file.

My plan is to abuse list_p and list_cb slightly here.  list_p is
called under the readlock, so you can't do anything networky in there,
but you CAN allocate some memory.  I plan to use the LIST expressions
for just enough to work out the minimum prefix which can be passed to
cyrusdb_foreach, and then just iterate ALL the records in there and
call list_p.  It will be just smart enough to check if the folder is
definitely visible to the user or definitely NOT visible to the user,
and either skip it entirely or add it to an in-memory list.
Definitely NOT means no groups, so just reading the usernames in the
ACL list is enough.

Every 1024 rows, it will release a record to list_cb, which will do
nothing except allow the lock to release so that a really big server
doesn't starve other threads on LIST.

And if we need the subscription DB, we'll just plug that in too as
a separate call and append stuff on the end of the array.

Then sort and uniquify the records, potentially combining the
mailboxes.db and subscriptions data into a single record.

After this, there will be a separate pass to filter the records to see
if they are visible via ACL checks, a single pass to annotate them with
SPECIALUSE (if required) - again, this can do longer-held locks because
we haven't done any network IO yet.  We can also calculate \HasChildren,
etc.

Finally, we'll sort the records in the order we intend to use for
final list output, and then spit them down the wire after all the
locks are released.

This will use a little bit of memory, but we already read entire 1
million message cyrus.index maps into memory.  Once upon a time, it was
very difficult to build a real-time spell checker for a word processor.
These days you can just slurp the entire dictionary into a hash and look
words up in it.

http://prog21.dadgum.com/29.html

These days, we can afford to just slurp the data into memory and
then process it before outputting.  Tons of clever complex code can
be thrown away and replaced with simpler and more correct code.  So
I'm going to do that.

Bron.

-- 
  Bron Gondwana
  brong at fastmail.fm


More information about the Cyrus-devel mailing list