What would it take for FastMail to run murder

Bron Gondwana brong at fastmail.fm
Fri Mar 13 22:29:00 EDT 2015


For sure :)

Just having testing infrastructure that tests murder would go a long way
to avoiding that mess again.

The more I think about it, the more having the SAME mailboxes.db for
both local and remote data doesn't make sense. We should have a separate
central database that mupdate_activate, etc. write to. It can just be
a standalone SQL database, or a cluster database, or who cares... the
main thing is that only a few of the MBOXLIST commands need to care
(because they will return the remote information if needed).
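
A rough sketch of that split, just to make the idea concrete. Everything
here is an assumption for illustration: the MailboxList class, the table
layout, and the use of SQLite as the "standalone SQL database" are
hypothetical, not existing Cyrus code.

```python
import sqlite3


def make_central_db():
    """Create an in-memory stand-in for the central (remote) database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mailboxes (name TEXT PRIMARY KEY, server TEXT NOT NULL)"
    )
    return conn


class MailboxList:
    """Hypothetical lookup layer: local records first, central DB as fallback."""

    def __init__(self, local, central):
        self.local = local      # dict standing in for the local mailboxes.db
        self.central = central  # sqlite3 connection standing in for the central DB

    def activate(self, name, server):
        # Stand-in for the kind of write mupdate_activate would do centrally.
        self.central.execute(
            "INSERT OR REPLACE INTO mailboxes (name, server) VALUES (?, ?)",
            (name, server),
        )

    def lookup(self, name):
        # Local data never touches the central database.
        if name in self.local:
            return {"name": name, "location": "local", **self.local[name]}
        # Only this path needs to know that remote information exists.
        row = self.central.execute(
            "SELECT server FROM mailboxes WHERE name = ?", (name,)
        ).fetchone()
        if row is None:
            return None
        return {"name": name, "location": "remote", "server": row[0]}
```

Only lookup() knows about the central database here, which is the point:
the rest of the code keeps working against local records.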

Bron.


On Sat, Mar 14, 2015, at 09:54 AM, Dave McMurtrie wrote:
> From my phone, so excuse brevity and top-posting, but Fastmail running
> murder would be a huge bonus. I not-so-fondly recall the intimate
> relationship I developed with gdb debugging murder issues when we
> upgraded from 2.3 to 2.4 :)
>
> -------- Original message --------
> From: Bron Gondwana <brong at fastmail.fm>
> Date: 03/13/2015 6:50 PM (GMT-05:00)
> To: Cyrus Devel <cyrus-devel at lists.andrew.cmu.edu>
> Cc:
> Subject: What would it take for FastMail to run murder
>
> So I've been doing a lot of thinking about Cyrus clustering, with the
> underlying question being "what would it take to make FastMail run a
> murder". We've written a fair bit about our infrastructure - we use
> nginx as a frontend proxy to direct traffic to backend servers, and have
> no interdependencies between the backends, so that we can scale
> indefinitely. With murder as it exists now, we would be pushing the
> limits of the system already - particularly with the globally
> distributed datacentres.
>
> Why would FastMail consider running murder, given our existing
> nice system?
>
> a) we support folder sharing within businesses, so at the moment we are
>    limited by the size of a single slot. Some businesses already push
>    that limit.
> b) it's good to dogfood the server we put so much work into.
>
> Here are our deal-breaker requirements:
>
> 1) unified murder - we don't want to run both a frontend AND a backend
>    imapd process for every single connection. We already have nginx,
>    which is non-blocking, for the initial connection and auth handling.
> 2) no table scans - anything that requires a parse and ACL lookup for
>    every single row of mailboxes.db is going to be a non-starter when
>    you multiply the existing mailboxes.db size by hundreds.
> 3) no single-point-of-failure - having one mupdate master which can stop
>    the entire cluster working if it's offline, no thanks.
>
> Thankfully, the state of the art in distributed databases has moved a
> long way since mupdate was written. We'd have to at least change the
> mupdate protocol anyway to handle newly added fields, so why not just do
> away with it and have every server run a local node of a distributed
> database protocol for its mailboxes.db?
>
> Along with this, we need a reverse lookup for ACLs, so that any one user
> doesn't ever need to scan the entire mailboxes.db. This might be hooked
> into the distributed DB as well, or calculated locally on each node.
>
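
A minimal sketch of such a reverse ACL lookup, assuming ACLs are stored as
tab-separated "identifier<TAB>rights<TAB>" pairs and that the "l" (lookup)
right is what makes a mailbox visible. The ReverseACLIndex class is
hypothetical, and group identifiers and negative rights are ignored to keep
it short.

```python
from collections import defaultdict


def parse_acl(acl):
    """Split 'identifier<TAB>rights<TAB>...' into (identifier, rights) pairs."""
    parts = [p for p in acl.split("\t") if p]
    return list(zip(parts[0::2], parts[1::2]))


class ReverseACLIndex:
    """Hypothetical per-identifier index of mailboxes, kept current on ACL writes."""

    def __init__(self):
        self._by_user = defaultdict(set)  # identifier -> mailboxes it can see
        self._indexed = {}                # mailbox -> identifiers indexed for it

    def set_acl(self, mailbox, acl):
        # Only identifiers holding the lookup right get an index entry.
        new_users = {ident for ident, rights in parse_acl(acl) if "l" in rights}
        old_users = self._indexed.get(mailbox, set())
        for ident in old_users - new_users:
            self._by_user[ident].discard(mailbox)
        for ident in new_users - old_users:
            self._by_user[ident].add(mailbox)
        self._indexed[mailbox] = new_users

    def visible_to(self, ident):
        # Cost depends on this identifier's mailboxes, not on the size of the DB.
        return sorted(self._by_user.get(ident, ()))
```

The work moves to ACL changes, which are rare, so that listing what one
user can see never touches rows belonging to anybody else.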
> And that's pretty much it. There are some interesting factors around
> replication, and I suspect the answer here is to have either
> multi-value support or embed the backend name into the mailboxes.db key
> (postfix) such that you wind up listing the same mailbox multiple
> times. We already suppress duplicates in the LIST command, so all we
> need then is logic for choosing the actual master. Rob N has done some
> work with consul and etcd already at FastMail, and we would use either
> that or a flag in the distributed DB to drive master choice for backend
> connection purposes.
>
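
To illustrate the key layout, here is a small sketch assuming the backend
name is appended to the mailbox name after a NUL separator, and that master
choice comes from a flag stored in the record (the consul/etcd option is not
shown). MailboxDB, the separator, and the record fields are all hypothetical.

```python
import bisect

SEP = "\x00"  # assumed separator between mailbox name and backend name


class MailboxDB:
    """Hypothetical sorted key/value store with one row per (mailbox, backend)."""

    def __init__(self):
        self._keys = []     # kept sorted so that prefix lookups are range scans
        self._records = {}

    def put(self, mailbox, backend, master=False):
        key = f"{mailbox}{SEP}{backend}"
        if key not in self._records:
            bisect.insort(self._keys, key)
        self._records[key] = {"master": master}

    def copies(self, mailbox):
        """Return (backend, record) for every copy of one mailbox."""
        prefix = mailbox + SEP
        start = bisect.bisect_left(self._keys, prefix)
        result = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break  # left the contiguous range for this mailbox
            result.append((key[len(prefix):], self._records[key]))
        return result

    def list_mailboxes(self):
        """List each mailbox once, suppressing the per-backend duplicates."""
        names = []
        for key in self._keys:
            mailbox = key.split(SEP, 1)[0]
            if not names or names[-1] != mailbox:
                names.append(mailbox)
        return names

    def master_for(self, mailbox):
        """Pick the backend to connect to, driven by the stored master flag."""
        masters = [b for b, rec in self.copies(mailbox) if rec["master"]]
        if len(masters) != 1:
            raise LookupError(f"expected exactly one master for {mailbox!r}")
        return masters[0]
```

Because all copies of a mailbox sort next to each other, both the duplicate
suppression and the master lookup only ever look at adjacent keys.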
> There are a bunch of "nice to have"s on top of this, but I think this
> would be enough for us to convert our existing standalone servers over
> to a murder.
>
> Bron.
>
> --
> Bron Gondwana
> brong at fastmail.fm

--
Bron Gondwana brong at fastmail.fm

