What would it take for FastMail to run murder
Jeroen van Meeuwen (Kolab Systems)
vanmeeuwen at kolabsys.com
Sat Mar 14 16:18:48 EDT 2015
On 2015-03-13 23:50, Bron Gondwana wrote:
> So I've been doing a lot of thinking about Cyrus clustering, with the
> underlying question being "what would it take to make FastMail run a
> murder". We've written a fair bit about our infrastructure - we use
> nginx as a frontend proxy to direct traffic to backend servers, and
> have no interdependencies between the backends, so that we can scale
> indefinitely. With murder as it exists now, we would be pushing the
> limits of the system already - particularly with the globally
> distributed datacentres.
>
> Why would FastMail consider running murder, given our existing
> nice system?
>
> a) we support folder sharing within businesses, so at the moment we are
> limited by the size of a single slot. Some businesses already push
> that limit.
>
How, though, do you "ensure" that a mailbox for a new user in such a
business is created on the same backend as the mailboxes of all the
other users of said business?
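For illustration only -- a minimal sketch of the placement rule I
would expect at mailbox-creation time, where mailboxes_for_domain()
stands in for a real mailboxes.db lookup:

    # Hypothetical sketch: choose the backend for a new business user by
    # looking at where the business's existing mailboxes already live.
    def backend_for_new_user(domain, mailboxes_for_domain, default_backend):
        # mailboxes_for_domain(domain) -> iterable of backend names hosting
        # mailboxes under this domain (an assumed mailboxes.db lookup)
        backends = set(mailboxes_for_domain(domain))
        if len(backends) == 1:
            return backends.pop()   # co-locate with the rest of the business
        return default_backend      # first user, or already spread out
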
> Here are our deal-breaker requirements:
>
> 1) unified murder - we don't want to run both a frontend AND a backend
> imapd process for every single connection. We already have nginx,
> which is non-blocking, for the initial connection and auth handling.
>
There's one particular "problem" with using NGINX as the IMAP proxy --
it requires an external service that responds with the address to
proxy to.
I say "problem" in quotes to emphasize that I use the term very
loosely -- whether one runs a functioning backend+mupdate+frontend or
a functioning backend+mupdate+frontend+nginx+service is a rather
futile distinction, relatively speaking.
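For context: NGINX's mail proxy asks an auth_http endpoint where to
route each connection. A minimal sketch of such an endpoint follows;
the BACKENDS table is a made-up stand-in for a real lookup, and the
authentication step itself is elided:

    # Minimal sketch of an NGINX mail-proxy auth_http endpoint.
    # NGINX sends Auth-User/Auth-Pass/Auth-Protocol request headers and
    # expects Auth-Status plus Auth-Server/Auth-Port in the response.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    BACKENDS = {"jdoe@example.org": ("10.0.0.11", 143)}  # stand-in table

    class AuthHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            user = self.headers.get("Auth-User", "")
            # Real password verification elided; only routing is shown.
            addr = BACKENDS.get(user)
            self.send_response(200)
            if addr:
                self.send_header("Auth-Status", "OK")
                self.send_header("Auth-Server", addr[0])
                self.send_header("Auth-Port", str(addr[1]))
            else:
                self.send_header("Auth-Status", "Invalid login or password")
                self.send_header("Auth-Wait", "3")
            self.end_headers()

    HTTPServer(("127.0.0.1", 9000), AuthHandler).serve_forever()
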
> 2) no table scans - anything that requires a parse and ACL lookup for
> every single row of mailboxes.db is going to be a non-starter when
> you multiply the existing mailboxes.db size by hundreds.
>
I don't understand how this is an established problem already -- or at
least not as well as I probably should. If 72k users can be happy on a
murder topology, surely 4 times as many could also be happy --
inefficiencies notwithstanding, they're "only" a vertical scaling
limitation.
That said, of course I understand it has its upper limit, but pushing
updated in-memory lookup tables to the frontends when an update
happens would seem to resolve the problem, no?
> 3) no single-point-of-failure - having one mupdate master which can
> stop the entire cluster working if it's offline, no thanks.
>
This is not necessarily what a failed mupdate server does, though --
new folders, folder renames (including deletions!), and folder
transfers won't work, but the cluster remains functional under both
the SMTP-to-backend and the LMTP-proxy-via-frontend topology --
autocreate for Sieve fileinto notwithstanding, and mailbox hierarchies
distributed over multiple backends when also using the SMTP-to-backend
topology notwithstanding.
> Thankfully, the state of the art in distributed databases has moved a
> long way since mupdate was written.
I have also written a one-or-two-line patch that enables backends that
replicate to both be part of the same murder topology, preventing the
replica "slave" from bailing out on the initial creation of a mailbox
after consulting mupdate and finding that it already exists.
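The gist of it, as an illustrative sketch rather than the actual patch
(the server names are assumptions):

    # Illustrative sketch, not the actual patch: the point is that a
    # replica should not treat "mupdate already knows this mailbox" as an
    # error when the existing entry points at its replication partner.
    MY_SERVER, PARTNER_SERVER = "backend-a", "backend-b"  # assumed names

    class MailboxExists(Exception):
        pass

    def create_mailbox_on_replica(name, mupdate, local_create):
        entry = mupdate.find(name)  # may return None
        if entry and entry.server not in (MY_SERVER, PARTNER_SERVER):
            raise MailboxExists(name)  # genuinely lives elsewhere
        local_create(name)             # proceed instead of bailing out
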
> Along with this, we need a reverse lookup for ACLs, so that any one
> user doesn't ever need to scan the entire mailboxes.db. This might be
> hooked into the distributed DB as well, or calculated locally on each
> node.
>
I reckon this may be the "rebuild more efficient lookup trees
in-memory or otherwise" I referred to just now, just not in so many
words.
> And that's pretty much it. There are some interesting factors around
> replication, and I suspect the answer here is to have either
> multi-value support or embed the backend name into the mailboxes.db
> key (postfix) such that you wind up listing the same mailbox multiple
> times.
In a scenario where only one backend is considered "active" for a
given (set of) mailbox(es) and the other is "passive", this has been
more a matter of a one-line patch in mupdate plus the proper
DNS/keepalived-type failover of service IP addresses than of allowing
duplicates and suppressing them.
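In other words, a sketch under the assumption of one floating service
address per replica pair:

    # Sketch: mailboxes.db keeps a single entry per mailbox pointing at a
    # floating service name; DNS/keepalived moves the address between the
    # "active" and "passive" backend, so no duplicate entries are needed.
    import socket

    MAILBOXES = {"user.jdoe": "pair1.imap.example.org"}  # assumed names

    def backend_for(mailbox):
        # Resolves to whichever physical backend currently holds the
        # failover IP for this pair.
        return socket.gethostbyname(MAILBOXES[mailbox])
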
Kind regards,
Jeroen van Meeuwen
--
Systems Architect, Kolab Systems AG
e: vanmeeuwen at kolabsys.com
m: +41 79 951 9003
w: https://kolabsystems.com
pgp: 9342 BF08