What would it take for FastMail to run murder
Jeroen van Meeuwen (Kolab Systems)
vanmeeuwen at kolabsys.com
Sat Mar 14 16:18:48 EDT 2015
On 2015-03-13 23:50, Bron Gondwana wrote:
> So I've been doing a lot of thinking about Cyrus clustering, with the
> underlying question being "what would it take to make FastMail run a
> murder". We've written a fair bit about our infrastructure - we use
> nginx as a frontend proxy to direct traffic to backend servers, and
> have no interdependencies between the backends, so that we can scale
> indefinitely. With murder as it exists now, we would be pushing the
> limits of the system already - particularly with the globally
> distributed datacentres.
>
> Why would FastMail consider running murder, given our existing
> nice system?
>
> a) we support folder sharing within businesses, so at the moment we are
> limited by the size of a single slot. Some businesses already push
> that limit.
>
How, though, do you "ensure" that a mailbox for a new user in such a
business is created on the same backend as the mailboxes of all the
other users of said business?
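For illustration only -- a minimal sketch of the placement rule I
would expect at mailbox-creation time, where mailboxes_for_domain()
stands in for a real mailboxes.db lookup:

    # Hypothetical sketch: choose the backend for a new business user by
    # looking at where the business's existing mailboxes already live.
    def backend_for_new_user(domain, mailboxes_for_domain, default_backend):
        # mailboxes_for_domain(domain) -> iterable of backend names hosting
        # mailboxes under this domain (an assumed mailboxes.db lookup)
        backends = set(mailboxes_for_domain(domain))
        if len(backends) == 1:
            return backends.pop()   # co-locate with the rest of the business
        return default_backend      # first user, or already spread out
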
> Here are our deal-breaker requirements:
>
> 1) unified murder - we don't want to run both a frontend AND a backend
> imapd process for every single connection. We already have nginx,
> which is non-blocking, for the initial connection and auth handling.
>
There's one particular "problem" with using NGINX as the IMAP proxy --
it requires an external service that responds with the address to
proxy to.
I say "problem" in quotes to emphasize that I use the term very
loosely -- whether one runs a functioning backend+mupdate+frontend or
a functioning backend+mupdate+frontend+nginx+service is a rather
futile distinction, relatively speaking.
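For context: NGINX's mail proxy asks an auth_http endpoint where to
route each connection. A minimal sketch of such an endpoint follows;
the BACKENDS table is a made-up stand-in for a real lookup, and the
authentication step itself is elided:

    # Minimal sketch of an NGINX mail-proxy auth_http endpoint.
    # NGINX sends Auth-User/Auth-Pass/Auth-Protocol request headers and
    # expects Auth-Status plus Auth-Server/Auth-Port in the response.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    BACKENDS = {"jdoe@example.org": ("10.0.0.11", 143)}  # stand-in table

    class AuthHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            user = self.headers.get("Auth-User", "")
            # Real password verification elided; only routing is shown.
            addr = BACKENDS.get(user)
            self.send_response(200)
            if addr:
                self.send_header("Auth-Status", "OK")
                self.send_header("Auth-Server", addr[0])
                self.send_header("Auth-Port", str(addr[1]))
            else:
                self.send_header("Auth-Status", "Invalid login or password")
                self.send_header("Auth-Wait", "3")
            self.end_headers()

    HTTPServer(("127.0.0.1", 9000), AuthHandler).serve_forever()
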
> 2) no table scans - anything that requires a parse and ACL lookup for
> every single row of mailboxes.db is going to be a non-starter when
> you multiply the existing mailboxes.db size by hundreds.
>
I don't understand how this is an established problem already -- or at
least not as well as I probably should. If 72k users can be happy on a
murder topology, surely 4 times as many could also be happy --
inefficiencies notwithstanding, they're "only" a vertical scaling
limitation.
That said, of course I understand it has its upper limit, but pushing
updated in-memory lookup tables to the frontends when an update
happens would seem to resolve the problem, no?
> 3) no single-point-of-failure - having one mupdate master which can
> stop the entire cluster working if it's offline, no thanks.
>
This is not necessarily what a failed mupdate server does, though --
new folders, folder renames (including deletions!), and folder
transfers won't work, but the cluster remains functional under both
the SMTP-to-backend and the LMTP-proxy-via-frontend topology --
autocreate for Sieve fileinto notwithstanding, and mailbox hierarchies
distributed over multiple backends when also using the SMTP-to-backend
topology notwithstanding.
> Thankfully, the state of the art in distributed databases has moved a
> long way since mupdate was written.
I have also written a one-or-two-line patch that enables backends that
replicate to both be part of the same murder topology, preventing the
replica "slave" from bailing out on the initial creation of a mailbox
after consulting mupdate and finding that it already exists.
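The gist of it, as an illustrative sketch rather than the actual patch
(the server names are assumptions):

    # Illustrative sketch, not the actual patch: the point is that a
    # replica should not treat "mupdate already knows this mailbox" as an
    # error when the existing entry points at its replication partner.
    MY_SERVER, PARTNER_SERVER = "backend-a", "backend-b"  # assumed names

    class MailboxExists(Exception):
        pass

    def create_mailbox_on_replica(name, mupdate, local_create):
        entry = mupdate.find(name)  # may return None
        if entry and entry.server not in (MY_SERVER, PARTNER_SERVER):
            raise MailboxExists(name)  # genuinely lives elsewhere
        local_create(name)             # proceed instead of bailing out
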
> Along with this, we need a reverse lookup for ACLs, so that any one
> user doesn't ever need to scan the entire mailboxes.db. This might be
> hooked into the distributed DB as well, or calculated locally on each
> node.
>
I reckon this may be the "rebuild more efficient lookup trees
in-memory or otherwise" I referred to just now, just not in so many
words.
> And that's pretty much it. There are some interesting factors around
> replication, and I suspect the answer here is to have either
> multi-value support or embed the backend name into the mailboxes.db
> key (postfix) such that you wind up listing the same mailbox multiple
> times.
In a scenario where only one backend is considered "active" for a
given (set of) mailbox(es) and the other is "passive", this has been
more a matter of a one-line patch in mupdate plus the proper
DNS/keepalived-type failover of service IP addresses than of allowing
duplicates and suppressing them.
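In other words, a sketch under the assumption of one floating service
address per replica pair:

    # Sketch: mailboxes.db keeps a single entry per mailbox pointing at a
    # floating service name; DNS/keepalived moves the address between the
    # "active" and "passive" backend, so no duplicate entries are needed.
    import socket

    MAILBOXES = {"user.jdoe": "pair1.imap.example.org"}  # assumed names

    def backend_for(mailbox):
        # Resolves to whichever physical backend currently holds the
        # failover IP for this pair.
        return socket.gethostbyname(MAILBOXES[mailbox])
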
Kind regards,
Jeroen van Meeuwen
--
Systems Architect, Kolab Systems AG
e: vanmeeuwen at kolabsys.com
m: +41 79 951 9003
w: https://kolabsystems.com
pgp: 9342 BF08