High Availability

Bron Gondwana brong at fastmail.fm
Wed Apr 22 00:51:03 EDT 2015


On Wed, Apr 22, 2015, at 02:27 PM, Ciro Iriarte wrote:
> Interesting, is the use of several instances needed because cyrus
> cannot scale with threads in a single instance scenario?


There are two interesting reasons:

1) Global locks. There are still some - the lock on mailboxes.db, for
   example. If you have multiple instances on a single machine, a held
   lock only blocks its own instance, never the entire machine.

2) Replication and load spreading. Right now there's no support for a
   "partial replica" - a Cyrus instance replicates every mailbox to
   its replica.

The second one is the kicker.

If we replicated everything from one machine to another machine, then
we'd have 100% of the user load on one machine and nothing on the other.
That's not an efficient use of resources, because the second machine
still needs the capacity to run at 100% in a failover situation.

Our first thought was to run two instances per machine and pair the
machines, so each machine held one master and the replica of its
partner's master. At least then both machines carry equal load in normal
operation, and only in a failover situation is one loaded to 100%. But
it's still nasty - you go from 50% load to 100% load.

So instead each machine's replicas are spread over about 10 other
machines, and every machine runs at 50% capacity. If we need to take one
machine down, those 10 other machines each pick up a tenth of its load
and run at 55% capacity for that time. The load change is much smaller.
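
As a rough sketch of that arithmetic (not anything we actually run), the snippet below compares the paired layout with the spread-out one. The function name `failover_load` and its parameters are made up for illustration, and it assumes the failed machine's load is split evenly across the machines holding its replicas.

```python
def failover_load(normal_load: float, replica_hosts: int) -> float:
    """Percent load on each surviving host after one host fails.

    Assumes (hypothetically) that the failed host's normal_load is
    divided evenly among replica_hosts other machines, each of which
    was already running at normal_load.
    """
    if replica_hosts < 1:
        raise ValueError("need at least one replica host")
    return normal_load + normal_load / replica_hosts


if __name__ == "__main__":
    # 1 replica host is the simple paired layout; 10 is the spread layout.
    for hosts in (1, 10):
        print(f"{hosts:2d} replica host(s): {failover_load(50, hosts):.0f}% during failover")
```

With one replica host the surviving machine jumps from 50% to 100%; with ten, each of them only rises from 50% to 55%.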

(As of about a year ago, we're fully paired odd-host-number to
even-host-number, with odd and even hosts in different cabinets, so we
can shut down an entire cabinet by raising the load on its replicas.)
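
One way to picture the odd/even rule is the sketch below. It is only an illustration under assumptions: hosts are identified by plain integers, the helper `pick_replica_hosts` is hypothetical, and the rotation used to vary the choice per master is an arbitrary stand-in for whatever placement policy is really used.

```python
from __future__ import annotations


def pick_replica_hosts(master_host: int, all_hosts: list[int], fanout: int) -> list[int]:
    """Choose up to `fanout` replica hosts of the opposite parity.

    Hypothetical helper: odd-numbered masters only get even-numbered
    replica hosts and vice versa, so a master and its replicas never
    share a cabinet if odd and even hosts are racked separately.
    """
    candidates = [h for h in sorted(all_hosts) if h % 2 != master_host % 2]
    if not candidates:
        return []
    # Rotate the candidate list so different masters start at different
    # hosts (an arbitrary choice made for this sketch).
    start = (master_host // 2) % len(candidates)
    rotated = candidates[start:] + candidates[:start]
    return rotated[:fanout]


if __name__ == "__main__":
    hosts = list(range(1, 9))
    for master in (1, 4):
        print(master, "->", pick_replica_hosts(master, hosts, 3))
```

Because every replica sits on a host of the opposite parity, taking all odd hosts (one cabinet) offline leaves every mailbox with a live copy on an even host.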

Bron.

--
Bron Gondwana brong at fastmail.fm




More information about the Info-cyrus mailing list