Clustering and replication
Janne Peltonen
janne.peltonen at helsinki.fi
Tue Jan 30 04:50:53 EST 2007
On Sat, Jan 27, 2007 at 04:03:18PM +1100, Bron Gondwana wrote:
> On Fri, Jan 26, 2007 at 12:20:15PM -0800, Tom Samplonius wrote:
> > ----- Wesley Craig <wes at umich.edu> wrote:
> [...]
> > > For your situation, Janne, you might want to explore sharing the sync
> > > directory. sync_client and sync_server have interlock code, tho I
> > > haven't reviewed it for this specific scenario.
> > [...]
> > You would probably hit the sync_client bottleneck pretty fast this
> > way.
> > Plus, there would be a lot of contention on the sync logs if
> > multiple servers are appending records to the same file. GFS is
> > not fast.
> Yeah, that would suck. Running multiple sync_clients is going to suck
> too, because they'll get locked out at the replica end where only one
> sync_server will run at once. Messy. I still think one sync_client is
> the way to go if you're going to do this config at all - but the whole
> thing sounds way less scalable than what we're doing at FastMail with
> lots of small cyrus instances (multiple per physical server) and an
> interlaced set of replicas such that losing any one server will spread
> the master load out evenly over many other physical machines.
>
> But I'm not the one who is deciding which tradeoffs to use. I just know
> our current layout has been nice to us and will scale indefinitely. It
> takes a decent amount of management software to decide where to place
> users and make sure we don't break users with folder sharing between
> them and stuff though.
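The interlaced replica layout described in the quoted text can be illustrated with a small Python sketch. This is only a guess at the general idea, not FastMail's actual placement logic: the server names, the number of instances per server, and the round-robin rule are all assumptions made for the example.

```python
# Hypothetical sketch of an "interlaced" replica layout: every physical
# server hosts several small master instances, and the replica of each
# instance lives on a different other server.
# Server names, slot counts, and the round-robin rule are assumptions
# made for illustration; they are not taken from FastMail's setup.
from collections import Counter


def interlaced_layout(servers, instances_per_server):
    """Return a list of (instance_name, master_server, replica_server)."""
    n = len(servers)
    if not 1 <= instances_per_server <= n - 1:
        raise ValueError("need between 1 and len(servers) - 1 instances per server")
    layout = []
    for i, master in enumerate(servers):
        for slot in range(instances_per_server):
            # Offsets 1..n-1 never wrap back to the master itself.
            replica = servers[(i + slot + 1) % n]
            layout.append((f"{master}-slot{slot}", master, replica))
    return layout


def failover_load(layout, failed_server):
    """Count how many masters each surviving server takes over."""
    return Counter(
        replica for _, master, replica in layout if master == failed_server
    )


servers = ["s1", "s2", "s3", "s4", "s5"]
layout = interlaced_layout(servers, instances_per_server=4)
takeover = failover_load(layout, "s3")
print(takeover)
```

With five servers and four instances each, losing any one server hands exactly one extra master to each of the other four, which is the "spread the load out evenly" property. Real placement would also have to keep users who share folders on the same instance, as the quoted text notes.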
Yes, our idea doesn't scale indefinitely. But then, we currently have
only about 50,000 users, and since the university isn't going to grow
indefinitely either, I don't suppose we'll ever have anything like
three times that number. And if it works, it's going to be very simple
to maintain.
The number of active users might grow. And the amount of mail received
does grow. Currently, our users have some 1.6 terabytes of mail, and
if the growth continues as before, they'll have some 4 TB by 2010.
20% of the users use 80% of the 1.6 TB. The classical one-fifth of the
population using four-fifths of the resources... ;)
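As a rough check of these figures, here is a minimal Python sketch. It assumes that growth compounds annually over the three years from 2007 to 2010, that 1 TB is 10**6 MB, and that the 20%/80% split applies exactly to about 50,000 users. None of these assumptions is stated in the post.

```python
# Rough check of the storage figures quoted above.
# Assumptions (not stated in the post): growth compounds annually,
# the 1.6 TB figure is for early 2007, 1 TB = 10**6 MB, and the
# 20%/80% split applies exactly to about 50,000 users.

current_tb = 1.6
projected_tb = 4.0
years = 2010 - 2007

# Constant yearly growth rate that turns 1.6 TB into 4 TB in three years.
annual_growth = (projected_tb / current_tb) ** (1 / years) - 1

users = 50_000
heavy_users = users // 5          # the 20% who hold most of the mail
light_users = users - heavy_users

mb_per_tb = 10**6
heavy_avg_mb = current_tb * 0.80 * mb_per_tb / heavy_users
light_avg_mb = current_tb * 0.20 * mb_per_tb / light_users

print(f"implied annual growth: {annual_growth:.0%}")
print(f"average per heavy user: {heavy_avg_mb:.0f} MB")
print(f"average per light user: {light_avg_mb:.0f} MB")
```

Under those assumptions the projection implies growth of roughly 36% a year, and the split works out to about 128 MB per heavy user against about 8 MB for everyone else.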
As for GFS being slow: there was a performance comparison by an Italian
university that decided to use Veritas instead of GFS because Veritas
was much faster. But Veritas is closed-source, the difference was
something like Veritas being twice as fast as GFS, and their proposed
system was going to serve all Italian universities and whatnot. So we
thought that GFS might be enough for us...
--Janne
More information about the Info-cyrus mailing list