Problems with load balancing cluster on GFS

Jens Hoffrichter jens.hoffrichter at gmail.com
Fri Jun 6 12:01:46 EDT 2008


Hello,

2008/6/6 Jorey Bump <list at joreybump.com>:

> Yeah, it shouldn't lock with urandom. You might want to play around with
> poptimeout and popminpoll, to see if that has any effect on your load
> balancing test. Is jakarta-jmeter distributing these logins among enough
> different users to simulate real-world conditions? What do your imap/debug
> logs say when the lockup occurs?
Yes, I have configured jmeter to use all those 100 mailbox users in a
round robin fashion, so this should be close to a real world setup.
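
For illustration, the round robin selection can be sketched in Python
roughly like this. It is not the jmeter test plan itself, just a
minimal sketch: the user names (user001 to user100), the host and the
password are made up, and the helper names are hypothetical.

```python
import itertools
import poplib


def make_users(count=100):
    # Hypothetical mailbox names; the real test used its own accounts.
    return ["user%03d" % i for i in range(1, count + 1)]


def round_robin(users):
    # Endless iterator that hands out the users in order, then starts over.
    return itertools.cycle(users)


def pop3_login(host, user, password, timeout=10):
    # One POP3 session: log in, ask for the mailbox status, log out.
    conn = poplib.POP3(host, 110, timeout)
    try:
        conn.user(user)
        conn.pass_(password)
        return conn.stat()  # (message count, mailbox size in bytes)
    finally:
        conn.quit()


def run(host, password, logins=300):
    # Spread the logins evenly over all mailboxes (assumed shared password).
    picker = round_robin(make_users())
    return [pop3_login(host, next(picker), password) for _ in range(logins)]
```

Calling run() with the address of the load balancer would then walk
through all mailboxes in turn, three times over with the default of
300 logins.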

The log simply stops saying anything, especially about pop3 connections.

But I think I have solved the current problem:

The problem appears to be related to the Berkeley DB environment in
/var/lib/imap/db . Although I don't use that format, as all of the
databases are configured using skiplist, cyrus still initializes the
environment on every connection. If some other process has locked
the database, it does a futex call on the mmap region and goes to
sleep. The problem seems to be that on GFS it never gets notified
that the database has been unlocked, so it stays asleep forever.

I discovered this today when I systematically strace'd (with strace
-p, which apparently sends some kind of signal to the process) all
pop3d processes on one of the hanging machines, and suddenly
everything started to work again, including the hanging node. A closer
examination showed that each process then repeats the futex call, gets
the lock and simply continues.
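
To spot such stuck processes without attaching strace to each one,
something like the following Python sketch might help. It assumes a
Linux /proc filesystem and a kernel that exposes /proc/<pid>/wchan;
the helper names are made up, and only the process name pop3d comes
from the setup described here.

```python
import os


def parse_stat(line):
    # /proc/<pid>/stat looks like "1234 (pop3d) S 1 ...".
    # The command name may contain parentheses, so split on the last ')'.
    start = line.index("(")
    end = line.rindex(")")
    pid = int(line[:start])
    comm = line[start + 1:end]
    state = line[end + 1:].split()[0]
    return pid, comm, state


def read_file(path):
    with open(path) as handle:
        return handle.read().strip()


def sleeping_processes(name="pop3d", proc="/proc"):
    # Yield (pid, wchan) for every sleeping process with the given name.
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            stat = read_file(os.path.join(proc, entry, "stat"))
            pid, comm, state = parse_stat(stat)
            if comm == name and state == "S":
                yield pid, read_file(os.path.join(proc, entry, "wchan"))
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or file not readable
```

A process parked in the futex call should show up in state S, and
depending on the kernel its wchan entry may name a futex-related
function. Sleeping is also the normal state of an idle pop3d, so the
output only narrows down which processes are worth a closer look.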

My solution for now is to disable bdb at compile time, and
everything works like a charm, though the performance is not yet
where I expected it to be. But I'm not sure whether that is due to my
load balancing test or the cluster config :)

> While I support POP3, I encourage all of my users to use IMAP, so I don't
> have many problems with pop3d (except for brute force attacks, which I
> solved by increasing sasl_minimum_layer, but that won't help you here).
Not an option here, the customer I'm building the cluster for supports
only POP3 to the outside, and IMAP only for the internal webmail app.
So POP3 HAS to run ;)

Regards,
Jens

