Frontend becomes unusable when one backend dies

Frank Richter frank.richter at hrz.tu-chemnitz.de
Fri Jul 6 08:47:46 EDT 2007


> I assume that the frontend is also the murder master?

No, the mupdate master is on another server.

> Other pertinent questions are, "How many connections do you normally get?"

Right now this many daemons are running:
    Imapd/s: 327/458 Pop3d/s: 10/5
They may get up to 700 ...
The limit on imapd / imaps in cyrus.conf is 1000 each.
We get 6 - 10 IMAP/POP logins per second.

OK, counting: 5 connections per second to the failing backend, each held 
for the 10-second timeout, ties up about 50 daemons ... there should be 
enough free daemons available.
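
To spell out that counting, here is a rough sketch. It assumes logins arrive at a steady rate and that every login aimed at the dead backend holds one frontend daemon for exactly client_timeout and then lets it go. The function name is made up for illustration; it is not part of Cyrus.

```python
def daemons_held(logins_per_second: float, hold_seconds: float) -> float:
    """Steady-state number of daemons tied up: arrival rate times hold time."""
    return logins_per_second * hold_seconds


# Figures from this message: 5 logins/s hit the dead backend,
# client_timeout is 10 s, the cyrus.conf limit is 1000 per service.
blocked = daemons_held(5, 10)
limit = 1000

print(blocked, "daemons waiting on connect, limit is", limit)
```

Under these assumptions only about 50 daemons are stuck at any moment, far below the limit, so the connect timeout alone should not exhaust the frontend.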

> and "In what way is the backend 'down'?"

Machine down ... crash, not reachable (not just imapd down).
Only one IPv4 address involved.

> The client_timeout sets an alarm that interrupts the connect system  
> call.  The frontend may try more than once, tho, if the backend has  
> more than one address, e.g., IPv4 and IPv6.  Are you observing imapd  
> and pop3d on the frontend that are waiting more than client_timeout  
> to give up?  As they fail to connect, clients should log:
> 
> 	connect(server-name) failed: timed out

Yes, it's here: connect(server-name) failed: Connection timed out
Ok, this is exactly 10 seconds (== client_timeout) after the login 
message.

> Another possibility is that the clients are poorly behaved, e.g.,  
> they are getting an error on SELECT, but don't close the connection  
> to the frontend.  The client_timeout is just controlling the timeout  
> of the connect from the frontend to the backend, not the duration of  
> the life of the frontend processes.  For imapd, the timeout is 30  
> *minutes*.

Oh, well. I will get a "chance" to test this situation again next week 
(kernel upgrade on backends).
I'll try increasing maxchild in cyrus.conf and/or decreasing 
client_timeout in imapd.conf.
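
If the poorly-behaved-client theory from the quoted reply holds, a daemon that failed to reach the backend is not released after 10 seconds but stays around until the client goes away or the 30-minute timeout fires. A rough sketch of how fast the slots would then run out, under assumptions of mine: I take the larger of the two IMAP counts above (458) as the number of busy daemons, put all 5 failing logins per second on that one service, and assume no stuck daemon is released in the meantime. The function name is hypothetical.

```python
def seconds_until_full(maxchild: int, busy: int, stuck_per_second: float) -> float:
    """Time until every slot is taken if stuck daemons are never released."""
    free = maxchild - busy
    return free / stuck_per_second


# maxchild limit of 1000 from cyrus.conf, 458 daemons already busy (assumed),
# 5 logins per second ending up stuck on the dead backend.
t = seconds_until_full(maxchild=1000, busy=458, stuck_per_second=5)

print(f"{t:.0f} s, about {t / 60:.1f} min")
```

That is on the order of a couple of minutes, which would fit the "unavailable after some minutes" I observed, but it is only an estimate and does not prove that this is the cause.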

Thanks for your ideas and help!

- Frank

> > I'm running a simple standard murder environment (v2.3.8 on Linux
> > x86_64) - one frontend, two backends. If one of the two backends is
> > down, then the frontend (and the whole system) becomes unavailable
> > after some minutes:
> >
> > On the frontend, imapd and pop3d processes are started until the
> > maximum count (maxchild in cyrus.conf) is reached. It seems that
> > they're still trying to reach the "dead" backend server.
> >
> > Maybe it is a timeout issue - I left client_timeout at its default
> > (10 seconds) in imapd.conf. Is this value relevant for this behavior?

-- 
Email: Frank.Richter at hrz.tu-chemnitz.de  http://www.tu-chemnitz.de/~fri/
Work:  Computing Services,  Chemnitz University of Technology,  Germany


More information about the Info-cyrus mailing list