Setting TCP keepalive for Cyrus daemons
brong at fastmail.fm
Fri Feb 12 16:46:32 EST 2010
On Fri, Feb 12, 2010 at 09:45:02AM -0600, Gary Mills wrote:
> I've been noticing idle pop3d processes on our Cyrus front end server
> for some time. These should be transient. One that was several days
> old had an established TCP connection to a wireless client that had
> disappeared. Presumably the client never closed the connection.
> Setting TCP keepalive on the file descriptor should permit the kernel
> to close the connection in this situation. Does this sound
> reasonable? Perhaps it's already been addressed in a later Cyrus
> version. We're running cyrus-imapd-2.3.8.
> I'm willing to add a `keepalive' option to Cyrus master along with the
> setsockopt() system call to enable that setting. This option could be
> added to the cyrus.conf file for any services that could benefit from
> it. Would this be a reasonable addition to Cyrus?
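As a rough illustration of the setsockopt() call proposed above, here is a minimal sketch. It is written in Python for brevity and is not Cyrus code: `socket.setsockopt()` passes its arguments straight through to the C system call, and the helper names `enable_keepalive` and `keepalive_enabled` are made up for this example.

```python
import socket


def enable_keepalive(sock: socket.socket) -> None:
    """Turn on SO_KEEPALIVE so the kernel probes the peer when the link is idle."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)


def keepalive_enabled(sock: socket.socket) -> bool:
    """Report whether SO_KEEPALIVE is currently set on the socket."""
    # Some platforms return a non-zero flag value other than 1, so compare to 0.
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0


if __name__ == "__main__":
    # Hypothetical usage: a server would call this on each accepted connection.
    conn = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(conn)
    print("keepalive on:", keepalive_enabled(conn))
    conn.close()
```

With only this flag set, the probe timing still comes from the system-wide kernel settings, so a dead client may not be noticed for hours.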
This is something I've been wanting to look into as well. If a machine
crashes, it can leave sync_clients hanging forever, thinking they are
still talking to the replica. They keep holding the lock, so
replication doesn't start back up. Quite annoying.
Is there any reason to make it an option rather than just always having
it on? Or at least, any reason not to make it the default? If it's a
good idea, it SHOULD be the default. I'm strongly against requiring
hundreds of lines of config file just to get sane defaults!
Then again, RFC 1122 (section 4.2.3.6) says:
    Implementors MAY include "keep-alives" in their TCP
    implementations, although this practice is not universally
    accepted. If keep-alives are included, the application MUST
    be able to turn them on or off for each TCP connection, and
    they MUST default to off.
    Keep-alive packets MUST only be sent when no data or
    acknowledgement packets have been received for the
    connection within an interval. This interval MUST be
    configurable and MUST default to no less than two hours.
Sounds like it does need to be configurable! Also, the default interval
is crazy high for a local reliable network. I think I'd prefer
about 5 minutes rather than 2 hours!
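To make the 5-minutes-versus-2-hours point concrete, here is a hedged sketch of per-socket tuning, again in Python rather than Cyrus's C. It assumes the Linux option names `TCP_KEEPIDLE`, `TCP_KEEPINTVL` and `TCP_KEEPCNT`; other platforms may lack some of them, so the code skips any that are missing. Only the 300-second idle time comes from the message above. The 30-second probe interval and 5 probes are illustrative guesses, and both function names are hypothetical.

```python
import socket


def set_keepalive_timing(sock, idle=300, interval=30, probes=5):
    """Enable keepalive and override the kernel's timing for this socket only.

    Returns a dict of the options that were actually applied.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied = {}
    wanted = (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between unanswered probes
        ("TCP_KEEPCNT", probes),      # unanswered probes before giving up
    )
    for name, value in wanted:
        opt = getattr(socket, name, None)  # not every platform defines these
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
            applied[name] = value
    return applied


def worst_case_detection_seconds(idle, interval, probes):
    """Approximate time for the kernel to declare a silent peer dead."""
    return idle + interval * probes


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print("applied:", set_keepalive_timing(s))
    print("tuned:", worst_case_detection_seconds(300, 30, 5), "seconds")
    # Stock Linux settings: 7200 s idle, 75 s interval, 9 probes.
    print("stock Linux:", worst_case_detection_seconds(7200, 75, 9), "seconds")
    s.close()
```

Setting the values per socket leaves the system-wide defaults alone, which keeps the RFC's two-hour default intact for everything else on the machine.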