Setting TCP keepalive for Cyrus daemons

Bron Gondwana brong at fastmail.fm
Fri Feb 12 16:46:32 EST 2010


On Fri, Feb 12, 2010 at 09:45:02AM -0600, Gary Mills wrote:
> I've been noticing idle pop3d processes on our Cyrus front end server
> for some time.  These should be transient.  One that was several days
> old had an established TCP connection to a wireless client that had
> disappeared.  Presumably the client never closed the connection.
> 
> Setting TCP keepalive on the file descriptor should permit the kernel
> to close the connection in this situation.  Does this sound
> reasonable?  Perhaps it's already been addressed in a later Cyrus
> version.  We're running cyrus-imapd-2.3.8.
> 
> I'm willing to add a `keepalive' option to Cyrus master along with the
> setsockopt() system call to enable that setting.  This option could be
> added to the cyrus.conf file for any services that could benefit from
> it. Would this be a reasonable addition to Cyrus?

This is something I've been wanting to look into as well.  If a machine
crashes, it can leave sync_clients hanging forever, thinking they are
still talking to the replica - meaning that they hold the lock and
replication doesn't start back up.  Quite annoying.
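
For reference, a minimal sketch of the setsockopt() call being discussed.
The helper name enable_keepalive() is made up for illustration and is not
existing Cyrus code; it only shows how SO_KEEPALIVE gets switched on for
an already-open socket descriptor.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper (not part of Cyrus): ask the kernel to send TCP
 * keepalive probes on fd.  Returns 0 on success, -1 on failure with
 * errno set by setsockopt(). */
static int enable_keepalive(int fd)
{
    int on = 1;

    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
}
```

With only this set, the probe timing comes from the system-wide kernel
settings, so a dead peer is still only noticed after the OS default idle
period.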

Is there any reason to make it an option rather than just always having
it on?  Or at least, why not make it the default?  If it's a good idea, it
SHOULD be the default.  I'm strongly against requiring hundreds of lines of
config file just to get sane defaults!

Hmm... oh:

http://tools.ietf.org/html/rfc1122#page-101

            Implementors MAY include "keep-alives" in their TCP
            implementations, although this practice is not universally
            accepted.  If keep-alives are included, the application MUST
            be able to turn them on or off for each TCP connection, and
            they MUST default to off.

            Keep-alive packets MUST only be sent when no data or
            acknowledgement packets have been received for the
            connection within an interval.  This interval MUST be
            configurable and MUST default to no less than two hours.

Sounds like it does need to be configurable!  Also, the default interval
is crazy high for a local reliable network.  I think I'd prefer
about 5 minutes rather than 2 hours!
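
A rough sketch of how a shorter per-connection interval could be set,
assuming a platform that provides the TCP_KEEPIDLE, TCP_KEEPINTVL and
TCP_KEEPCNT socket options (Linux does; other systems may use different
names or only a system-wide tunable, hence the #ifdef guards).  The
function name and parameters are invented for illustration and are not
Cyrus code.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Hypothetical helper (not part of Cyrus): enable keepalive on fd and,
 * where the platform supports it, override the probe timing for this
 * one connection.  Returns 0 on success, -1 on failure (errno set). */
static int set_keepalive_timing(int fd, int idle_secs,
                                int interval_secs, int probes)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;

#ifdef TCP_KEEPIDLE
    /* Seconds of idle time before the first probe is sent. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_secs, sizeof(idle_secs)) < 0)
        return -1;
#endif
#ifdef TCP_KEEPINTVL
    /* Seconds between successive probes. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL,
                   &interval_secs, sizeof(interval_secs)) < 0)
        return -1;
#endif
#ifdef TCP_KEEPCNT
    /* Unanswered probes allowed before the connection is dropped. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT,
                   &probes, sizeof(probes)) < 0)
        return -1;
#endif
    return 0;
}
```

Calling set_keepalive_timing(fd, 300, 30, 5) would start probing after
5 minutes of silence and, on a system honouring all three options, give
up on a dead peer after roughly 300 + 5 * 30 = 450 seconds.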

Bron.

