[RFC PATCH v2] imapd.c: imapoptions: implement idle timeout

Tue Sep 20 00:30:22 EDT 2016

On Tue, Sep 20, 2016, at 01:30 PM, Robert Mueller via Cyrus-devel wrote:
> Is there a reason to have a separate idle timeout separate to the
> standard inactivity timeout?
> 
>           timeout: 32
>               The length of the IMAP server's inactivity autologout
>               timer, in minutes.  The minimum value is 30, the default. 
>               The default is 32 to allow a bit of leeway for clients
>               that try to NOOP every 30 minutes.

The timeout controlled by the "timeout" setting is checked when reading
from the stream (lib/prot.c).  This is fine in the command loop, because
when we're not doing anything else, we're blocked in a read waiting for
the next command.

During IDLE, though, we're not blocked on a read; we're blocked on a
select (in idle_wait()).  We don't even try to read from the client
until the select tells us it's readable.  If the client has dropped out,
select never tells us it's readable, so we never try to read, so the
timeout never gets checked, and we stay there indefinitely.

> The minimum value is 30, the default. 
> The default is 32 

Ummmmmmmmmm... documentation bug!

> Anyway I'm not sure why this is required now. I'm pretty sure (from
> memory) in older versions of cyrus, the IDLE timeout was the standard
> inactivity timeout (30 minutes default), and after that time we'd always
> drop the connection. IDLE RFC even mentions that clients should DONE and
> re-IDLE every 29 minutes.

Looks like older versions of Cyrus (gitk guesses 2.4.2 and earlier?)
handled IDLE by blocking in a read (waiting for the "DONE"
continuation), and so they would have been subject to the "timeout"
setting.  Communication between idled and imapd was handled with
signals.

This was changed to use a Unix socket rather than signals for IPC
(https://git.io/viyqN), and so imapd switched from blocking on the
client socket to blocking on a select of the client socket and the idled
IPC socket.  That is when this edge case turned up.

So now I'm thinking: instead of adding a new "imapidletimeout" setting,
why not make exactly the same logical changes but using the existing
"timeout" setting?  In that case I can legitimately call it a bug fix
and backport it to 2.5.  Disadvantage: "timeout" is in minutes, which
gets in the way of test automation.

Cheers,

ellie

On Tue, Sep 20, 2016, at 01:30 PM, Robert Mueller via Cyrus-devel wrote:
> 
> > If the client's connection has dropped out, no data will ever appear on
> > the socket, so select will never flag it as readable, so we will never
> > try to read from it, so we will never receive the read error even though
> > tcp_keepalive detected the dropout.  And if this client was idling with
> > a low-activity mailbox selected (such as Drafts or Sent), it might be a
> > very long time before any activity prompts us to write to the socket, so
> > we also don't receive the write error.  And so even though the socket
> > itself knows there's no connection anymore thanks to tcp_keepalive, we
> > don't know that, because we haven't tried to interact with it.  And so
> > the connection/process doesn't get cleaned up.
> 
> This is actually one of the main points of tcp_keepalive. If you're
> trying to read() from a socket where the other end has "died", the read
> blocks *forever*.
> 
> However with tcp_keepalive enabled, the server eventually pings the
> other side, and after it doesn't get a response (after the configured
> backoff and retry times), it flags the connection as EOF, causing the
> read() to return with 0 bytes as per standard EOF/shutdown semantics.
> 
> Anyway I'm not sure why this is required now. I'm pretty sure (from
> memory) in older versions of cyrus, the IDLE timeout was the standard
> inactivity timeout (30 minutes default), and after that time we'd always
> drop the connection. IDLE RFC even mentions that clients should DONE and
> re-IDLE every 29 minutes.
> 
> https://tools.ietf.org/html/rfc3501
> 
> 5.4.    Autologout Timer
> 
>    If a server has an inactivity autologout timer, the duration of that
>    timer MUST be at least 30 minutes.  The receipt of ANY command from
>    the client during that interval SHOULD suffice to reset the
>    autologout timer.
> 
> 
> https://tools.ietf.org/html/rfc2177
> 
>    The server MAY consider a client inactive if it has an IDLE command
>    running, and if such a server has an inactivity timeout it MAY log
>    the client off implicitly at the end of its timeout period.  Because
>    of that, clients using IDLE are advised to terminate the IDLE and
>    re-issue it at least every 29 minutes to avoid being logged off.
>    This still allows a client to receive immediate mailbox updates even
>    though it need only "poll" at half hour intervals.
> 
> 
> Is there a reason to have a separate idle timeout separate to the
> standard inactivity timeout?
> 
>           timeout: 32
>               The length of the IMAP server's inactivity autologout
>               timer, in minutes.  The minimum value is 30, the default. 
>               The default is 32 to allow a bit of leeway for clients
>               that try to NOOP every 30 minutes.
> 
> -- 
> Rob Mueller
> robm at fastmail.fm