One more attempt: stuck processes

Gary Mills mills at cc.umanitoba.ca
Fri Nov 16 09:00:07 EST 2007


On Fri, Nov 16, 2007 at 01:54:24PM +0100, Alain Spineux wrote:
> On Nov 16, 2007 12:36 PM, Sebastian Hagedorn <Hagedorn at uni-koeln.de> wrote:
> > --On 16. November 2007 11:27:09 +0100 Sebastian Hagedorn
> > <hagedorn at uni-koeln.de> wrote:
> >
> > 1. In the absence of the SO_KEEPALIVE option it is entirely possible that a
> > TCP connection remains ESTABLISHED even when the other side has gone.
> 
> I said that a socket should time out, but this is true only when the
> protocol (TCP here) requires a response (usually an ACK here) or at
> connection establishment. Otherwise it should stay open indefinitely
> until something happens. A router doing NAT can drop a connection that
> is too old, because it has to maintain a NAT table and clean it up from
> time to time; this is where "KEEPALIVE" becomes useful.
> 
> > This may not be a solution to this particular problem, but it made me
> > wonder why Cyrus does *not* use SO_KEEPALIVE. Is there a downside to it?
> 
> Cyrus already has a built-in timeout, so it seems a little conflicting to
> actively maintain the connection until Cyrus drops it itself!
> It is the client's job to actively maintain the connection, if it
> wants it!

This timeout doesn't work in some cases.  We have lots of POP sessions
that never terminate.  About 30 out of 40 are in that state now.
Here's an example:

   cyrus 13075   708  0   Oct 14 ?        0:05 pop3d -s
   cyrus 20023   708  0   Oct 29 ?        0:00 pop3d
   cyrus 24560   708  1 07:38:03 ?        0:03 pop3d
   cyrus   631   708  0   Oct 03 ?        0:10 pop3d -s
   cyrus  6786   708  0   Oct 20 ?        0:00 pop3d -s
   cyrus 29777   708  0 07:45:03 ?        0:00 pop3d
   cyrus 19175   708  0   Oct 04 ?        0:04 pop3d -s

One I just checked is stuck in a read():

  # truss -p 19175
  read(0, 0x002316F0, 5)          (sleeping...)
  # pfiles 19175
  19175:  pop3d -s
    Current rlimit: 256 file descriptors
     0: S_IFSOCK mode:0666 dev:271,0 ino:25813 uid:0 gid:0 size:0
        O_RDWR
          sockname: AF_INET 130.179.16.23  port: 995
          peername: AF_INET 130.179.188.184  port: 51771
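
On the SO_KEEPALIVE question above, here is a rough sketch only, in Python
rather than the C that Cyrus is written in, of what enabling keepalive on
an accepted socket and bounding a blocking read could look like. The helper
names enable_keepalive and read_with_timeout and the probe values are made
up for illustration and are not taken from Cyrus. The TCP_KEEP* tuning
options exist only on some platforms, so the sketch skips any that are
missing.

```python
import socket


def enable_keepalive(sock, idle=600, interval=60, count=5):
    """Turn on SO_KEEPALIVE and, where available, tune the probe timing.

    idle/interval/count are illustrative values, not recommendations.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These per-socket knobs are platform-specific; skip any that this
    # platform's socket module does not expose.
    for name, value in (
        ("TCP_KEEPIDLE", idle),       # seconds of silence before first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", count),       # failed probes before giving up
    ):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)


def read_with_timeout(sock, nbytes, timeout):
    """Return received bytes, b'' on EOF, or None if the read timed out."""
    sock.settimeout(timeout)
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        return None
```

With keepalive on, the kernel probes an idle peer and, once the probes go
unanswered, a read blocked like the one in 19175 should fail with an error
instead of sleeping indefinitely. The timeout on the read is a separate,
application-level safeguard.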

-- 
-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-


More information about the Info-cyrus mailing list