Experiment to test TCP keepalive for pop3d proxies

Gary Mills mills at cc.umanitoba.ca
Fri May 28 12:42:45 EDT 2010

On Thu, May 27, 2010 at 08:52:18PM -0400, Wesley Craig wrote:
> For your problem, pop3d calls:
> 	prot_settimeout(popd_in, popd_timeout);
> just below where you've inserted the KEEPALIVE.  What do you have  
> poptimeout set to?

It's set to 20 minutes.
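
For readers following along, the KEEPALIVE change under test comes down to
setting SO_KEEPALIVE on the client socket. Below is a minimal sketch of
that idea, not the actual patch: the helper names enable_keepalive() and
keepalive_enabled() are made up for illustration and are not Cyrus
functions. Probe timing is left at the system defaults.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper (not a Cyrus function): ask the kernel to send
 * TCP keepalive probes on this socket, so that a vanished peer is
 * eventually detected and the connection torn down.
 * Returns 0 on success, -1 on error (errno set by setsockopt). */
int enable_keepalive(int fd)
{
    int on = 1;

    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
}

/* Hypothetical helper: report whether keepalive is enabled.
 * Returns 1 if enabled, 0 if disabled, -1 on error. */
int keepalive_enabled(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &val, &len) < 0)
        return -1;
    return val != 0;
}
```

With only SO_KEEPALIVE set, how long an idle connection survives depends
on the operating system's keepalive interval, which is commonly on the
order of hours unless tuned.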

> I wouldn't be surprised by a bug in prot, BTW.  I'm pretty sure I've
> seen a case where select() is used to implement the timeout, but once
> there's *some* input, read() is called in blocking mode (wrong!).
> In any case, if you can get a traceback with gdb for some hung
> pop3d's, I'm sure we can pinpoint the issue.
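
To illustrate the kind of bug described in the quoted text: if select()
guards only the first read() and later reads on the same request go
straight to a blocking read(), a client that sends part of its input and
then goes silent can hang the process forever. The sketch below shows one
way to avoid that by waiting in select() before every read(). It is not
the prot code; read_with_timeout() and read_full_with_timeout() are
hypothetical names, and the return convention (-2 for timeout) is an
assumption made for this example.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of Cyrus prot: wait up to timeout_sec
 * for fd to become readable, then do a single read().
 * Returns bytes read, 0 on EOF, -1 on error, -2 on timeout.
 * Note: the timeout restarts if a signal interrupts the wait. */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_sec)
{
    for (;;) {
        fd_set rfds;
        struct timeval tv;
        ssize_t n;
        int r;

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;

        r = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (r < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (r == 0)
            return -2;              /* nothing arrived in time */

        n = read(fd, buf, len);
        if (n < 0 && errno == EINTR)
            continue;
        return n;
    }
}

/* Read exactly len bytes, applying the timeout before EVERY read(),
 * not just the first one. Returns len on success, a short count if
 * EOF arrives first, -1 on error, -2 on timeout. */
ssize_t read_full_with_timeout(int fd, void *buf, size_t len, int timeout_sec)
{
    size_t got = 0;

    while (got < len) {
        ssize_t n = read_with_timeout(fd, (char *)buf + got,
                                      len - got, timeout_sec);
        if (n == -2)
            return -2;
        if (n < 0)
            return -1;
        if (n == 0)
            break;                  /* EOF */
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

The point of the second function is that a partial read never leads to an
unguarded read(): each iteration goes back through select(), so the
timeout still applies after some input has arrived.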

Of course, we don't have those anymore.  TCP keepalive cleans them up.
I did find one that's been present for some time and seems idle.  It
does have an established TCP connection to a remote client.  Here's a
stack trace on it:

# pstack 5432
5432:   pop3d -s
 feb1a465 pollsys  (8042da0, 2, 8042e60, 0)
 feac3b8a pselect  (d, 8042eb4, feb90318, feb90318, 8042e60, 0) + 18e
 feac3e80 select   (d, 8042eb4, 0, 0, 8042ea8, 0) + 82
 0808981b prot_select (815d168, ffffffff, 8043f94, 0, 8042ea8, 0) + 44b
 0805e4ee proxy_check_input (815d168, 81a7228, 819e520, 81a3d60, 81a7700, 0) + 5e
 0805dd74 bitpipe  (8145c08, 0, feb921ec, 0, 8044fed, 8044fed) + c4
 0805acb7 cmdloop  (8135594, 8138980, 14, 2, 32203832, 31312e34) + 27
 0805aa53 service_main (2, 8142a50, 8047db4) + 473
 08062c13 main     (2, 8047da8, 8047db4, 8047d9c) + a83
 08059bbd _start   (2, 8047e54, 8047e5a, 0, 8047e5d, 8047e68) + 7d

It just vanished.  TCP keepalive must have gotten it.

-Gary Mills-        -Unix Group-        -Computer and Network Services-

