One more attempt: stuck processes

Ken Murchison murch at andrew.cmu.edu
Fri Nov 16 15:54:50 EST 2007


Sebastian Hagedorn wrote:
> -- Ken Murchison <murch at andrew.cmu.edu> is rumored to have mumbled on 
> 16. November 2007 12:58:49 -0500 regarding Re: One more attempt: stuck 
> processes:
> 
>>> So should I add a call to ERR_get_error()?
>>
>>
>> Not yet.  I'm assuming that none of these processes has hung.  We're
>> getting an I/O error most likely because the client has closed the
>> connection immediately after sending QUIT.  This is harmless.
>>
>> What I really want to see is if we get a SSL_ERROR_WANT_xxx return code
>> when we're hung.
> 
> I have both good and bad news. Bad news first: there is a stuck process 
> that did *not* log that SSL_read line.
> 
> Good news: the binary I'm running now isn't stripped and has much more 
> detail in its stack trace:
> 
> (gdb) bt
> #0  0x003d341e in __read_nocancel () from /lib/tls/libc.so.6
> #1  0x0017f2f7 in BIO_new_socket () from /lib/libcrypto.so.4
> #2  0x0017d2b2 in BIO_read () from /lib/libcrypto.so.4
> #3  0x0089ec30 in ssl3_alert_code () from /lib/libssl.so.4
> #4  0x0089edcc in ssl3_alert_code () from /lib/libssl.so.4
> #5  0x008a00cf in ssl3_read_bytes () from /lib/libssl.so.4
> #6  0x008a0ffc in ssl3_get_message () from /lib/libssl.so.4
> #7  0x00896cab in ssl3_accept () from /lib/libssl.so.4
> #8  0x00896944 in ssl3_accept () from /lib/libssl.so.4
> #9  0x008a5c9a in SSL_accept () from /lib/libssl.so.4
> #10 0x008a180d in ssl23_get_client_hello () from /lib/libssl.so.4
> #11 0x008a1712 in ssl23_accept () from /lib/libssl.so.4
> #12 0x008a5c9a in SSL_accept () from /lib/libssl.so.4
> #13 0x08052cf3 in tls_start_servertls (readfd=-512, writefd=-512, 
> layerbits=0xbfff7a78, authid=0xbfff7a74,
>    ret=0x810bca0) at tls.c:803
> #14 0x0804e553 in cmd_starttls (pop3s=1) at pop3d.c:1076
> #15 0x0804d5cc in service_main (argc=2, argv=0x9e84008, envp=0xbfff9850) 
> at pop3d.c:537
> #16 0x08054550 in main (argc=2, argv=0x9, envp=0xbfff9850) at service.c:539

That's exactly what Gary is seeing.  It's blocking in SSL_accept(). 
Apparently the client connects to port 995, and then either sends 
nothing, or goes away and leaves the socket open.

I've reproduced the former by telnetting to port 995 and doing nothing. 
I have been unable to reproduce the latter because as soon as I quit the 
telnet session or kill() the telnet process, pop3d exits gracefully.
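
For illustration only, here is a minimal sketch of one way a server could notice a silent client before committing to a blocking handshake: wait on the socket with poll() and a timeout, and only call SSL_accept() once the client has actually sent something. This is not the Cyrus code and not a proposed patch. The helper names wait_for_client_data() and demo_silent_client(), and the 200 ms timeout, are assumptions made up for the example, and a socketpair() stands in for the port 995 connection.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of Cyrus): wait up to timeout_ms for the
 * peer to send its first bytes.
 * Returns 1 if the descriptor is readable (data, EOF, or a pending error
 * that a subsequent read would report), 0 on timeout, -1 on poll() failure.
 */
static int wait_for_client_data(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int r;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Retry if a signal interrupts poll(). Note that this simple loop
     * restarts the full timeout on each retry. */
    do {
        r = poll(&pfd, 1, timeout_ms);
    } while (r == -1 && errno == EINTR);

    if (r <= 0)
        return r;       /* 0 = timed out, -1 = error */
    return 1;
}

/*
 * Hypothetical demo: sv[0] plays the server side of the connection and
 * sv[1] plays the client.  If client_sends is zero the client connects
 * and then says nothing, which is the case reproduced with telnet above.
 * Returns the result of wait_for_client_data(), or -1 on setup failure.
 */
static int demo_silent_client(int client_sends)
{
    int sv[2];
    int ready;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    if (client_sends && write(sv[1], "\x16", 1) != 1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    ready = wait_for_client_data(sv[0], 200);   /* 200 ms: arbitrary */

    close(sv[0]);
    close(sv[1]);
    return ready;
}
```

With a silent client, demo_silent_client(0) times out instead of blocking indefinitely; with a client that sends a byte, demo_silent_client(1) reports the socket as readable. A real server would still need a timeout covering the rest of the handshake, since a client can also stall after sending a partial hello.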


-- 
Kenneth Murchison
Systems Programmer
Project Cyrus Developer/Maintainer
Carnegie Mellon University

