One more attempt: stuck processes

Alain Spineux aspineux at gmail.com
Fri Nov 16 12:24:16 EST 2007


Hi

Can I summarize the problem like this:

The server is blocked in a read, waiting for the client's next command
(this is normal; 99% of the processes are in this state). But the
autologout procedure is not working!

This means the SIGALRM that should wake the process up either never
arrives or is not handled properly. A simple call to sleep() or signal()
could disturb this. If this happens only when using SSL, maybe the
problem is there, and the alarm should be re-armed somewhere.

This may be of no use now, but the files in $cyrus_imap/proc/* contain
the user and the selected mailbox for each of these processes. This could
be useful to find out whether it is always the same user at the origin of
the problem, for example because he is using an old Outlook or something.
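
A small sketch of how such a file could be picked apart. I am assuming here
that each proc file holds a single line of tab-separated fields (client host,
then user, then mailbox), with the later fields missing when nobody is logged
in or no mailbox is selected; please check this against your installation.
The struct proc_entry and parse_proc_line() names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fields of one proc file line (layout is an assumption, see above). */
struct proc_entry {
    const char *host;
    const char *user;
    const char *mailbox;
};

/*
 * Split a line in place on tabs into at most three fields.
 * Returns the number of fields found (0 to 3); missing fields stay NULL.
 */
int parse_proc_line(char *line, struct proc_entry *e)
{
    const char **fields[3];
    char *p = line;
    int n = 0;

    assert(line != NULL && e != NULL);

    fields[0] = &e->host;
    fields[1] = &e->user;
    fields[2] = &e->mailbox;
    e->host = e->user = e->mailbox = NULL;

    line[strcspn(line, "\r\n")] = '\0';   /* drop the trailing newline */
    if (*line == '\0')
        return 0;

    while (n < 3) {
        char *tab;

        *fields[n++] = p;
        tab = strchr(p, '\t');
        if (tab == NULL || n == 3)        /* last field keeps any extra tabs */
            break;
        *tab = '\0';
        p = tab + 1;
    }
    return n;
}
```

Reading the file named after the PID of each stuck process and printing the
user field would show quickly whether one account keeps coming back.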

Regards

On Nov 16, 2007 5:33 PM, Ken Murchison <murch at andrew.cmu.edu> wrote:
>
> Sebastian Hagedorn wrote:
> > --On 16. November 2007 09:37:42 -0600 Gary Mills <mills at cc.umanitoba.ca>
> > wrote:
> >
> >>> Could you get a stack trace? If you have gdb you just call it with "gdb
> >>> -p  19175". Then you can do "bt" at the prompt. I forget how to do it
> >>> with  Sun's debugger.
> >>
> >> Easy:
> >>
> >>   # pstack 19175
> >>   19175:  pop3d -s
> >>    fef9f810 read     (0, 2316f0, 5)
> >>    fee1d2d0 read     (0, 2316f0, 5, 0, 0, 0) + 5c
> >>    ff06bb38 sock_read (1f0860, 2316f0, 5, 5, 0, 0) + 24
> >>    ff068af0 BIO_read (1f0860, 2316f0, 5, fef98b84, 0, 0) + 110
> >>    ff278488 ssl3_read_n (212798, 5, 8805, 0, 0, 203958) + 174
> >>    ff2785fc ssl3_get_record (204ce0, 8000, 8400, 4400, f1, f0) + d0
> >>    ff279424 ssl3_read_bytes (212798, 1000, 2000, 4, 0, ffbfe731) + 228
> >>    ff27a99c ssl3_get_message (ff2a259c, 2070a0, 0, ffffffff, 19000, ffbfe7a0) + d0
> >>    ff27042c ssl3_accept (2150, 2160, 2180, 21e0, 2110, 2122) + 904
> >>    ff27bd2c ssl23_get_client_hello (2316fb, 6c, 6c, 4, fffffe79, 0) + 828
> >>    ff27b4b4 ssl23_accept (4000, 2000, 0, 0, 0, 0) + 2a4
> >>    00032d00 tls_start_servertls (0, 1, ffbfee24, ffbfee20, 1849a8, ff00) + 198
> >>    0002c504 cmd_starttls (1, 1fd8b8, 0, 0, 0, 0) + 184
> >>    0002a638 service_main (2, 192198, ffbffce0, 1aec4, 3508c, 1) + 488
> >>    00035250 main     (2, ffbffcd4, ffbffce0, 17c400, 0, 0) + e18
> >>    00029298 _start   (0, 0, 0, 0, 0, 0) + 108
> >
> > Thanks, that looks like progress! That stack trace looks similar enough
> > to the one I'm seeing that I could imagine that it is what I *should* be
> > seeing if the stack weren't garbled. Of course that's only speculation.
> >
> > Ken, is it possible that the call to SSL_accept() in
> > tls_start_servertls() blocks when the client goes away? That could
> > explain everything ....
>
> Yes.  Gary's problem might be very similar to yours, depending on what I
> see from the patch that I just sent you.
>
> --
> Kenneth Murchison
> Systems Programmer
> Project Cyrus Developer/Maintainer
> Carnegie Mellon University
> ----
>
> Cyrus Home Page: http://cyrusimap.web.cmu.edu/
> Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
> List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
>



-- 
Alain Spineux
aspineux gmail com
May the sources be with you

