terminated abnormally

Patrick Radtke phr2101 at columbia.edu
Thu Mar 30 14:16:04 EST 2006


On Mar 23, 2006, at 8:27 AM, Ken Murchison wrote:

> Patrick Radtke wrote:
>> Every day we get a number of 'terminated abnormally' messages on  
>> our frontends.
>> When they happen just occasionally then everything still seems to  
>> run fine.
>> However sometimes, there will be a large number of them in a short  
>> time span and this is a sure sign that people are having trouble  
>> logging in or are being disconnected.
>> here is examples from the log
>> Mar 22 10:50:07 tofu master[19506]: service pop3s pid 18311 in  
>> READY state: terminated abnormally
>> Mar 22 10:50:10 tofu master[19506]: service imaps pid 18128 in  
>> READY state: terminated abnormally
>> Mar 22 10:50:20 tofu master[19506]: service imaps pid 18348 in  
>> READY state: terminated abnormally
>> Mar 22 10:50:23 tofu master[19506]: service imaps pid 13063 in  
>> READY state: terminated abnormally
>> Mar 22 10:50:32 tofu master[19506]: service pop3s pid 18404 in  
>> READY state: terminated abnormally
>> Mar 22 10:50:33 tofu master[19506]: service pop3s pid 18218 in  
>> READY state: terminated abnormally
>> Mar 22 11:01:42 valoney master[2581]: service imaps pid 25617 in  
>> READY state: terminated abnormally
>> Mar 22 11:01:46 valoney master[2581]: service pop3s pid 25682 in  
>> READY state: terminated abnormally
>> Mar 22 11:01:47 valoney master[2581]: service imaps pid 25903 in  
>> READY state: terminated abnormally
>> searching on just a fail process id shows
>> Mar 22 11:01:38 valoney imaps[25903]: imaps failed: [local]
>> Mar 22 11:01:47 valoney master[2581]: service imaps pid 25903 in  
>> READY state: terminated abnormally
>> Mar 22 11:01:51 valoney master[2581]: service imaps pid 25903:  
>> while trying to process message 0x2: not registered yet
>> Mar 22 11:01:52 valoney master[2581]: service imaps pid 25903 in  
>> UNKNOWN state: processing message 0x2
>> ar 22 11:01:04 valoney pop3s[25440]: starttls: TLSv1 with cipher  
>> RC4-MD5 (128/128 bits new) no authentication
>> Mar 22 11:01:20 valoney pop3s[25440]: login:  
>> pool-141-155-147-109.ny5030.east.verizon.net [141.155.147.109]  
>> jg2378 plaintext+TLS User logged in
>> Mar 22 11:01:21 valoney pop3s[25440]: failed to bind to address  
>> 128.59.48.36: Cannot assign requested address
>> Mar 22 11:01:21 valoney pop3s[25440]: Doing a peer verify
>> Mar 22 11:01:21 valoney pop3s[25440]: Doing a peer verify
>> Mar 22 11:01:21 valoney pop3s[25440]: starttls: TLSv1 with cipher  
>> AES256-SHA (256/256 bits new) no authentication
>> Mar 22 11:01:36 valoney pop3s[25440]: pop3s failed: [local]
>> Mar 22 11:01:37 valoney pop3s[25440]: Fatal error:  
>> tls_start_servertls() failed
>> Mar 22 11:01:41 valoney master[2581]: service pop3s pid 25440 in  
>> READY state: terminated abnormally
>> Mar 22 11:01:52 valoney master[2581]: service pop3s pid 25440:  
>> while trying to process message 0x2: not registered yet
>> Mar 22 11:01:52 valoney master[2581]: service pop3s pid 25440 in  
>> UNKNOWN state: processing message 0x2
>
> I've never seen behavior like this before.  You have a pop3d which  
> looks like it successfully does TLS+PLAIN, but then attempts TLS  
> again.  What I don't know is if the client has disconnected in  
> between, or if there is some kind of race condition.


Ken,  is the second TLS just proxyd connecting to a backend?


Syslog on debug (with today's CVS cyrus 2.3 )shows:
Mar 30 13:15:05 hotdog imap[18602]: starttls: TLSv1 with cipher  
AES256-SHA (256/256 bits new) no authentication
Mar 30 13:15:07 hotdog imap[18602]: ptload(): fetched cache record  
(phr2101)(mark 1143741894, current 1143742507, limit 1143738907)
Mar 30 13:15:07 hotdog imap[18602]: ptload returning data
Mar 30 13:15:07 hotdog imap[18602]: canonified phr2101 -> phr2101
Mar 30 13:15:07 hotdog imap[18602]: login: asiago.cc.columbia.edu  
[128.59.59.74] phr2101 PLAIN+TLS User logged in
Mar 30 13:15:07 hotdog imap[18602]: Doing a peer verify
Mar 30 13:15:07 hotdog imap[18602]: Doing a peer verify
Mar 30 13:15:07 hotdog imap[18602]: received server certificate
Mar 30 13:15:07 hotdog imap[18602]: starttls: TLSv1 with cipher  
AES256-SHA (256/256 bits new) no authentication
Mar 30 13:15:07 hotdog imap[18602]: ptload(): fetched cache record  
(phr2101)(mark 1143741894, current 1143742507, limit 1143738907)
Mar 30 13:15:07 hotdog imap[18602]: ptload returning data
Mar 30 13:15:07 hotdog imap[18602]: canonified phr2101 -> phr2101
Mar 30 13:15:07 hotdog imap[18602]: ptload(): fetched cache record  
(murder)(mark 1143741895, current 1143742507, limit 1143738907)
Mar 30 13:15:07 hotdog imap[18602]: ptload returning data
Mar 30 13:15:07 hotdog imap[18602]: canonified murder -> murder
Mar 30 13:15:07 hotdog imap[18602]: open: user phr2101 opened INBOX  
on bratwurst.cc.columbia.edu


Proxyd is getting a server certificate which makes me think the 2nd  
tls is just encrypting the frontend to backend communication.  
Additionally it also looking up the 'murder' user which is the proxy  
user we use when frontends log into backends





Increasing the log level to debug shows a lot of
Mar 30 13:20:09 mockmeat master[23360]: service imaps pid 26574 in  
BUSY state: terminated abnormally

is that normal? (These ones are being terminated while BUSY instead  
of when READY). These logs are from 2.2.12

The BUSY ones look like
Mar 30 13:20:05 mockmeat imaps[26574]: accepted connection
Mar 30 13:20:05 mockmeat imaps[26574]: imaps failed: [local]
Mar 30 13:20:08 mockmeat master[23360]: process 26574 exited, status 75
Mar 30 13:20:09 mockmeat master[23360]: service imaps pid 26574 in  
BUSY state: terminated abnormally


here is a different  (different from my initial email) looking READY  
error
Mar 30 13:19:01 mockmeat master[1494]: about to exec /usr/bin/proxyd
Mar 30 13:19:02 mockmeat imaps[1494]: executed
Mar 30 13:19:55 mockmeat imaps[1494]: accepted connection
Mar 30 13:19:56 mockmeat imaps[1494]: imaps failed: [local]
Mar 30 13:19:59 mockmeat master[23360]: process 1494 exited, status 75
Mar 30 13:20:01 mockmeat master[23360]: service imaps pid 1494 in  
READY state: terminated abnormally

In these one is looks like something is failing in  
'tls_start_servertls'.

Does [local] just mean the client's host couldn't be determined?




Other strangeness I saw was
Mar 30 13:19:39 mockmeat master[23360]: service sieve pid 1496:  
changing from service pop3 due to received message


Mar 30 13:19:01 mockmeat master[1496]: about to exec /usr/bin/timsieved
Mar 30 13:19:02 mockmeat sieve[1496]: executed
Mar 30 13:19:03 mockmeat sieve[1496]: warning: can't get client  
address: Connection reset by peer
Mar 30 13:19:04 mockmeat sieve[1496]: accepted connection
Mar 30 13:19:05 mockmeat master[23360]: process 1496 exited, status 0
Mar 30 13:19:39 mockmeat master[23360]: service sieve pid 1496:  
changing from service pop3 due to received message

What's going on here? Master looks up the pid in ctable and finds  
that the service id doesn't match what it is expecting?

and finally

Mar 30 13:19:36 mockmeat master[23360]: imaps has -12 workers?!?

Looking through the code these last 2 sections are marked ' /*  
paranoia */'
I imagine this signifies non-normal behavior... but I'm not sure  
about causes.


Our load balancer (CSM) probes the frontends every 6 seconds or so to  
confirm that they are responding. The CSM makes a connection and then  
disconnects right away. Could this be triggering something in Cyrus?  
Its hard to think so since the errors generally are all during peak  
usage, and much less in the evening even though the CSM is always  
probing at the same rate.


-Patrick





More information about the Info-cyrus mailing list