cyrus master fails with status 71

Eric Cunningham eric at whoi.edu
Mon Nov 7 13:01:29 EST 2016


Hi Ellie, we've been running with your patch since Oct 25 and so far 
haven't encountered any issues with imapd exiting.  But now that imapd 
has had a chance to run uninterrupted for almost 2 weeks, the number of 
imapd processes/connections has climbed steadily every day.  This 
morning it was near 16,000, on a system with a total of 1,400 accounts.

To try to control this growth, per 
https://cyrusimap.org/imap/faqs/o-toomanyprocesses.html I've made the 
following changes:

To cyrus.conf, added "-U 50" option to the SERVICES section for imapd:

   imap	cmd="imapd -U 50"	listen="imap"	prefork=60
   imaps	cmd="imapd -s -U 50" 	listen="imaps" 	prefork=150


To imapd.conf, added the following tcp_keepalive options:

   tcp_keepalive: 1
   tcp_keepalive_cnt: 1
   tcp_keepalive_idle: 30
   tcp_keepalive_intvl: 900
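
As a rough check on what these values ask for: assuming the three settings map onto the TCP_KEEPIDLE, TCP_KEEPCNT and TCP_KEEPINTVL socket options (the names that appear in the log messages), and assuming the usual keepalive behaviour (first probe after the idle time, then one interval per unanswered probe), the time to notice a dead peer is roughly idle + cnt * intvl. The function name below is made up for illustration, and exact timing varies by OS.

```python
def dead_peer_detection_seconds(idle, cnt, intvl):
    """Approximate seconds until an unresponsive peer is dropped.

    idle:  seconds of silence before the first keepalive probe
    cnt:   unanswered probes tolerated before giving up
    intvl: seconds between probes
    """
    return idle + cnt * intvl


if __name__ == "__main__":
    # Values taken from the imapd.conf excerpt above.
    print(dead_peer_detection_seconds(idle=30, cnt=1, intvl=900))  # prints 930
```

With these values that comes to about 930 seconds (15.5 minutes). Keepalive probes only clear out connections whose peer has stopped responding; an idle client that is still reachable answers the probe and stays connected.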

After restarting imapd, the following are now being logged repeatedly:

Nov  7 10:18:19 imap1 lmtpunix[58768]: unable to setsocketopt(TCP_KEEPCNT): Invalid argument
Nov  7 10:18:19 imap1 lmtpunix[58768]: unable to setsocketopt(TCP_KEEPIDLE): Invalid argument
Nov  7 10:18:19 imap1 lmtpunix[58768]: unable to setsocketopt(TCP_KEEPINTVL): Invalid argument
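
One possible explanation for these messages, offered as a guess rather than a confirmed diagnosis: imapd.conf is read by all Cyrus services, not only imapd, and lmtpunix presumably listens on a UNIX domain socket, where TCP-level options do not apply. The sketch below is not Cyrus code; it only shows that asking for TCP keepalive options on an AF_UNIX socket is rejected by the OS. The helper name is hypothetical, and the exact error text differs by platform (the log shows "Invalid argument" on this FreeBSD system).

```python
import socket


def try_tcp_keepalive(sock, idle=30, cnt=1, intvl=900):
    """Try to set the three TCP keepalive options on sock.

    Returns a dict mapping option name to None on success,
    or to an error string on failure.
    """
    results = {}
    options = (("TCP_KEEPIDLE", idle), ("TCP_KEEPCNT", cnt), ("TCP_KEEPINTVL", intvl))
    for name, value in options:
        opt = getattr(socket, name, None)
        if opt is None:
            # Not every platform exposes every constant.
            results[name] = "option not available on this platform"
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
            results[name] = None
        except OSError as exc:
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as tcp_sock:
        print("AF_INET:", try_tcp_keepalive(tcp_sock))
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as unix_sock:
        print("AF_UNIX:", try_tcp_keepalive(unix_sock))
```

If that reading is right, the messages would be noise from a service that cannot use the options, not a sign that the settings failed for imapd itself.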


So, a couple of questions for the list:

Are such numbers of imapd processes to be expected?
Why is lmtpunix complaining about options passed to imapd?

Thank you.

-Eric


On 10/25/16 8:23 PM, ellie timoney via Info-cyrus wrote:
> Hi Eric,
>
> Patch attached.  I'd appreciate if you could advise whether this helps.
> Though I guess you won't be able to tell for a couple of weeks.
>
> If it doesn't cause any new problems (I don't expect it to), then it
> will be included in 2.5.11 (whenever that comes out).
>
> Cheers,
>
> ellie
>
> On Wed, Oct 26, 2016, at 10:04 AM, ellie timoney via Info-cyrus wrote:
>>> accept failed: Software caused connection abort
>>
>> Some sleuthing suggests that "Software caused connection abort"
>> corresponds with "ECONNABORTED".
>>
>> The man page on my system for accept(2) unhelpfully defines this as:
>>
>>>        ECONNABORTED
>>>               A connection has been aborted.
>>
>> But some digging around online suggests that this situation occurs when
>> a client connects, but subsequently disconnects  (RST) before the server
>> gets around to accept()ing the connection.  When the server does
>> eventually accept(), the accept() fails with this error.
>>
>> Which sounds to me like we want to treat ECONNABORTED similarly to
>> EAGAIN, not as a fatal OS error.  I'll have a patch up for this shortly.
>>
>> Cheers,
>>
>> ellie
>>
>> On Wed, Oct 26, 2016, at 09:27 AM, Eric Cunningham via Info-cyrus wrote:
>>> Having repeatedly experienced the "status 71" issue, I've been
>>> incrementally bumping its value up.  It's currently set to 32768 (!)
>>> and that value was in place when it most recently failed.
>>>
>>>
>>> On 10/25/16 4:21 PM, Shawn Bakhtiar via Info-cyrus wrote:
>>>> Hmmmm.. if that’s the case could you be hitting the maximum number
>>>> of accepts??
>>>>
>>>> Check the 11.11.1.2. kern.ipc.soacceptqueue section of the FreeBSD handbook
>>>>
>>>> https://www.freebsd.org/doc/handbook/configtuning-kernel-limits.html
>>>>
>>>> Given the load you described perhaps 128 is just not enough?
>>>>
>>>>
>>>>
>>>>> On Oct 24, 2016, at 1:22 PM, Eric Cunningham via Info-cyrus
>>>>> <info-cyrus at lists.andrew.cmu.edu> wrote:
>>>>>
>>>>>
>>>>>
>>>>> =============================================================
>>>>>  Eric Cunningham
>>>>>  Information Services - http://whoi-it.whoi.edu
>>>>>  Woods Hole Oceanographic Institution - http://www.whoi.edu
>>>>>  Woods Hole, MA  02543-1541     phone: (508) 289-2224
>>>>>  fax: (508) 457-2174           e-mail: ecunningham at whoi.edu
>>>>> =============================================================
>>>>>
>>>>> On 10/24/2016 03:45 PM, Bron Gondwana via Info-cyrus wrote:
>>>>>> On Tue, 25 Oct 2016, at 02:45, Eric Cunningham via Info-cyrus wrote:
>>>>>>> Hi list, we're running cyrus imap 2.5.9 built from the FreeBSD 10-2
>>>>>>> (release-p7) ports tree.
>>>>>>>
>>>>>>> The cyrus master process is failing periodically (every 1-2 weeks) as
>>>>>>> follows:
>>>>>>>
>>>>>>> Oct 22 07:38:48 imap1 master[7767]: process type:SERVICE name:imaps
>>>>>>> path:/usr/local/cyrus/bin/imapd age:305.215s pid:32760 exited, status 71
>>>>>>> Oct 22 07:38:48 imap1 master[7767]: service imaps/ipv4 pid 32760 in
>>>>>>> READY state: terminated abnormally
>>>>>>> Oct 22 07:38:48 imap1 master[7767]: too many failures for service
>>>>>>> imaps/ipv4, disabling until next SIGHUP
>>>>>>>
>>>>>>> This prevents new connections by clients until cyrus is restarted.  I've
>>>>>>> looked around the web but have not seen this issue reported.
>>>>>>>
>>>>>>> A little background:
>>>>>>>
>>>>>>> Our initial thought on this was that we were running out of listen
>>>>>>> queues so have upped that incrementally from the default of 32 to a
>>>>>>> current setting of 32768 via /usr/local/etc/rc.d/imapd using the -l
>>>>>>> option, with increased kern.ipc.soacceptqueue set to 32768, but that
>>>>>>> hasn't helped.  Sometimes the "status 71" occurs during periods of light
>>>>>>> use during off hours, like on Saturday mornings.
>>>>>>>
>>>>>>> We have ~1400 imap accounts, though the number of imapd processes hovers
>>>>>>> around 3,000-4,000.  There have been spikes observed as high as 12,000
>>>>>>> imapd processes.  In that particular case, 1 user had 2 imap clients
>>>>>>> accounting for near 6,000 of those connections.  We've attempted to
>>>>>>> limit these high numbers using the following imapd.conf values:
>>>>>>>
>>>>>>> maxlogins_per_host: 50
>>>>>>> maxlogins_per_user: 30
>>>>>>> tcp_keepalive: 1
>>>>>>> tcp_keepalive_cnt: 1
>>>>>>> tcp_keepalive_idle: 30
>>>>>>> tcp_keepalive_intvl: 900
>>>>>>>
>>>>>>> However, it seems that once these were reached, no new connections were
>>>>>>> permitted and resulted in all manner of user complaints about not being
>>>>>>> able to get at their email.
>>>>>>>
>>>>>>> Any ideas on this "status 71" issue?  Could an upgrade to 2.5.10
>>>>>>> possibly address this?  Thanks!
>>>>>>
>>>>>> https://www.freebsd.org/cgi/man.cgi?query=sysexits
>>>>>>
>>>>>>     EX_OSERR (71)   An operating system error has been detected.
>>>>>>                     This is intended to be used for such things as
>>>>>>                     ``cannot fork'', ``cannot create pipe'', or the
>>>>>>                     like.  It includes things like getuid returning
>>>>>>                     a user that does not exist in the passwd file.
>>>>>>
>>>>>> So the question is: what failed?  Is there anything earlier in the
>>>>>> log to suggest what the imapd was doing when it died?
>>>>>>
>>>>>> Bron.
>>>>>>
>>>>>
>>>>> Using the example I posted, I traced back imaps process id 32760 and
>>>>> found only this:
>>>>>
>>>>> Oct 22 07:38:48 imap1 imaps[32760]: accept failed: Software caused
>>>>> connection abort
>>>>>
>>>>> -Eric
>>>>>
>>>>> ----
>>>>> Cyrus Home Page: http://www.cyrusimap.org/
>>>>> List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
>>>>> To Unsubscribe:
>>>>> https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus
>>>>
>>>>
>>>>
>>>
>>
>>
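
For readers following the quoted thread, the idea of treating ECONNABORTED like EAGAIN instead of as a fatal OS error can be sketched roughly as below. This is not the Cyrus source and not the patch that was attached; the function names and the exact set of retryable errors are assumptions made for illustration, and the loop assumes a blocking listener.

```python
import errno
import os

# Errors treated as "try again" and not as fatal. ECONNABORTED is the
# addition discussed in the thread; the rest of the set is an assumption.
TRANSIENT_ACCEPT_ERRORS = {
    errno.EAGAIN,
    errno.EWOULDBLOCK,
    errno.EINTR,
    errno.ECONNABORTED,
}


def classify_accept_error(err):
    """Return 'retry' for transient accept() failures, 'fatal' otherwise."""
    return "retry" if err in TRANSIENT_ACCEPT_ERRORS else "fatal"


def accept_one(listener):
    """Accept one connection, retrying on transient errors.

    Any other error exits with EX_OSERR (71 in sysexits), which is the
    status reported in the master log lines quoted above.
    """
    while True:
        try:
            return listener.accept()
        except OSError as exc:
            if classify_accept_error(exc.errno) == "retry":
                # The client went away before we accepted; wait for the next one.
                continue
            raise SystemExit(os.EX_OSERR)
```

With this shape, a client that connects and resets before the server reaches accept() costs one loop iteration, and the service no longer exits and counts toward the "too many failures" threshold.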


