Problems with Cyrus IMAP 2.2.5 on Solaris 9 on production server

Shawn Sivy ssivy at tcnj.edu
Wed Jun 2 10:54:19 EDT 2004


Igor Brezac wrote:

>On Wed, 2 Jun 2004, Shawn Sivy wrote:
>
>  
>
>>Igor Brezac wrote:
>>
>>    
>>
>>>On Tue, 1 Jun 2004, Shawn Sivy wrote:
>>>
>>>
>>>
>>>      
>>>
>>>>I'm having all kinds of problems with Cyrus IMAP 2.2.5 on Solaris 9.
>>>>System I/O errors, imap processing dying, IOERRORs.
>>>>
>>>>Does anyone have suggestions on what could be the cause?  Has anyone
>>>>gotten version 2.2.5 working on Solaris (SPARC) 9?
>>>>
>>>> -Shawn
>>>>
>>>>May 30 17:52:57 cyrus lmtpunix[14954]: [ID 240394 local6.error] IOERROR:
>>>>opening quota file /var/imap/quota/m/user.macey2: Too many open files
>>>>May 30 17:52:57 cyrus lmtpunix[14954]: [ID 335833 local6.error] DBERROR:
>>>>error fetching user.macey2: cyrusdb error
>>>>May 30 17:52:57 cyrus lmtpunix[14954]: [ID 860734 local6.debug]
>>>>verify_user(user.macey2) failed: System I/O error
>>>>May 30 17:53:20 cyrus lmtpunix[14954]: [ID 240394 local6.error] IOERROR:
>>>>opening quota file /var/imap/quota/s/user.sdhugg: Too many open files
>>>>May 30 17:53:20 cyrus lmtpunix[14954]: [ID 335833 local6.error] DBERROR:
>>>>error fetching user.sdhugg: cyrusdb error
>>>>May 30 17:53:20 cyrus lmtpunix[14954]: [ID 860734 local6.debug]
>>>>verify_user(user.sdhugg) failed: System I/O error
>>>>May 30 17:53:40 cyrus lmtpunix[14954]: [ID 240394 local6.error] IOERROR:
>>>>opening quota file /var/imap/quota/s/user.samuel2: Too many open files
>>>>May 30 17:53:40 cyrus lmtpunix[14954]: [ID 335833 local6.error] DBERROR:
>>>>error fetching user.samuel2: cyrusdb error
>>>>May 30 17:53:40 cyrus lmtpunix[14954]: [ID 860734 local6.debug]
>>>>verify_user(user.samuel2) failed: System I/O error
>>>>May 30 17:54:05 cyrus lmtpunix[14954]: [ID 240394 local6.error] IOERROR:
>>>>opening quota file /var/imap/quota/b/user.balaisi2: Too many open files
>>>>May 30 17:54:05 cyrus lmtpunix[14954]: [ID 335833 local6.error] DBERROR:
>>>>error fetching user.balaisi2: cyrusdb error
>>>>May 30 17:54:05 cyrus lmtpunix[14954]: [ID 860734 local6.debug]
>>>>verify_user(user.balaisi2) failed: System I/O error
>>>>
>>>>Jun  1 08:42:19 cyrus master[21185]: [ID 970914 local6.error] process
>>>>21886 exited, signaled to death by 11
>>>>Jun  1 08:43:26 cyrus master[21185]: [ID 970914 local6.error] process
>>>>20660 exited, signaled to death by 11
>>>>Jun  1 08:43:43 cyrus master[21185]: [ID 970914 local6.error] process
>>>>20133 exited, signaled to death by 11
>>>>Jun  1 08:47:02 cyrus master[21185]: [ID 970914 local6.error] process
>>>>23236 exited, signaled to death by 11
>>>>Jun  1 08:47:20 cyrus master[21185]: [ID 970914 local6.error] process
>>>>23972 exited, signaled to death by 11
>>>>Jun  1 08:47:58 cyrus master[21185]: [ID 970914 local6.error] process
>>>>23751 exited, signaled to death by 11
>>>>Jun  1 08:48:05 cyrus master[21185]: [ID 970914 local6.error] process
>>>>21258 exited, signaled to death by 11
>>>>Jun  1 08:49:53 cyrus master[21185]: [ID 970914 local6.error] process
>>>>19939 exited, signaled to death by 11
>>>>Jun  1 08:51:27 cyrus master[21185]: [ID 970914 local6.error] process
>>>>24807 exited, signaled to death by 11
>>>>Jun  1 08:51:37 cyrus master[21185]: [ID 970914 local6.error] process
>>>>23457 exited, signaled to death by 11
>>>>
>>>>
>>>>
>>>>        
>>>>
>>>This looks like a Berkeley DB problem, although it could be a file
>>>descriptor leak somewhere. Have you applied the Sleepycat 4.2.52 patches
>>>(there are two of them, although the first is more important)?  Does
>>>checkpointing of the cyrus databases complete successfully (look for
>>>ctl_cyrusdb in the syslog)?  I start master from 'configdirectory';
>>>otherwise Berkeley checkpointing does not work (neither does duplicate
>>>db expiration).
>>>
>>>
>>>
>>>      
>>>
>>I have both patches installed for db 4.2.52.  Below are the messages
>>from the log regarding ctl_cyrusdb.  Looks like it completed fine.  I
>>took your suggestion of starting master from /var/imap.
>>
>>Jun  2 08:47:05 cyrus master[17927]: [ID 392559 local6.debug] about to
>>exec /local/cyrus/bin/ctl_cyrusdb
>>Jun  2 08:47:06 cyrus ctl_cyrusdb[17927]: [ID 702911 local6.notice]
>>recovering cyrus databases
>>Jun  2 08:47:09 cyrus ctl_cyrusdb[17927]: [ID 275131 local6.notice]
>>skiplist: recovered /var/imap/mailboxes.db (85526 records, 6516904
>>bytes) in 3 seconds
>>Jun  2 08:47:13 cyrus ctl_cyrusdb[17927]: [ID 127214 local6.notice] done
>>recovering cyrus databases
>>Jun  2 08:47:13 cyrus master[17935]: [ID 392559 local6.debug] about to
>>exec /local/cyrus/bin/ctl_cyrusdb
>>Jun  2 08:47:13 cyrus ctl_cyrusdb[17935]: [ID 702911 local6.notice]
>>checkpointing cyrus databases
>>Jun  2 08:47:13 cyrus ctl_cyrusdb[17935]: [ID 578205 local6.debug]
>>archiving database file: /var/imap/mailboxes.db
>>Jun  2 08:47:13 cyrus ctl_cyrusdb[17935]: [ID 448116 local6.debug]
>>archiving log file: /var/imap/db/log.0000000008
>>Jun  2 08:47:13 cyrus ctl_cyrusdb[17935]: [ID 578205 local6.debug]
>>archiving database file: /var/imap/annotations.db
>>Jun  2 08:47:13 cyrus ctl_cyrusdb[17935]: [ID 448116 local6.debug]
>>archiving log file: /var/imap/db/log.0000000008
>>Jun  2 08:47:13 cyrus ctl_cyrusdb[17935]: [ID 127214 local6.notice] done
>>checkpointing cyrus databases
>>
>>
>>    
>>
>
>Have things improved since you restarted master?
>
>You can use pfiles and pmap (and lsof) to check for open files and memory
>usage.  Try running pfiles against a running imapd process and see if the
>number of open files increases.
>
>What does ulimit -a say?
>
>  
>
cyrus# ulimit -a
time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         unlimited
stack(kbytes)        8192
coredump(blocks)     unlimited
nofiles(descriptors) 1024
vmemory(kbytes)      unlimited

I played around with using skiplist for quota_db last night, but
couldn't get it to work; setting and getting quotas through cyradm just
hung.  After I moved back to quotalegacy and copied back the previous
quota files, I haven't seen the System I/O errors or the "Too many open
files" messages since.  However, if it is a descriptor leak as Ken
suggested, it may take a while to show itself.
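
Since a slow descriptor leak only shows up over time, one option is to
sample the open-descriptor count of a long-running process at intervals
and check whether it only ever grows. The following is a minimal sketch,
not part of Cyrus. It assumes the platform exposes per-process
descriptors as entries under /proc/<pid>/fd (verify this on your system)
and that you have permission to read that directory. The function names
and the min_growth threshold are hypothetical, chosen for illustration.

```python
import os


def count_open_fds(pid, proc_root="/proc"):
    """Return the number of descriptors listed under <proc_root>/<pid>/fd.

    Assumption: the OS exposes one directory entry per open descriptor.
    """
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    return len(os.listdir(fd_dir))


def looks_like_leak(samples, min_growth=1):
    """Heuristic: counts never decrease and grow by at least min_growth.

    `samples` is a list of descriptor counts taken at successive times.
    This is only a hint; a busy process can legitimately hold more files.
    """
    if len(samples) < 2:
        return False
    never_drops = all(b >= a for a, b in zip(samples, samples[1:]))
    return never_drops and (samples[-1] - samples[0]) >= min_growth
```

In practice you would call count_open_fds() on the same imapd or lmtpd
pid every few minutes, append each result to a list, and pass that list
to looks_like_leak().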

I ran pmap and pfiles against an imapd and an lmtpd process.  I'm not
sure exactly how to interpret the output, but nothing seemed excessive.
The one thing I noticed is that the rlimit of the process is 256 even
though the system-wide default limit, not the max (set in /etc/system),
is 1024.  I thought I saw that 256 was a 32-bit app limit (at least on
Solaris), but I'm not sure.  The cyrus code is currently all compiled as
32-bit, not 64-bit.
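
To see which descriptor limit a process is actually running under, it
can help to read both the soft (current) and hard (maximum) values of
RLIMIT_NOFILE from inside a process. The sketch below uses Python's
standard resource module purely for illustration; Cyrus itself is C and
this is not how Cyrus sets its limits. The helper names are
hypothetical, and raise_soft_limit() only raises the soft limit, never
beyond the hard limit.

```python
import resource


def nofile_limits():
    """Return (soft, hard) limits on open descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_limit(target):
    """Raise the soft descriptor limit toward `target`, capped at the hard limit.

    Returns the soft limit in effect afterwards. Never lowers the limit.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    # Nothing to do if the soft limit is already unlimited or high enough.
    if soft == resource.RLIM_INFINITY or target <= soft:
        return soft
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target
```

A process that inherits a soft limit of 256 with a hard limit of 1024
would report both numbers from nofile_limits(), which makes it clear
whether the lower value comes from the inherited soft limit or from
somewhere else.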

Are you running version 2.2.5 or an earlier version (like 2.2.3) of
imap?  I may move back to 2.2.3 later today at Ken's suggestion.

  -Shawn




---
Cyrus Home Page: http://asg.web.cmu.edu/cyrus
Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html



