Problems with Cyrus IMAP 2.2.5 on Solaris 9 on production server
Shawn Sivy
ssivy at tcnj.edu
Wed Jun 2 10:54:19 EDT 2004
Igor Brezac wrote:
>On Wed, 2 Jun 2004, Shawn Sivy wrote:
>
>>Igor Brezac wrote:
>>
>>>On Tue, 1 Jun 2004, Shawn Sivy wrote:
>>>
>>>>I'm having all kinds of problems with Cyrus IMAP 2.2.5 on Solaris 9:
>>>>System I/O errors, imap processes dying, IOERRORs.
>>>>
>>>>Does anyone have suggestions on what could be the cause? Has anyone
>>>>gotten version 2.2.5 working on Solaris (SPARC) 9?
>>>>
>>>> -Shawn
>>>>
>>>>May 30 17:52:57 cyrus lmtpunix[14954]: [ID 240394 local6.error] IOERROR:
>>>>opening quota file /var/imap/quota/m/user.macey2: Too many open files
>>>>May 30 17:52:57 cyrus lmtpunix[14954]: [ID 335833 local6.error] DBERROR:
>>>>error fetching user.macey2: cyrusdb error
>>>>May 30 17:52:57 cyrus lmtpunix[14954]: [ID 860734 local6.debug]
>>>>verify_user(user.macey2) failed: System I/O error
>>>>May 30 17:53:20 cyrus lmtpunix[14954]: [ID 240394 local6.error] IOERROR:
>>>>opening quota file /var/imap/quota/s/user.sdhugg: Too many open files
>>>>May 30 17:53:20 cyrus lmtpunix[14954]: [ID 335833 local6.error] DBERROR:
>>>>error fetching user.sdhugg: cyrusdb error
>>>>May 30 17:53:20 cyrus lmtpunix[14954]: [ID 860734 local6.debug]
>>>>verify_user(user.sdhugg) failed: System I/O error
>>>>May 30 17:53:40 cyrus lmtpunix[14954]: [ID 240394 local6.error] IOERROR:
>>>>opening quota file /var/imap/quota/s/user.samuel2: Too many open files
>>>>May 30 17:53:40 cyrus lmtpunix[14954]: [ID 335833 local6.error] DBERROR:
>>>>error fetching user.samuel2: cyrusdb error
>>>>May 30 17:53:40 cyrus lmtpunix[14954]: [ID 860734 local6.debug]
>>>>verify_user(user.samuel2) failed: System I/O error
>>>>May 30 17:54:05 cyrus lmtpunix[14954]: [ID 240394 local6.error] IOERROR:
>>>>opening quota file /var/imap/quota/b/user.balaisi2: Too many open files
>>>>May 30 17:54:05 cyrus lmtpunix[14954]: [ID 335833 local6.error] DBERROR:
>>>>error fetching user.balaisi2: cyrusdb error
>>>>May 30 17:54:05 cyrus lmtpunix[14954]: [ID 860734 local6.debug]
>>>>verify_user(user.balaisi2) failed: System I/O error
>>>>
>>>>Jun 1 08:42:19 cyrus master[21185]: [ID 970914 local6.error] process
>>>>21886 exited, signaled to death by 11
>>>>Jun 1 08:43:26 cyrus master[21185]: [ID 970914 local6.error] process
>>>>20660 exited, signaled to death by 11
>>>>Jun 1 08:43:43 cyrus master[21185]: [ID 970914 local6.error] process
>>>>20133 exited, signaled to death by 11
>>>>Jun 1 08:47:02 cyrus master[21185]: [ID 970914 local6.error] process
>>>>23236 exited, signaled to death by 11
>>>>Jun 1 08:47:20 cyrus master[21185]: [ID 970914 local6.error] process
>>>>23972 exited, signaled to death by 11
>>>>Jun 1 08:47:58 cyrus master[21185]: [ID 970914 local6.error] process
>>>>23751 exited, signaled to death by 11
>>>>Jun 1 08:48:05 cyrus master[21185]: [ID 970914 local6.error] process
>>>>21258 exited, signaled to death by 11
>>>>Jun 1 08:49:53 cyrus master[21185]: [ID 970914 local6.error] process
>>>>19939 exited, signaled to death by 11
>>>>Jun 1 08:51:27 cyrus master[21185]: [ID 970914 local6.error] process
>>>>24807 exited, signaled to death by 11
>>>>Jun 1 08:51:37 cyrus master[21185]: [ID 970914 local6.error] process
>>>>23457 exited, signaled to death by 11
>>>>
>>>This looks like a Berkeley DB problem, although it could be a file
>>>descriptor leak somewhere. Have you applied the Sleepycat 4.2.52 patches
>>>(there are two of them, although the first is more important)? Does
>>>checkpointing of the cyrus databases complete successfully (look for
>>>ctl_cyrusdb in the syslog)? I start master from 'configdirectory';
>>>otherwise Berkeley checkpointing does not work (neither does duplicate
>>>db expiration).
>>>
>>I have both patches installed for db 4.2.52. Below are the messages
>>from the log regarding ctl_cyrusdb. Looks like it completed fine. I
>>took your suggestion of starting master from /var/imap.
>>
>>Jun 2 08:47:05 cyrus master[17927]: [ID 392559 local6.debug] about to
>>exec /local/cyrus/bin/ctl_cyrusdb
>>Jun 2 08:47:06 cyrus ctl_cyrusdb[17927]: [ID 702911 local6.notice]
>>recovering cyrus databases
>>Jun 2 08:47:09 cyrus ctl_cyrusdb[17927]: [ID 275131 local6.notice]
>>skiplist: recovered /var/imap/mailboxes.db (85526 records, 6516904
>>bytes) in 3 seconds
>>Jun 2 08:47:13 cyrus ctl_cyrusdb[17927]: [ID 127214 local6.notice] done
>>recovering cyrus databases
>>Jun 2 08:47:13 cyrus master[17935]: [ID 392559 local6.debug] about to
>>exec /local/cyrus/bin/ctl_cyrusdb
>>Jun 2 08:47:13 cyrus ctl_cyrusdb[17935]: [ID 702911 local6.notice]
>>checkpointing cyrus databases
>>Jun 2 08:47:13 cyrus ctl_cyrusdb[17935]: [ID 578205 local6.debug]
>>archiving database file: /var/imap/mailboxes.db
>>Jun 2 08:47:13 cyrus ctl_cyrusdb[17935]: [ID 448116 local6.debug]
>>archiving log file: /var/imap/db/log.0000000008
>>Jun 2 08:47:13 cyrus ctl_cyrusdb[17935]: [ID 578205 local6.debug]
>>archiving database file: /var/imap/annotations.db
>>Jun 2 08:47:13 cyrus ctl_cyrusdb[17935]: [ID 448116 local6.debug]
>>archiving log file: /var/imap/db/log.0000000008
>>Jun 2 08:47:13 cyrus ctl_cyrusdb[17935]: [ID 127214 local6.notice] done
>>checkpointing cyrus databases
>>
>
>Have things improved since you restarted master?
>
>You can use pfiles and pmap (and lsof) to check for open files and memory
>usage. Try running pfiles against a running imapd process and see whether
>the number of open files increases.
>
>What does ulimit -a say?
>
cyrus# ulimit -a
time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         unlimited
stack(kbytes)        8192
coredump(blocks)     unlimited
nofiles(descriptors) 1024
vmemory(kbytes)      unlimited

I played around with using skiplist for quota_db last night but couldn't
get it to work: setting and getting quotas through cyradm just hung.
Since I moved back to quotalegacy and copied back the previous quota
files, I haven't seen the System I/O errors or the "Too many open files"
messages. However, if it is a descriptor leak as Ken suggested, it may
take a while to show itself.
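
One way to watch for the leak coming back is to tally the "Too many open
files" messages per process from the syslog. The following is only a
minimal Python sketch, not anything shipped with Cyrus: the function name
count_fd_errors and the regular expression are assumptions written against
the log lines quoted earlier in this thread, with each message unwrapped
onto a single line as it would appear in the real log file.

```python
import re
from collections import Counter

# Assumed line shape, taken from the messages quoted in this thread:
# "May 30 17:52:57 cyrus lmtpunix[14954]: [ID 240394 local6.error] IOERROR: ..."
LINE_RE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]{8}\s+\S+\s+"       # month, day, time, host
    r"(?P<svc>[\w.-]+)\[(?P<pid>\d+)\]:\s+"   # service name and pid
    r"(?P<msg>.*)$"                           # rest of the message
)


def count_fd_errors(lines):
    """Count 'Too many open files' messages per (service, pid)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and "Too many open files" in match.group("msg"):
            counts[(match.group("svc"), int(match.group("pid")))] += 1
    return counts


# Two messages from this thread, unwrapped onto one line each.
SAMPLE = [
    "May 30 17:52:57 cyrus lmtpunix[14954]: [ID 240394 local6.error] "
    "IOERROR: opening quota file /var/imap/quota/m/user.macey2: "
    "Too many open files",
    "May 30 17:52:57 cyrus lmtpunix[14954]: [ID 335833 local6.error] "
    "DBERROR: error fetching user.macey2: cyrusdb error",
]

if __name__ == "__main__":
    for (svc, pid), n in count_fd_errors(SAMPLE).items():
        print(svc, pid, n)
```

A single long-lived pid accumulating these messages would be consistent
with a leak in that one process; it does not show where the leak is.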

I ran pmap and pfiles against an imapd and an lmtpd process. I'm not sure
exactly how to interpret the output, but nothing seemed excessive. The
one thing I noticed is that the rlimit of the process is 256 even though
the system-wide default limit (not the max) set in /etc/system is 1024.
I thought I saw that 256 was a 32-bit application limit (at least on
Solaris), but I'm not sure. The cyrus code is currently all compiled as
32-bit, not 64-bit.
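
A process can run with a soft descriptor limit that differs from what a
login shell reports, so it may help to read both the soft and hard values
from inside a process. The following is a minimal Python sketch using the
standard resource module, purely as an illustration. The helper names
nofile_limits, describe, and headroom are assumptions of this example, and
it does not explain why the Cyrus processes here end up at 256.

```python
import resource


def nofile_limits():
    """Return (soft, hard) RLIMIT_NOFILE for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(value):
    """Render a limit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def headroom(open_fds, soft):
    """Descriptors left before the soft limit is hit; None if unlimited."""
    if soft == resource.RLIM_INFINITY:
        return None
    return max(soft - open_fds, 0)


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("soft:", describe(soft), "hard:", describe(hard))
```

Comparing the open-descriptor count that pfiles shows for a process with
its soft limit gives a rough idea of how close that process is to the
"Too many open files" condition.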

Are you running version 2.2.5 or an earlier version (like 2.2.3) of
imap? I may move back to 2.2.3 later today at Ken's suggestion.

-Shawn