CPU usage issue with Append command (cyrus 3.0.6)
andrew-lists at quadrant.net
Wed May 9 12:13:59 EDT 2018
> Date: Wed, 9 May 2018 08:27:33 -0400
> From: Ben Carter <bhc at pitt.edu>
> For starters:
> strace one of the processes & see if it's making any system calls at all.
> pstack a process multiple times to get PC samples.
> gcore a process to get a core dump & take a look at the core dump with gdb.
> If you know that a specific account is likely to have this issue, enable
> IMAP protocol logging for the account. This may tell you something.
> On 05/08/2018 02:04 PM, Andrew Nichols via Info-cyrus wrote:
>> We've recently upgraded to cyrus 3.0.6 and we've been experiencing an issue
>> where imapd will get stuck in an Append and take 100% CPU time. The
>> client doesn't see anything weird. I've tried leaving the processes to
>> see if they resolve, but they never did. Doing a kill -TERM twice will
>> kill the process. I did not notice any issues with cyrus 3.0.5, but
>> we've been trying to get Xapian indexing going, so we moved to 3.0.6.
>> Does anyone have any information on where I can look to rectify this issue?
>> Here's an example of the ps output:
>> 87556 cyrus 1 102 0 163M 24192K CPU7 7 1:56 96.71% imapd: imap: <client
>> hostname> [<clientip>] <username> user.<username>.<folder1> Append (imapd)
>> 89680 cyrus 1 102 0 163M 24280K CPU4 4 3:10 96.22% imapd: imap: <client
>> hostname> [<clientip>] <username> user.<username>.<folder2> Append (imapd)
>> 43932 cyrus 1 102 0 161M 23416K CPU5 5 1:41 96.17% imapd: imap: <client
>> hostname> [<clientip>] <username> user.<username>.<folder3> Append (imapd)
>> 81590 cyrus 1 102 0 163M 24976K CPU6 6 3:14 94.76% imapd: imap: <client
>> hostname> [<clientip>] <username> user.<username>.<folder4> Append (imapd)
I was watching the processes with truss, and they weren’t making any system calls or generating any network traffic. The examples I gave were all from the same user, but multiple users were triggering the same issue. I was able to get some core dumps using gcore, and it looks to be an issue in the new EOF code in 3.0.6. I’ve filed a bug report on GitHub.
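
For anyone who wants to catch these processes as they appear, here is a minimal sketch of picking them out of top-style output before attaching truss or gcore. It assumes each process is on a single unwrapped line with the column layout from the quoted sample; the function name and the 90% threshold are illustrative choices, not anything provided by Cyrus.

```python
import re

# Assumed layout (taken from the quoted sample): PID, user, other columns,
# a CPU percentage such as "96.71%", then the process title.
LINE_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s.*?(\d+(?:\.\d+)?)%\s+(.*)$")


def spinning_appends(lines, min_cpu=90.0):
    """Return PIDs of imapd processes shown in Append at or above min_cpu percent."""
    pids = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # header or otherwise unparseable line
        pid, _user, cpu, title = match.groups()
        if (
            title.startswith("imapd:")
            and "Append" in title
            and float(cpu) >= min_cpu
        ):
            pids.append(int(pid))
    return pids


if __name__ == "__main__":
    sample = [
        "87556 cyrus 1 102 0 163M 24192K CPU7 7 1:56 96.71% imapd: imap: "
        "<client hostname> [<clientip>] <username> "
        "user.<username>.<folder1> Append (imapd)",
    ]
    print(spinning_appends(sample))
```

The returned PIDs can then be passed by hand to truss or gcore, as was done for the processes described above.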
support at quadrant.net