cyr_expire deadlock

Andrew Morgan morgan at orst.edu
Tue May 21 17:14:57 EDT 2013


On Tue, 21 May 2013, Łukasz Michalski wrote:

> Hi,
>
> I am running Cyrus imapd 2.4.11 on a Linux machine.
>
> Today I had a deadlock involving cyr_expire and an imapd process.
>
> imapd was locked on (strace):
>
> fcntl64(17, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}^C <unfinished ...>
>
> where fd=17 is a user index file (lsof):
>
> imapd     32314    cyrus   17u      REG        8,3     30944   10462461 /var/spool/imap/domain/c/cenbench.pl/a/user/arek^dydo/cyrus.index
>
> Unfortunately, I did not check cyr_expire with strace, but lsof showed this:
>
> cyr_expir 24356 cyrus    0u   CHR    1,3      0t0      509 /dev/null
> cyr_expir 24356 cyrus    1u   CHR    1,3      0t0      509 /dev/null
> cyr_expir 24356 cyrus    2u   CHR    1,3      0t0      509 /dev/null
> cyr_expir 24356 cyrus    3u   CHR    1,3      0t0      509 /dev/null
> cyr_expir 24356 cyrus    4u   CHR    1,3      0t0      509 /dev/null
> cyr_expir 24356 cyrus    5u   REG    8,2      144 19196113 /var/lib/imap/annotations.db
> cyr_expir 24356 cyrus    6u   REG    8,2    13300 18911268 /var/lib/imap/mailboxes.db
> cyr_expir 24356 cyrus    7r  FIFO    0,5      0t0  5678136 pipe
> cyr_expir 24356 cyrus    8w  FIFO    0,5      0t0  5678136 pipe
> cyr_expir 24356 cyrus    9r  FIFO    0,5      0t0  5678137 pipe
> cyr_expir 24356 cyrus   10w  FIFO    0,5      0t0  5678137 pipe
> cyr_expir 24356 cyrus   11u   REG    8,2   171032 19196126 /var/lib/imap/deliver.db
> cyr_expir 24356 cyrus   12uR  REG    8,2        0 26961663 /var/lib/imap/lock/domain/c/cenbench.pl/a/user/arek^dydo.lock
> cyr_expir 24356 cyrus   13u   REG    8,3    30944 10462461 /var/spool/imap/domain/c/cenbench.pl/a/user/arek^dydo/cyrus.index
>
> There were 50 imapd processes (my upper limit) blocked on the same file
> and a single cyr_expire. After killing cyr_expire I had to manually kill
> all imapd processes to allow master to spawn new ones.
>
> Note that my Cyrus runs on a really, really slow machine. It is a VM running
> under KVM with I/O throughput varying from 5 to 60 MB/s (as shown by hdparm -t).
>
> Please let me know what I can do to trace it better next time.
>
> Regards,
> Łukasz

You will probably want to run reconstruct on that user's mailbox, because
the cyrus.index file may be corrupted at this point.  Also, you should
upgrade to Cyrus v2.4.17 if you can.  There have been a large number of
bugfixes since your version, one of which may address the cause of your
deadlock.
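
As for tracing it better next time: on Linux, /proc/locks lists the file
locks the kernel currently tracks, and lines marked with "->" are waiters
blocked behind the lock listed above them. Below is a minimal Python
sketch, not part of Cyrus, that reports which PIDs hold and which PIDs
wait on locks for a given inode (lsof shows 10462461 for the cyrus.index
above). The function name find_lock_users is hypothetical, and the field
layout it assumes (id, optional "->", class, mode, access, PID,
major:minor:inode, start, end) should be checked against your own
/proc/locks before you rely on it.

```python
import sys


def find_lock_users(proc_locks_text, inode):
    """Return (holders, waiters): PID lists for locks on the given inode.

    Assumed /proc/locks line layout (verify on your kernel):
      id: [->] class mode access pid major:minor:inode start end
    A "->" after the id marks a process blocked waiting for the lock.
    """
    holders, waiters = [], []
    for line in proc_locks_text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # not a lock entry we understand
        blocked = fields[1] == "->"
        if blocked:
            fields = fields[:1] + fields[2:]  # drop the "->" marker
        dev_parts = fields[5].split(":")
        if len(dev_parts) != 3:
            continue  # unexpected device/inode field
        try:
            pid = int(fields[4])
            file_inode = int(dev_parts[2])
        except ValueError:
            continue  # skip lines that do not parse
        if file_inode == inode:
            (waiters if blocked else holders).append(pid)
    return holders, waiters


# Command-line use: pass the inode number reported by lsof.
if __name__ == "__main__" and len(sys.argv) == 2:
    with open("/proc/locks") as f:
        holders, waiters = find_lock_users(f.read(), int(sys.argv[1]))
    print("holders:", holders)
    print("waiters:", waiters)
```

Run it with the inode as the only argument while the processes are still
hung, and attach strace -p to the cyr_expire PID as well, before killing
anything. If cyr_expire shows up as a holder of one file and a waiter on
another, that points at a lock-ordering problem.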

 	Andy

