cyr_expire deadlock
Andrew Morgan
morgan at orst.edu
Tue May 21 17:14:57 EDT 2013
On Tue, 21 May 2013, Łukasz Michalski wrote:
> Hi,
>
> I am running Cyrus imapd 2.4.11 on a Linux machine.
>
> Today I had a deadlock involving cyr_expire and an imapd process.
>
> imapd was locked on (strace):
>
> fcntl64(17, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}^C
> <unfinished ...>
>
> where fd=17 is a user index file (lsof):
>
> imapd 32314 cyrus 17u REG 8,3 30944 10462461
> /var/spool/imap/domain/c/cenbench.pl/a/user/arek^dydo/cyrus.index
>
> Unfortunately, I did not check cyr_expire with strace, but lsof showed this:
>
> cyr_expir 24356 cyrus 0u CHR 1,3 0t0 509 /dev/null
> cyr_expir 24356 cyrus 1u CHR 1,3 0t0 509 /dev/null
> cyr_expir 24356 cyrus 2u CHR 1,3 0t0 509 /dev/null
> cyr_expir 24356 cyrus 3u CHR 1,3 0t0 509 /dev/null
> cyr_expir 24356 cyrus 4u CHR 1,3 0t0 509 /dev/null
> cyr_expir 24356 cyrus 5u REG 8,2 144 19196113
> /var/lib/imap/annotations.db
> cyr_expir 24356 cyrus 6u REG 8,2 13300 18911268
> /var/lib/imap/mailboxes.db
> cyr_expir 24356 cyrus 7r FIFO 0,5 0t0 5678136 pipe
> cyr_expir 24356 cyrus 8w FIFO 0,5 0t0 5678136 pipe
> cyr_expir 24356 cyrus 9r FIFO 0,5 0t0 5678137 pipe
> cyr_expir 24356 cyrus 10w FIFO 0,5 0t0 5678137 pipe
> cyr_expir 24356 cyrus 11u REG 8,2 171032 19196126
> /var/lib/imap/deliver.db
> cyr_expir 24356 cyrus 12uR REG 8,2 0 26961663
> /var/lib/imap/lock/domain/c/cenbench.pl/a/user/arek^dydo.lock
> cyr_expir 24356 cyrus 13u REG 8,3 30944 10462461
> /var/spool/imap/domain/c/cenbench.pl/a/user/arek^dydo/cyrus.index
>
> There were 50 imapd processes (my upper limit) blocked on the same file
> and a single cyr_expire. After killing cyr_expire I had to manually kill
> all imapd processes to allow master to spawn new ones.
>
> Note that my Cyrus runs on a really, really slow machine. It is a VM running
> under KVM with I/O throughput varying from 5 to 60 MB/s (as shown by hdparm -t).
>
> Please let me know what I can do to trace it better next time.
>
> Regards,
> Łukasz
You will probably want to run reconstruct on that user's mailbox, because
the cyrus.index file may be corrupted at this point. Also, you should
upgrade to Cyrus v2.4.17 if you can. There have been a large number of
bugfixes since your version, one of which may address the cause of your
deadlock.
Andy
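
On the question of tracing this better next time: one option, besides
running strace on cyr_expire itself, is to look at /proc/locks while the
processes are stuck. The sketch below is only an illustration in Python,
not part of Cyrus. It assumes the usual Linux /proc/locks layout (lock id,
an optional "->" for a blocked request, lock type, ADVISORY or MANDATORY,
READ or WRITE, PID, then major:minor:inode with major and minor in hex);
the function names are hypothetical. Given the path to a file such as the
cyrus.index in question, it lists which PIDs hold a lock on that inode
and which are waiting.

```python
import os
import sys
from collections import namedtuple

# One parsed entry from /proc/locks (field names are this sketch's own).
Lock = namedtuple("Lock", "blocked kind mode pid device inode")


def parse_proc_locks(text):
    """Parse the text of /proc/locks into a list of Lock tuples.

    Assumed line layout (Linux):
      "1: POSIX  ADVISORY  WRITE 1234 08:03:5678 0 EOF"
      "1: -> POSIX  ADVISORY  WRITE 4321 08:03:5678 0 EOF"   (blocked waiter)
    """
    locks = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # not a lock line in the assumed layout
        fields = fields[1:]  # drop the "N:" lock id
        blocked = fields[0] == "->"
        if blocked:
            fields = fields[1:]
        kind, _advisory, mode, pid, dev_inode = fields[:5]
        major, minor, inode = dev_inode.split(":")
        device = (int(major, 16), int(minor, 16))
        locks.append(Lock(blocked, kind, mode, int(pid), device, int(inode)))
    return locks


def holders_and_waiters(locks, inode, device=None):
    """Return (holder_pids, waiter_pids) for the given inode.

    If device is given as (major, minor), entries on other devices are
    ignored; otherwise only the inode number is compared.
    """
    holders, waiters = set(), set()
    for lock in locks:
        if lock.inode != inode:
            continue
        if device is not None and lock.device != device:
            continue
        (waiters if lock.blocked else holders).add(lock.pid)
    return sorted(holders), sorted(waiters)


if __name__ == "__main__":
    # Usage: python locks.py /path/to/cyrus.index
    st = os.stat(sys.argv[1])
    with open("/proc/locks") as f:
        parsed = parse_proc_locks(f.read())
    held, waiting = holders_and_waiters(
        parsed, st.st_ino, (os.major(st.st_dev), os.minor(st.st_dev))
    )
    print("holding:", held)
    print("waiting:", waiting)
```

Run as the cyrus user or root while the hang is in progress. If cyr_expire
shows up as the holder and the imapd PIDs as waiters, an strace or gdb
backtrace of cyr_expire would be the next thing to capture.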