do_folders() and do_sync() results in "bad protocol" after a while

Ivan Lezhnjov Jr. ivan.lezhnjov.jr at gmail.com
Mon Jul 25 05:37:47 EDT 2011


Hi everybody,

I've noticed a weird behavior in a two-host setup. Consider the
following situation. I set up two hosts to act as Master and Replica.
I sent a message to Master that pushed it to Replica. I sent two more
messages to Master, and they were synced just fine too. Effectively, I
can continue to send messages in a similar fashion as long as I want
to. But what I did next was swapped the roles of the two hosts so that
Master became Replica and vice versa. I continued to send messages to
a new Master then, and it pushed the changes to a new Replica.
Everything seems to work fine. Or does it?

I decided to make a break and left the two boxes to idle. When I
returned I sent one more message to the new Master and this is what I
saw in logs:

======================== Master
Jul 22 13:24:19 imap2-replica sendmail[3179]: p6MAOJvn003177:
to=<zxy at localhost.localdomain>, delay=00:00:00, xdelay=00:00:00, mail
er=cyrusv2, pri=120338, relay=localhost, dsn=2.0.0, stat=Sent
Jul 22 13:24:20 imap2-replica sync_client[3157]: MAILBOX received NO
response: IMAP_MAILBOX_CRC Checksum Failure
Jul 22 13:24:20 imap2-replica sync_client[3157]: CRC failure on sync
for user.zxy, trying full update
Jul 22 13:24:20 imap2-replica sync_client[3157]: SYNCNOTICE: record
mismatch with replica: user.zxy more recent on replica
Jul 22 13:24:20 imap2-replica sync_client[3157]: SYNCNOTICE: master
uid:7 modseq:10 last_updated:1311327073 internaldate:1311327073
 flags:(NonJunk)
Jul 22 13:24:20 imap2-replica sync_client[3157]: SYNCNOTICE: replica
uid:7 modseq:11 last_updated:1311327590 internaldate:131132707
3 flags:(NonJunk)
Jul 22 13:24:20 imap2-replica sync_client[3157]: SYNCNOTICE: record
mismatch with replica: user.zxy more recent on replica
Jul 22 13:24:20 imap2-replica sync_client[3157]: SYNCNOTICE: master
uid:7 modseq:10 last_updated:1311327073 internaldate:1311327073
 flags:(NonJunk)
Jul 22 13:24:20 imap2-replica sync_client[3157]: SYNCNOTICE: replica
uid:7 modseq:11 last_updated:1311327590 internaldate:131132707
3 flags:(NonJunk)
Jul 22 13:24:20 imap2-replica sync_client[3157]: user.zxy: same
message appears twice 8 9
Jul 22 13:24:20 imap2-replica sync_client[3157]: Unlinking files in
mailbox user.zxy
Jul 22 13:24:20 imap2-replica sync_client[3157]: do_folders(): update
failed: user.zxy 'Bad protocol'
Jul 22 13:24:20 imap2-replica sync_client[3157]: Error in do_sync():
bailing out! Bad protocol
Jul 22 13:24:20 imap2-replica sync_client[3157]: Processing sync log
file /var/lib/imap/sync/log-3157 failed: Bad protocol

======================== Replica
Jul 22 15:40:06 imap2-master syncserver[3782]: executed
Jul 22 15:40:06 imap2-master syncserver[3782]: accepted connection
Jul 22 15:40:06 imap2-master syncserver[3782]: cmdloop(): startup
Jul 22 15:40:06 imap2-master syncserver[3782]: login:
replica.localdomain [10.10.0.129] cyrus PLAIN User logged in
Jul 22 15:40:06 imap2-master syncserver[3782]: IOERROR: failed to
parse /var/spool/imap/sync./3782/5906c967e91e2d9ffec225fdc502ba8d
0d8739cb
Jul 22 15:40:06 imap2-master syncserver[3782]: IOERROR: failed to
append file user.zxy 4


I was able to reproduce the described scenario 4 times in a row. Each
time these errors crop up ONLY after a long enough pause (say 30
minutes) in between the acts of sending messages to the Master host. I
wrestled with this issue throughout the entire Friday last week only
to see no changes at all. So, I had to put this on hold for the
week-end time and resolved to get back to the issue next week, which I
did today.

So, I left the boxes to idle last Friday evening, and when I came this
morning to the office and checked the logs, they didn't have the above
mentioned errors at all. Almost no sync_* related messages at all, in
fact. OK, so I tried to send another new message to the host
configured currently to act as a Master, and surprisingly it was
delivered to the mailbox as well as pushed to the Replica host. I sent
some more messages and they were delivered too, as well as
successfully synced up. So, I left the boxes to idle again for some
30+ minutes to attempt to reproduce the error messages, and sent one
more message after 30+ minutes. This time I've seen a similar error
message:

======================== Master
ler=cyrusv2, pri=120406, relay=localhost, dsn=2.0.0, stat=Sent
Jul 25 12:25:24 imap2-replica sync_client[17397]: MAILBOX received NO
response: IMAP_MAILBOX_CRC Checksum Failure
Jul 25 12:25:24 imap2-replica sync_client[17397]: CRC failure on sync
for user.zxy, trying full update
Jul 25 12:25:24 imap2-replica sync_client[17397]: SYNCNOTICE: record
mismatch with replica: user.zxy more recent on replica
Jul 25 12:25:24 imap2-replica sync_client[17397]: SYNCNOTICE: master
uid:8 modseq:11 last_updated:1311583915 internaldate:131158391
5 flags:(NonJunk)
Jul 25 12:25:24 imap2-replica sync_client[17397]: SYNCNOTICE: replica
uid:8 modseq:12 last_updated:1311584194 internaldate:13115839
15 flags:(NonJunk)
Jul 25 12:25:24 imap2-replica sync_client[17397]: SYNCNOTICE: record
mismatch with replica: user.zxy more recent on replica
Jul 25 12:25:24 imap2-replica sync_client[17397]: SYNCNOTICE: master
uid:8 modseq:11 last_updated:1311583915 internaldate:131158391
5 flags:(NonJunk)
Jul 25 12:25:24 imap2-replica sync_client[17397]: SYNCNOTICE: replica
uid:8 modseq:12 last_updated:1311584194 internaldate:13115839
15 flags:(NonJunk)
Jul 25 12:25:24 imap2-replica sync_client[17397]: user.zxy: same
message appears twice 9 10
Jul 25 12:25:24 imap2-replica sync_client[17397]: Unlinking files in
mailbox user.zxy
Jul 25 12:25:24 imap2-replica sync_client[17397]: do_folders(): update
failed: user.zxy 'Bad protocol'
Jul 25 12:25:24 imap2-replica sync_client[17397]: Error in do_sync():
bailing out! Bad protocol
Jul 25 12:25:24 imap2-replica sync_client[17397]: Processing sync log
file /var/lib/imap/sync/log-17397 failed: Bad protocol

======================== Replica
Jul 25 12:25:20 imap2-master syncserver[16955]: accepted connection
Jul 25 12:25:20 imap2-master syncserver[16955]: cmdloop(): startup
Jul 25 12:25:20 imap2-master syncserver[16955]: login:
replica.localdomain [10.10.0.129] cyrus PLAIN User logged in
Jul 25 12:25:24 imap2-master syncserver[16955]: IOERROR: failed to
parse /var/spool/imap/sync./16955/99c7b97f08d7dc94a5335bf6c32719
2e6185e4e6
Jul 25 12:25:24 imap2-master syncserver[16955]: IOERROR: failed to
append file user.zxy 7

Now, obviously nothing has been changed in the configuration of the
two boxes. They just stood there idling for two days and then they've
started to work again.  Only for a short time.

Can anyone please explain what is going here? Why does it seem
cyrus-imap needs some time to somehow fix things on its own?

This is my configuration:

======================== Master, cyrus.conf

START {
  recover       cmd="ctl_cyrusdb -r"

  idled         cmd="idled"

  syncclient    cmd="/usr/lib/cyrus-imapd/sync_client -r"

}

SERVICES {
  imap          cmd="imapd" listen="imap" prefork=5
  imaps         cmd="imapd -s" listen="imaps" prefork=1


  lmtpunix      cmd="lmtpd" listen="/var/lib/imap/socket/lmtp" prefork=1

}

EVENTS {
  checkpoint    cmd="ctl_cyrusdb -c" period=30

  delprune      cmd="cyr_expire -E 3" at=0400

  tlsprune      cmd="tls_prune" at=0400
}

======================== Master, imapd.conf
configdirectory: /var/lib/imap
partition-default: /var/spool/imap
admins: cyrus
sievedir: /var/lib/imap/sieve
sendmail: /usr/sbin/sendmail
hashimapspool: true
sasl_pwcheck_method: saslauthd
sasl_mech_list: PLAIN
tls_cert_file: /etc/pki/cyrus-imapd/cyrus-imapd-my.pem
tls_key_file: /etc/pki/cyrus-imapd/cyrus-imapd-my.pem
tls_ca_file: /etc/pki/tls/certs/cacert.pem
# uncomment this if you're operating in a DSCP environment (RFC-4594)
# qosmarking: af13
sync_host: master.localdomain
sync_authname: cyrus
sync_password: xxxxxxxxxxxxxxx
sync_log: 1
allowplaintext: 1
tcp_keepalive: 1
guid_mode: sha1
delete_mode: immediate

======================== Replica, cyrus.conf
START {
  recover       cmd="ctl_cyrusdb -r"

  idled         cmd="idled"
}

SERVICES {
  imap          cmd="imapd" listen="imap" prefork=5
  imaps         cmd="imapd -s" listen="imaps" prefork=1


  lmtpunix      cmd="lmtpd" listen="/var/lib/imap/socket/lmtp" prefork=1


  syncserver       cmd="/usr/lib/cyrus-imapd/sync_server" listen="csync"
}

EVENTS {
  checkpoint    cmd="ctl_cyrusdb -c" period=30

  delprune      cmd="cyr_expire -E 3" at=0400

  tlsprune      cmd="tls_prune" at=0400
}

======================== Replica, imapd.conf
configdirectory: /var/lib/imap
partition-default: /var/spool/imap
admins: cyrus
sievedir: /var/lib/imap/sieve
sendmail: /usr/sbin/sendmail
hashimapspool: true
sasl_pwcheck_method: saslauthd
sasl_mech_list: PLAIN
tls_cert_file: /etc/pki/cyrus-imapd/cyrus-imapd-my.pem
tls_key_file: /etc/pki/cyrus-imapd/cyrus-imapd-my.pem
tls_ca_file: /etc/pki/tls/certs/cacert.pem
# uncomment this if you're operating in a DSCP environment (RFC-4594)
# qosmarking: af13
allowplaintext: 1
tcp_keepalive: 1
guid_mode: sha1
delete_mode: immediate


======================== Software

CentOS 5 Final on x86_64

[root at clone-machine ~]# yum list installed |grep cyrus
cyrus-imapd.x86_64                       2.4.10-1               installed
cyrus-imapd-debuginfo.x86_64             2.4.10-1               installed
cyrus-imapd-devel.x86_64                 2.4.10-1               installed
cyrus-imapd-perl.x86_64                  2.4.10-1               installed
cyrus-imapd-utils.x86_64                 2.4.10-1               installed
cyrus-sasl.x86_64                        2.1.22-4               installed
cyrus-sasl-devel.x86_64                  2.1.22-4               installed
cyrus-sasl-lib.x86_64                    2.1.22-4               installed
cyrus-sasl-plain.x86_64                  2.1.22-4               installed

Kolab build


More information about the Info-cyrus mailing list