Replication errors in 2.3.16-8 (do_user bailing out)

Simpson, John R john_simpson at reyrey.com
Fri Nov 19 16:57:09 EST 2010


I've re-run the replication tests using the same mailstore data on a Cyrus 2.3.7 (CentOS/RHEL package) replica pair and the same type of errors occurred.  I then pulled another set of mailstore data from a Cyrus system in one of our QA environments (i.e. production, but non-customer data, as opposed to the lab data I've been using to this point) and tested it on the 2.3.16-8 master/replica pair.  The first sync_client run was promising -- there were a few "bailing out" errors and a few skipped users, but a much lower percentage than usual.  However, when I cleared the replica server and ran a second replication test there was a high rate of both types of issues.  As before, running sync_client multiple times eventually results in a fully synchronized replica.
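
For anyone who wants to tally the two failure types from a captured "sync_client -v" run, here is a minimal sketch.  It assumes the output looks like the samples in my original message below (a "USER <name>" line, then either ADDSUB lines or a "do_user(<name>): bailing out!" error).  The function names classify and retry_list are made up for illustration and are not part of Cyrus.  A user with no ADDSUB lines may simply never have used their mailbox, so the resulting list is only a set of candidates for another pass.

```python
import re

# Matches the error line shown in the sample output, e.g.
# "Error from do_user(admin01@reptest.org): bailing out!"
BAIL_RE = re.compile(r"do_user\((.+?)\): bailing out!")


def classify(lines):
    """Map each user seen on a USER line to 'ok', 'bailed', or 'no_addsub'.

    Hypothetical helper: assumes sync_client -v stdout, not syslog lines.
    """
    status = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("USER "):
            # Everything after "USER " is taken as the user name.
            current = line[len("USER "):]
            status[current] = "no_addsub"
        elif line.startswith("ADDSUB ") and current is not None:
            # Any ADDSUB line after a USER line counts for that user.
            if status[current] != "bailed":
                status[current] = "ok"
        else:
            match = BAIL_RE.search(line)
            if match:
                status[match.group(1)] = "bailed"
    return status


def retry_list(status):
    """Users worth passing to sync_client again, in first-seen order."""
    return [user for user, state in status.items() if state != "ok"]
```

Writing the names returned by retry_list to a file, one per line, gives something that can be fed back to "sync_client -u -f" in the same way as mailboxlist.ldap.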

John

John Simpson 
Senior Software Engineer, I. T. Engineering and Operations


> -----Original Message-----
> From: info-cyrus-bounces+john_simpson=reyrey.com at lists.andrew.cmu.edu [mailto:info-
> cyrus-bounces+john_simpson=reyrey.com at lists.andrew.cmu.edu] On Behalf Of Simpson,
> John R
> Sent: Friday, November 19, 2010 11:17 AM
> To: info-cyrus at lists.andrew.cmu.edu
> Subject: Replication errors in 2.3.16-8 (do_user bailing out)
> 
> I'm running into inconsistent errors from sync_client on a Cyrus 2.3.16-8 system.  It's
> driving me crazy because the same sync_client command will succeed at some times and
> fail at others.  I know others are running similar and much more complex systems
> successfully, so I must be missing something simple or fundamental.
> 
> Thanks in advance,
> 
> John
> 
> When the problem occurs, sync_client gives a "do_user(user at example.com): bailing
> out!" error and logs a message about failing to reset the replication (cyrusadmin) account.
> This occurs both when running sync_client against a single user and when using sync_client
> with a file containing 27,000 user names.  The failures do not always affect the same users,
> and attempts to replicate a given user generally succeed after 1-5 tries.  When running the
> 27,000-user test the failure can occur for anywhere from a handful to a few hundred users.
> The problem occurs whether sync_client is run by root, by root using "su -c" to cyrus, or as
> the cyrus user.  Running "reconstruct", "reconstruct -G", or "reconstruct -x -f" prior to
> running sync_client does not resolve the problem.
> 
> In addition to the "do_user(user at example.com): bailing out!" failures, users are
> sometimes skipped without a sync_client error message -- they simply aren't replicated on
> that pass.  sync_client displays the line "USER user at example.com" but none of the
> ADDSUB lines, similar to when processing a user that has never used their mailbox.
> 
> Both problems occur whether or not rolling replication is running.
> 
> Both the master mailstore (eml-store04) and the replica (eml-replica04) are CentOS 5.4
> 64-bit servers running as virtual machines on a single ESXi 4 server.  They are using the
> ext3 filesystem.  The cyrus-imapd and cyrus-imapd-utils RPMS were built from the Invoca
> RPMS on a similar system.  I enabled SNMP support, but didn't make any other changes to
> the spec file before running "rpmbuild -ba cyrus-imapd.spec".  No errors were reported
> during the build.
> 
> The data was transferred from a Cyrus 2.3.7 system using rsync on /var/lib/imap and
> /var/spool/imap, and appears normal from a Cyrus and IMAP client perspective.  Between
> tests, I clear the replica with "rm -rf /var/lib/imap/domain/* /var/lib/imap/sieve/*
> /var/spool/imap/domain/*".  Everything under /var/lib/imap and /var/spool/imap is owned
> by cyrus:mail.
> 
> [root at eml-store04 ~]# uname -a
> Linux eml-store04 2.6.18-194.26.1.el5 #1 SMP Tue Nov 9 12:54:20 EST 2010 x86_64
> x86_64 x86_64 GNU/Linux
> 
> [root at eml-replica04 reyrey.net]# rpm -qa | grep cyrus | sort
> cyrus-imapd-2.3.16-8
> cyrus-imapd-utils-2.3.16-8
> cyrus-sasl-2.1.22-5.el5_4.3
> cyrus-sasl-lib-2.1.22-5.el5_4.3
> cyrus-sasl-lib-2.1.22-5.el5_4.3
> cyrus-sasl-plain-2.1.22-5.el5_4.3
> cyrus-sasl-plain-2.1.22-5.el5_4.3
> 
> Failed attempt:
> [root at eml-store04 ~]# /usr/lib/cyrus-imapd/sync_client -l -v -u -f mailboxlist.ldap
> 
> eml-store04: output from sync_client:
> USER admin01 at reptest.org
> Error from do_user(admin01 at reptest.org): bailing out!
> 
> eml-store04: content of /var/log/maillog
> Nov 18 11:37:27 eml-store04 sync_client[6328]: USER admin01 at reptest.org
> Nov 18 11:37:27 eml-store04 sync_client[6328]: USER received NO response:
> IMAP_MAILBOX_NONEXISTENT Failed to access inbox for admin01 at reptest.org:
> System I/O error
> Nov 18 11:37:27 eml-store04 sync_client[6328]: RESET received NO response: Failed to
> reset account cyrusadmin: Internal Error
> Nov 18 11:37:27 eml-store04 sync_client[6328]: Error in do_user(admin01 at reptest.org):
> bailing out!
> 
> eml-replica04: content of /var/log/maillog
> Nov 18 11:37:27 eml-replica04 syncserver[12635]: IOERROR: opening
> /var/spool/imap/domain/r/reptest.org/a/user/admin01/cyrus.header: No such file or
> directory
> Nov 18 11:37:27 eml-replica04 syncserver[12635]: Failed to access inbox for
> admin01 at reptest.org
> Nov 18 11:37:27 eml-replica04 syncserver[12635]: IOERROR: opening
> /var/spool/imap/domain/r/reptest.org/a/user/admin01/cyrus.header: No such file or
> directory
> Nov 18 11:37:27 eml-replica04 syncserver[12635]: Unlocked
> 
> 
> Successful attempt:
> [root at eml-store04 ~]# /usr/lib/cyrus-imapd/sync_client -l -v -u admin01 at reptest.org
> eml-store04: output from sync_client:
> USER admin01 at reptest.org
> ADDSUB admin01 at reptest.org INBOX
> ADDSUB admin01 at reptest.org INBOX.Drafts
> ADDSUB admin01 at reptest.org INBOX.Sent Items
> ADDSUB admin01 at reptest.org INBOX.Trash
> 
> eml-store04: content of /var/log/maillog:
> Nov 18 11:54:47 eml-store04 sync_client[6416]: USER admin01 at reptest.org
> Nov 18 11:54:47 eml-store04 sync_client[6416]: USER received NO response:
> IMAP_MAILBOX_NONEXISTENT Failed to access inbox for admin01 at reptest.org:
> Mailbox does not exist
> Nov 18 11:54:48 eml-store04 sync_client[6416]: ADDSUB admin01 at reptest.org INBOX
> Nov 18 11:54:48 eml-store04 sync_client[6416]: ADDSUB admin01 at reptest.org
> INBOX.Drafts
> Nov 18 11:54:48 eml-store04 sync_client[6416]: ADDSUB admin01 at reptest.org
> INBOX.Sent Items
> Nov 18 11:54:48 eml-store04 sync_client[6416]: ADDSUB admin01 at reptest.org
> INBOX.Trash
> 
> eml-replica04: content of /var/log/maillog
> Nov 18 11:54:47 eml-replica04 syncserver[14261]: executed
> Nov 18 11:54:47 eml-replica04 syncserver[14261]: accepted connection
> Nov 18 11:54:47 eml-replica04 syncserver[14261]: cmdloop(): startup
> Nov 18 11:54:47 eml-replica04 syncserver[14261]: login: eml-store04.asddev.reyrey.com
> [172.29.97.2] cyrusadmin PLAIN User logged in
> Nov 18 11:54:47 eml-replica04 syncserver[14261]: Failed to access inbox for
> admin01 at reptest.org
> Nov 18 11:54:47 eml-replica04 syncserver[14261]: cannot unlink
> /var/lib/imap/domain/r/reptest.org/user/a/admin01.seen: No such file or directory
> Nov 18 11:54:47 eml-replica04 syncserver[14261]: cannot unlink
> /var/lib/imap/domain/r/reptest.org/user/a/admin01.mboxkey: No such file or directory
> Nov 18 11:54:48 eml-replica04 syncserver[14261]: Unlocked
> 
> [root at eml-store04 ~]# head mailboxlist.ldap
> Service.Administrator at portal.reyrey.net
> postmaster at reyrey.net
> test at reyrey.net
> test1 at reyrey.net
> bill at reyrey.net
> appproxy at reyrey.net
> abuse at reyrey.net
> adminlist at reyrey.net
> 
> John Simpson
> Senior Software Engineer, I. T. Engineering and Operations


