replication fails after 2.3.9 -> 2.3.11

Paul Dekkers Paul.Dekkers at surfnet.nl
Sat Feb 23 16:53:19 EST 2008


Hi,

Simon Matter wrote:
>> Paul Dekkers wrote:
>>
>>> I finally found a moment for upgrading my 2.3.9 install (using Simon's
>>> RPMs on Red Hat 4.6, 64-bit) to 2.3.11-3 (leaving the config files
>>> untouched), after which it seems that replication isn't working properly
>>> anymore.
>> While it seems to be only replication for now that fails me; is it
>> possible to revert to the previous version? (While that implies for me
>> that I'll have to rpm -e and install the previous rpm, I suppose.)
>> Not sure if I'd like that, but I really really liked my replication
>> running.
> 
> You could do that with "rpm -Uvh --oldpackage ...".

Ah, thanks for that. I might do that if I can't get it to work soon.

(Judging from the changes/upgrade notes, I guess nothing changed
dramatically or irreversibly in any of the databases/formats, and I didn't
touch the GUID bits yet, so I guess I should be fine downgrading.)

>>> If I run the sync_client, just a simple -u paul, I see in my logs:
>>>
>>> sync_client[18493]: SETMODSEQ received BAD response: Syntax error in
>>> Setflags: Invalid modseq
>>> sync_client[18493]: Error in do_user(paul): bailing out!
>>>
>>> Before the upgrade, I'm sure replication was working properly. I
>>> checked, both servers are really running the same versions of
>>> everything.
>>>
>>> I noticed that if I strace the sync_client, the folder on which it bails
>>> out is always the same. If I reconstruct that folder, and re-run (or
>>> just the mailbox), the process continues (up to the next folder that
>>> causes the thing to bail out - although it doesn't bail out on every
>>> folder).
>>>
>>> There were more strange log-items related to the sync_client;
>>>
>>> sync_client[18232]: USER: Invalid type 1 response from server
>>> sync_client[18232]: Discarding: 0000000000000000000000000000000000000000
>>> ()
>>> sync_client[18232]: Discarding: 2 0
>>> 0000000000000000000000000000000000000000 ()
>>> sync_client[18232]: Discarding: 3 0
>>> 0000000000000000000000000000000000000000 ()
>>> sync_client[18232]: Discarding: 4 0
>>> 0000000000000000000000000000000000000000 (\answered)
>>> sync_client[18232]: Discarding: 5 0
>>> 0000000000000000000000000000000000000000 ()
>>>
>>> and a bunch more, like:
>>>
>>> sync_client[18232]: Discarding: archief.thuispc
>>> ...
>>> sync_client[18232]: sync_eatlines_unsolicited(): resynchronised okay
>>> ...
>>> sync_client[18232]: Processing sync log file
>>> /data/config/imap/sync/log-18231 failed: Bad protocol
>>> sync_client[18231]: process 18232 exited, status 1
>>>
>>> Any clue why replication stopped working properly for me after the
>>> upgrade?
>> There is more sync-related ugliness in my logging, though I suppose this
>> is the most harmless one:
>>
>> sync_client[19532]: Hit upload limit 0 at UID 180958 for user.paul.Junk,
>> sending
>>
>> ... I don't recall seeing it before. (And a limit of 0?!)
>>
>> What is worse, is that sync_client now also segfaults on the
>> rolling-log, as soon as I start a sync_client -v -r -f log,
>>
>> MAILBOXES user.henny user.henny.Email lists.IETF-announce
>> user.paul.Drafts archief.netmaster.spam user.elise
>> Segmentation fault
>>
>> And my kernel logs that as:
>>
>> sync_client[18881]: segfault at 0000000000000000 rip 0000002a96054a30
>> rsp 0000007fbfffda08 error 4
>>
>> ... unfortunately, the sync-log is only getting bigger, and I didn't
>> realize that running a sync_client -r -f log would take that much IO and
>> CPU (or that is something that changed in this version too).
>>
>> I'm not sure that running a reconstruct on all mailboxes is an option;
>> it would take a huge amount of time, and somehow I don't think it makes
>> sense anyway.
>>
>> I'll include my imapd.conf below, in case that is useful.
>>
>> Paul
>>
>> P.S. Hmm, and I intentionally skipped 2.3.10 as I believe that people
>> were having problems with that, and waited a bit with 2.3.11 :-S
> 
> I'm not using replication but IIRC there were some changes between 2.3.9
> and 2.3.11 which have to be addressed when using replication. Did you
> carefully check the upgrade instructions? Maybe there is something you
> have to do.

I did have a look at that, but I'm afraid there's nothing in there that
I missed. I didn't touch the GUIDs (and your RPM leaves guid_mode at its
default, which is "off"). There are a couple of changes in replication
that might just be the cause of my problems, but nothing is clearly
related, I'm afraid. (Or at least there's nothing I didn't do that I
should have done.)

I actually run replication with 2.3.11 on a different machine without
problems, but that's a small setup and on FreeBSD instead of Red Hat.
But I know what differences there are with the RPM (the manual is very
helpful with that), so I don't expect anything RPM-specific. (And there
was actually a fix for delayed delete in combination with replication in
2.3.11, so even if the invoca RPM has delayed delete enabled by default,
I think it should work.)

> Another note: Be aware that the invoca rpm has some changed defaults for
> imapd.conf (which is stated in the manpage). Now, if one feature doesn't
> play nice with replication, this won't disturb other people who don't have
> those options enabled. Options that come to mind are:
> 
> delete_mode: delayed
> expunge_mode: delayed
> flushseenstate: 1

The delete_mode is indeed new; I didn't change the toggle there while I
did for expunge_mode, but now that I put it back to "immediate" it
doesn't help me either. (And I actually found that I now have a folder
"DELETED" that also got replicated (before it crashed again) ;-) but,
surprisingly, to a different partition than on my master. Oh well.)
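
One way to compare the three options named above on master and replica is to pull them out of each imapd.conf and look at them side by side. The following is only a minimal sketch in Python, not anything shipped with Cyrus: the helper names read_options and load_options are made up for illustration, and the sample text is just a fragment in the style of the config quoted in this thread. A value of None only means the option is not set explicitly in the file, so whatever default the installed build uses applies.

```python
# Options mentioned in this thread as changed defaults in the invoca RPM.
WATCHED = ("delete_mode", "expunge_mode", "flushseenstate")


def read_options(text, watched=WATCHED):
    """Return {option: value} for the watched options in imapd.conf text.

    None means the option is not set explicitly (commented-out lines
    are ignored), so the build's default applies.
    """
    found = {name: None for name in watched}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        key, sep, value = line.partition(":")
        if sep and key.strip() in found:
            found[key.strip()] = value.strip()
    return found


def load_options(path):
    """Hypothetical convenience wrapper: read the options from a file."""
    with open(path) as fh:
        return read_options(fh.read())


# Illustrative fragment only, modelled on the config in this thread.
sample = """\
sync_log: true
#delete_mode: delayed
expunge_mode: immediate
"""

for name, value in read_options(sample).items():
    print(name, "=", value)
```

Running load_options on the imapd.conf of each machine (the path depends on the install) and comparing the two dictionaries shows at a glance whether master and replica disagree on any of these settings.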

> Sorry if it doesn't really help.

Well, thanks for replying!
(The suggestion on how to revert to the previous Cyrus is useful, and I'm
afraid I'll need it.)

Paul


> Simon
> 
>> My imapd.conf on the master:
>>
>> configdirectory: /data/config/imap
>> defaultpartition: imap4
>> partition-imap1: /data/imap1
>> partition-imap2: /data/imap2
>> partition-imap3: /data/imap3
>> partition-imap4: /data/imap4
>> sievedir: /data/config/sieve
>> hashimapspool: false
>>
>> md5_dir: /data/config/md5
>>
>> allowanonymouslogin: no
>> allowplaintext: yes
>> plaintextloginpause: 0
>> admins: cyrus
>> sasl_pwcheck_method: saslauthd
>> sasl_mech_list: PLAIN LOGIN
>> #sasl_pwcheck_method: auxprop
>>
>> duplicatesuppression: 1
>> quotawarn: 90
>> postuser: shared
>> lmtp_downcase_rcpt: yes
>> username_tolower: yes
>>
>> sieveuserhomedir: false
>> unix_group_enable: 1
>>
>> sync_host: ...
>> sync_authname: cyrus
>> sync_password: ...
>>
>> sync_machineid: 2
>> sync_log: true
>>
>> # default invoca-rpm db definitions on this machine!
>> ## explicit database definitions (from the past)
>> ##duplicate_db: skiplist
>> ## deliver.db: Berkeley DB (Btree, version 8, native byte-order)
>> #duplicate_db: berkeley
>> #mboxlist_db: skiplist
>> ## mailbox keys?
>> #mboxkey_db: skiplist
>> #seenstate_db: skiplist
>> #subscription_db: flat
>> ##tlscache_db: skiplist
>> ## tls_sessions.db: Berkeley DB (Btree, version 8, native byte-order)
>> #tlscache_db: berkeley
>> #annotation_db: skiplist
>> ##ptscache_db: skiplist
>> #ptscache_db: berkeley
>> #quota_db: quotalegacy
>>
>> # without this, I got errors in my test-setup using the dovecot imaptest
>> expunge_mode: immediate
>>


