sync_client errors out after 2.3.16 -> 2.5.9 upgrade

ellie timoney ellie at fastmail.com
Sun Aug 14 19:34:02 EDT 2016



On Sat, Aug 13, 2016, at 03:12 AM, Kenneth Marshall wrote:
> On Wed, Aug 10, 2016 at 12:30:03PM +1000, ellie timoney wrote:
> > On Tue, Aug 2, 2016, at 02:14 PM, Kenneth Marshall via Info-cyrus wrote:
> > > On Tue, Aug 02, 2016 at 12:34:13PM +1000, Bron Gondwana wrote:
> > > > 2.3.2 wasn't version 10, it was version 7!  It would have upgraded through 8, 9, 10 - and maybe you needed to reconstruct to get GUIDs for those versions - that sounds familiar.
> > > > 
> > > > Bron.
> > > > 
> > > 
> > > Hi Bron,
> > > 
> > > Is there a way to identify mailboxes with no GUID? Then I could target
> > > them for reconstruction first.
> > > 
> > > Regards,
> > > Ken
> > 
> > Hi Ken,
> > 
> > I don't suppose you still have mailboxes with no GUID?  Or have you
> > already found and "reconstruct -G"ed everything?
> > 
> > I've attached a patch that does two things, which are kind of the same
> > thing:
> > 
> > * If the mailbox on either end of the replication is of a version < 10,
> > then the operation will fail cleanly and early with an "Operation is not
> > supported on mailbox" error (rather than trying to do the replication,
> > then crashing out like it currently does) -- though I don't believe this
> > case will affect you, as I believe your mailboxes are all at least
> > version 10.
> > 
> > * If the mailbox on either end of the replication contains any index
> > records that do not have a GUID set, then the same will occur, and (in
> > addition to the above error) a warning will be logged to syslog like:
> >     <mailbox>: missing guid for record <number> -- needs 'reconstruct -G'?
> > 
> > So with this patch, you should be able to avoid these crashes, and also
> > identify mailboxes that need "reconstruct -G" just by checking for the
> > warning in syslog.
> > 
> > Are you able to try this out?
> > 
> > Cheers,
> > 
> > ellie
> 
> 
> Hi Ellie,
> 
> I tried the patch and it did print the warnings on the replica.
> Unfortunately, it looked like it was looping on the master (sync_client)
> and the updates that could be made were not. I had to rollback to the
> dying version.
> 
> Regards,
> Ken

What do you mean by "looping"?  sync_client in rolling mode (-r option)
will keep reprocessing the same sync log (from the top) until it
finishes it successfully.  In this case, if the log contains operations
on messages that aren't replicable, that is exactly what will happen: it
will keep retrying.  Even on a failed run, mailboxes that did get
replicated will have been replicated -- they just won't be removed from
the sync log, so the next time it tries they will be replicated again
(which does nothing if they are already up to date from having just been
replicated).  This is how it's supposed to work.
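
To make that concrete, here is a toy model of the retry behaviour in
Python.  It is only a sketch of what I described above, not
sync_client's actual code: replay_until_clean, replicate and max_passes
are hypothetical names, it assumes every entry in the log is attempted
on each pass, and the real rolling mode has no pass limit.

```python
def replay_until_clean(sync_log, replicate, max_passes=3):
    """Toy model of rolling-mode retries; not sync_client's real code.

    Returns, for each pass, the list of mailboxes that failed.
    """
    history = []
    for _ in range(max_passes):
        # Every pass starts from the top of the same log.  Mailboxes that
        # already replicated are simply attempted again, which is harmless.
        failed = [mbox for mbox in sync_log if not replicate(mbox)]
        history.append(failed)
        if not failed:
            break  # only a fully clean pass ends the retries
    return history
```

With a replicate function that always fails for one mailbox, every pass
reports that same mailbox as failed, which from the outside looks like
looping.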

It wouldn't repeat like this if it were crashing out, obviously -- but
crashing out is itself a problem in rolling mode, because it's supposed
to keep going and try again.

If you're not using rolling mode, or if the looping behaviour you
describe is something other than this, are you able to provide logs of
what it's doing?  If you run it with '-R' instead of '-r', it behaves
the same as rolling mode but doesn't put itself in the background.  You
can then use the -v option several times to increase the verbosity of
its output, and shut it down easily with ^C.

Obviously there are privacy concerns with sending sync_client output --
even at its lowest verbosity it exposes users' mailbox names, and at
higher verbosities it also exposes message content, flags, and
everything else, so don't send this output directly.  But maybe you
could look over it and summarise it?
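
If it helps, one way to summarise the "missing guid" warnings without
exposing mailbox names is sketched below in Python.  It assumes the
syslog lines contain the warning text in the format quoted above, and
that mailbox names contain no whitespace; summarise and WARNING are
just names I've made up for the example.  An unsalted hash is only
light obfuscation, so treat the output with some care too.

```python
import hashlib
import re
from collections import Counter

# Assumed format, taken from the warning text quoted earlier in this thread:
#   <mailbox>: missing guid for record <number> -- needs 'reconstruct -G'?
# Assumption: the mailbox name contains no whitespace.
WARNING = re.compile(r"(\S+): missing guid for record (\d+)")


def summarise(lines):
    """Count affected records per mailbox, keyed by a short hash of the name."""
    counts = Counter()
    for line in lines:
        match = WARNING.search(line)
        if match:
            name = match.group(1).encode("utf-8")
            counts[hashlib.sha256(name).hexdigest()[:8]] += 1
    return counts
```

Feeding it the lines of whichever file your syslog writes mail
messages to gives a count of affected records per (hashed) mailbox.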

Cheers,

ellie

