sync_client problems

Eric Cunningham eric at whoi.edu
Fri May 12 09:09:18 EDT 2017


I have snapshot backups of that account, that particular message and the 
various cyrus db files.  What can I provide for review?


On 5/11/17 7:38 PM, Bron Gondwana wrote:
> It looked like the '148.' file was corrupted in some way.  I assume you've deleted all the evidence now, so we can't see how!  Oh well :(
>
> Bron.
>
> On Fri, 12 May 2017, at 04:55, Eric Cunningham wrote:
>> Just to close the loop on this, once the corrupted cdm-lit account on my
>> copy server successfully reconstructed, the backlog of replication has
>> completed successfully and is now caught up.
>>
>> -Eric
>>
>> On 05/11/2017 02:20 PM, Eric Cunningham wrote:
>>> -G didn't seem to help, but I tried a "reconstruct -f -r" command
>>> without having previously deleted all occurrences of the cyrus.* files
>>> and it was then successful.
>>>
>>>
>>>
>>> On 05/11/2017 01:44 PM, Patrick Boutilier wrote:
>>>> Try a -G with reconstruct?
>>>>
>>>> -G     Force  re-parsing  of the underlying message (checks GUID
>>>> correctness).  Reconstruct with -G should fix all possible individual
>>>> message issues, including corrupted data files.
>>>>
>>>>
>>>> On 05/11/2017 02:37 PM, Eric Cunningham wrote:
>>>>> I have to walk this back.  In looking slightly further back in my
>>>>> logfiles, before every instance of a failure to sync some folder, I
>>>>> see a common error reported prior to every "bailing out! Bad
>>>>> protocol" error - "IOERROR: GUID mismatch
>>>>> /var/spool/cyrus/mail/c/user/cdm-lit/Sent/148."
>>>>>
>>>>> May 11 12:44:09 imap1 sync_client[48590]: IOERROR: GUID mismatch
>>>>> /var/spool/cyrus/mail/c/user/cdm-lit/Sent/148.
>>>>> May 11 12:44:22 imap1 sync_client[48590]: IOERROR: zero length
>>>>> response to MAILBOX (end of file reached)
>>>>> May 11 12:44:22 imap1 sync_client[48590]: do_folders(): update
>>>>> failed: user.cdm-lit.Sent 'Bad protocol'
>>>>> May 11 12:44:22 imap1 sync_client[48590]: sync_mailboxes: doing 1000
>>>>> May 11 12:44:22 imap1 sync_client[48590]: IOERROR: zero length
>>>>> response to MAILBOXES (end of file reached)
>>>>> May 11 12:44:22 imap1 sync_client[48590]: sync_mailboxes: doing 1000
>>>>> May 11 12:44:22 imap1 sync_client[48590]: IOERROR: zero length
>>>>> response to MAILBOXES (end of file reached)
>>>>> May 11 12:44:22 imap1 sync_client[48590]: sync_mailboxes: doing 1000
>>>>> May 11 12:44:22 imap1 sync_client[48590]: IOERROR: zero length
>>>>> response to MAILBOXES (end of file reached)
>>>>> May 11 12:44:22 imap1 sync_client[48590]: sync_mailboxes: doing 1000
>>>>> May 11 12:44:22 imap1 sync_client[48590]: IOERROR: zero length
>>>>> response to MAILBOXES (end of file reached)
>>>>> May 11 12:44:22 imap1 sync_client[48590]: sync_mailboxes: doing 1000
>>>>> May 11 12:44:22 imap1 sync_client[48590]: IOERROR: zero length
>>>>> response to MAILBOXES (end of file reached)
>>>>> May 11 12:44:22 imap1 sync_client[48590]: sync_mailboxes: doing 1000
>>>>> May 11 12:44:22 imap1 sync_client[48590]: IOERROR: zero length
>>>>> response to MAILBOXES (end of file reached)
>>>>> May 11 12:44:22 imap1 sync_client[48590]: IOERROR: zero length
>>>>> response to MAILBOXES (end of file reached)
>>>>> May 11 12:44:22 imap1 sync_client[48590]: IOERROR: zero length
>>>>> response to UNMAILBOX (end of file reached)
>>>>> May 11 12:44:22 imap1 sync_client[48590]: folder_delete(): failed:
>>>>> user.guest-student-coordinator.Trash.Aarflot 'Bad protocol'
>>>>> May 11 12:44:22 imap1 sync_client[48590]: Error in do_sync(): bailing
>>>>> out! Bad protocol
>>>>> May 11 12:44:22 imap1 sync_client[48590]: Processing sync log file
>>>>> /var/spool/CyrusDB-Sieve/cyrusdb/sync/log-run failed: Bad protocol
>>>>>
>>>>> When I recontructed this cdm-lit account, it ran successfully on my
>>>>> master host, but fails on my copy host:
>>>>>
>>>>> [cyrus at imap2 ~]$ reconstruct -f -r user.cdm-lit
>>>>> user.cdm-lit
>>>>> user.cdm-lit.Sea Trail 2013
>>>>> user.cdm-lit.Sent uid 1 found - adding
>>>>> user.cdm-lit.Sent uid 2 found - adding
>>>>> user.cdm-lit.Sent uid 3 found - adding
>>>>> user.cdm-lit.Sent uid 4 found - adding
>>>>> ....
>>>>> user.cdm-lit.Sent uid 146 found - adding
>>>>> user.cdm-lit.Sent uid 147 found - adding
>>>>> user.cdm-lit.Sent uid 148 found - adding
>>>>> fatal error: Internal error: assertion failed: imap/mailbox.c: 2847:
>>>>> record->size
>>>>>
>>>>> Since I couldn't get a successful reconstruct on this account, I
>>>>> deleted it and recreated it from my master host.  However, I'm still
>>>>> unable to get a successful reconstruct with "failed to read index
>>>>> header" for every subfolder and "fatal error: Internal error:
>>>>> assertion failed: imap/mailbox.c: 2847: record->size"
>>>>>
>>>>> Any ideas on how to correct this so I can see if I can then get past
>>>>> the replication error?
>>>>>
>>>>> Thanks!
>>>>>
>>>>> -Eric
>>>>>
>>>>>
>>>>> On 05/11/2017 11:46 AM, Eric Cunningham wrote:
>>>>>> Thanks Bron, but that doesn't seem to work for me, unless I'm
>>>>>> missing something.
>>>>>>
>>>>>> I ran reconstructs for this account on both my master and copy hosts:
>>>>>>
>>>>>> [cyrus at imap1 ~]$ reconstruct -f -r user.scramer
>>>>>> user.scramer
>>>>>> user.scramer.Archive
>>>>>> user.scramer.Archives
>>>>>> user.scramer.Archives.2004
>>>>>> ...
>>>>>> user.scramer.Trash.IS Networking and Operations
>>>>>> user.scramer.Trash.IS Security
>>>>>> user.scramer.Trash.IS Servers
>>>>>> user.scramer.Trash.IS Staff
>>>>>> user.scramer.Trash.IS Surveys
>>>>>> ...
>>>>>>
>>>>>>
>>>>>> [cyrus at imap2 ~]$ reconstruct -f -r user.scramer
>>>>>> user.scramer
>>>>>> user.scramer.Archive
>>>>>> user.scramer.Archives
>>>>>> user.scramer.Archives.2004
>>>>>> ...
>>>>>> user.scramer.Trash.IS Networking and Operations
>>>>>> user.scramer.Trash.IS Security
>>>>>> user.scramer.Trash.IS Servers
>>>>>> user.scramer.Trash.IS Staff
>>>>>> user.scramer.Trash.IS Surveys
>>>>>> ...
>>>>>>
>>>>>>
>>>>>> I then restart the replication against the log-run file and it again
>>>>>> fails on a folder that no longer exists in that account:
>>>>>>
>>>>>> ...
>>>>>> May 11 11:40:51 imap1 sync_client[60696]: sync_mailboxes: doing 1000
>>>>>> May 11 11:40:51 imap1 sync_client[60696]: IOERROR: zero length
>>>>>> response to MAILBOXES (end of file reached)
>>>>>> May 11 11:40:51 imap1 sync_client[60696]: IOERROR: zero length
>>>>>> response to MAILBOXES (end of file reached)
>>>>>> May 11 11:40:51 imap1 sync_client[60696]: IOERROR: zero length
>>>>>> response to UNMAILBOX (end of file reached)
>>>>>> May 11 11:40:51 imap1 sync_client[60696]: folder_delete(): failed:
>>>>>> user.scramer.Trash.IS Software Management 'Bad protocol'
>>>>>> May 11 11:40:51 imap1 sync_client[60696]: Error in do_sync():
>>>>>> bailing out! Bad protocol
>>>>>> May 11 11:40:51 imap1 sync_client[60696]: Processing sync log file
>>>>>> /var/spool/CyrusDB-Sieve/cyrusdb/sync/log-run failed: Bad protocol
>>>>>>
>>>>>>
>>>>>> -Eric
>>>>>>
>>>>>>
>>>>>> On 05/11/2017 02:14 AM, Bron Gondwana wrote:
>>>>>>> Looks like you have a corrupted mailboxes database - if you run a
>>>>>>> reconstruct at each end then it should be fine.  Replication won't
>>>>>>> bail for an actually-doesn't-exist mailbox, but it can't fix broken
>>>>>>> mailboxes, that's what reconstruct is for.
>>>>>>>
>>>>>>> Bron.
>>>>>>>
>>>>>>>
>>>>>>> On Thu, 11 May 2017, at 01:01, Eric Cunningham wrote:
>>>>>>>> Hi list, I'm running cyrus replication on cyrus 2.5.10.  It seems
>>>>>>>> whenever a folder is encountered in log-run that doesn't exist on the
>>>>>>>> client and/or the server, the replication process logs "Error in
>>>>>>>> do_sync(): bailing out! Bad protocol".  Sometimes, it dies
>>>>>>>> completely,
>>>>>>>> other times it restarts at the beginning of the log-run file.  But
>>>>>>>> when
>>>>>>>> it re-encounters the folder in question, the same failure occurs
>>>>>>>> and the
>>>>>>>> problem continuously loops until some manual action is taken.  I
>>>>>>>> have a
>>>>>>>> checker script that attempts to restart a stopped sync but I haven't
>>>>>>>> tried to code it to get past these types of folder errors.  In the
>>>>>>>> meantime, replication effectively stops and the replication log
>>>>>>>> continuously builds in size.  You might imagine my horror at
>>>>>>>> discovering
>>>>>>>> yesterday that replication of my production server with 1200 accounts
>>>>>>>> had been failing for over 7 weeks unnoticed, generating 9.5 million
>>>>>>>> lines in the replication log files.  I had gotten out of the habit of
>>>>>>>> daily monitoring of replication as it had been running fine and I
>>>>>>>> eventually moved on to other things.  I'm trying to get
>>>>>>>> replication to
>>>>>>>> run through the enormous backlog but am running into more similar
>>>>>>>> folder
>>>>>>>> issues that cause the process to fail, I make manual fixes, and start
>>>>>>>> the process again.  In my experience with this, reconstruction of the
>>>>>>>> affected folders on the master host and copy host has no effect.
>>>>>>>>
>>>>>>>> Can this otherwise excellent feature be made more robust to work past
>>>>>>>> these folder issues?
>>>>>>>>
>>>>>>>> Here are a couple of examples I've observed thus far of how the
>>>>>>>> process
>>>>>>>> is failing for me.
>>>>>>>>
>>>>>>>> --------------
>>>>>>>>
>>>>>>>> In this example, the subfolder "Fly America Foreign Travel" no longer
>>>>>>>> existed on the master host but did exist on the copy host.  To
>>>>>>>> resolve
>>>>>>>> this, I had to manually delete that folder from the copy host.
>>>>>>>>
>>>>>>>> May  9 15:17:48 imap1 sync_client[34614]: inefficient replication
>>>>>>>> (8 >
>>>>>>>> 5) user.jtaft.PO.Fly America Foreign Travel
>>>>>>>> May  9 15:17:48 imap1 sync_client[34614]: FULLMAILBOX received NO
>>>>>>>> response: IMAP_MAILBOX_NONEXISTENT No Such Mailbox
>>>>>>>> May  9 15:17:48 imap1 sync_client[34614]: do_folders(): update
>>>>>>>> failed:
>>>>>>>> user.jtaft.PO.Fly America Foreign Travel 'Mailbox does not exist'
>>>>>>>> May  9 15:17:48 imap1 sync_client[34614]: IOERROR: Mailbox does
>>>>>>>> not exist
>>>>>>>> May  9 15:17:48 imap1 sync_client[34614]: Error in do_sync(): bailing
>>>>>>>> out! Mailbox does not exist
>>>>>>>> May  9 15:17:48 imap1 sync_client[34614]: Processing sync log file
>>>>>>>> /var/spool/CyrusDB-Sieve/cyrusdb/sync/log-run failed: Mailbox does
>>>>>>>> not exist
>>>>>>>> May  9 15:17:48 imap1 sync_client[34614]: Reprocessing sync log file
>>>>>>>> /var/spool/CyrusDB-Sieve/cyrusdb/sync/log-run
>>>>>>>>
>>>>>>>> --------------
>>>>>>>>
>>>>>>>> In this example, the subfolder "Sohn" did not exist on the master
>>>>>>>> host
>>>>>>>> nor on the copy host.  To resolve, I had to edit log-run and
>>>>>>>> completely
>>>>>>>> remove all entries for this "Sohn" subfolder.
>>>>>>>>
>>>>>>>> May 10 10:13:26 imap1 sync_client[47011]: IOERROR: zero length
>>>>>>>> response
>>>>>>>> to MAILBOXES (end of file reached)
>>>>>>>> May 10 10:13:26 imap1 sync_client[47011]: IOERROR: zero length
>>>>>>>> response
>>>>>>>> to MAILBOXES (end of file reached)
>>>>>>>> May 10 10:13:26 imap1 sync_client[47011]: IOERROR: zero length
>>>>>>>> response
>>>>>>>> to UNMAILBOX (end of file reached)
>>>>>>>> May 10 10:13:26 imap1 sync_client[47011]: folder_delete(): failed:
>>>>>>>> user.gg-chair.Trash.Sohn 'Bad protocol'
>>>>>>>> May 10 10:13:26 imap1 sync_client[47011]: Error in do_sync(): bailing
>>>>>>>> out! Bad protocol
>>>>>>>> May 10 10:13:26 imap1 sync_client[47011]: Processing sync log file
>>>>>>>> /var/spool/CyrusDB-Sieve/cyrusdb/sync/log-run failed: Bad protocol
>>>>>>>>
>>>>>>>> --------------
>>>>>>>>
>>>>>>>> Often, errors are logged without "bailing out!" and replication
>>>>>>>> continues to run:
>>>>>>>>
>>>>>>>> May 10 10:45:00 imap1 sync_client[87438]: sync_mailboxes: doing 1000
>>>>>>>> May 10 10:45:01 imap1 sync_client[87438]: sync_mailboxes: doing 1000
>>>>>>>> May 10 10:45:24 imap1 sync_client[87438]: IOERROR: GUID mismatch
>>>>>>>> /var/spool/cyrus/mail/c/user/cdm-lit/Sent/148.
>>>>>>>> May 10 10:45:37 imap1 sync_client[87438]: IOERROR: zero length
>>>>>>>> response
>>>>>>>> to MAILBOX (end of file reached)
>>>>>>>> May 10 10:45:37 imap1 sync_client[87438]: do_folders(): update
>>>>>>>> failed:
>>>>>>>> user.cdm-lit.Sent 'Bad protocol'
>>>>>>>> May 10 10:45:37 imap1 sync_client[87438]: sync_mailboxes: doing 1000
>>>>>>>>
>>>>>>>> --------------
>>>>>>>>
>>>>>>>> Thank you, list!
>>>>>>>>
>>>>>>>> -Eric
>>>>>>>>
>>>>>>>> ----
>>>>>>>> Cyrus Home Page: http://www.cyrusimap.org/
>>>>>>>> List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
>>>>>>>> To Unsubscribe:
>>>>>>>> https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>> ----
>>>>> Cyrus Home Page: http://www.cyrusimap.org/
>>>>> List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
>>>>> To Unsubscribe:
>>>>> https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus
>>>>
>>>>
>>>>
>>>> ----
>>>> Cyrus Home Page: http://www.cyrusimap.org/
>>>> List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
>>>> To Unsubscribe:
>>>> https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus
>>>>
>>>
>>
>> ----
>> Cyrus Home Page: http://www.cyrusimap.org/
>> List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
>> To Unsubscribe:
>> https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus
>
>



More information about the Info-cyrus mailing list