sync_client problems
Eric Cunningham
eric at whoi.edu
Thu May 11 14:55:42 EDT 2017
Just to close the loop on this, once the corrupted cdm-lit account on my
copy server successfully reconstructed, the backlog of replication has
completed successfully and is now caught up.
-Eric
On 05/11/2017 02:20 PM, Eric Cunningham wrote:
> -G didn't seem to help, but I tried a "reconstruct -f -r" command
> without having previously deleted all occurrences of the cyrus.* files
> and it was then successful.
>
>
>
> On 05/11/2017 01:44 PM, Patrick Boutilier wrote:
>> Try a -G with reconstruct?
>>
>> -G Force re-parsing of the underlying message (checks GUID
>> correctness). Reconstruct with -G should fix all possible individual
>> message issues, including corrupted data files.
>>
>>
>> On 05/11/2017 02:37 PM, Eric Cunningham wrote:
>>> I have to walk this back. In looking slightly further back in my
>>> logfiles, before every instance of a failure to sync some folder, I
>>> see a common error reported prior to every "bailing out! Bad
>>> protocol" error - "IOERROR: GUID mismatch
>>> /var/spool/cyrus/mail/c/user/cdm-lit/Sent/148."
>>>
>>> May 11 12:44:09 imap1 sync_client[48590]: IOERROR: GUID mismatch
>>> /var/spool/cyrus/mail/c/user/cdm-lit/Sent/148.
>>> May 11 12:44:22 imap1 sync_client[48590]: IOERROR: zero length
>>> response to MAILBOX (end of file reached)
>>> May 11 12:44:22 imap1 sync_client[48590]: do_folders(): update
>>> failed: user.cdm-lit.Sent 'Bad protocol'
>>> May 11 12:44:22 imap1 sync_client[48590]: sync_mailboxes: doing 1000
>>> May 11 12:44:22 imap1 sync_client[48590]: IOERROR: zero length
>>> response to MAILBOXES (end of file reached)
>>> May 11 12:44:22 imap1 sync_client[48590]: sync_mailboxes: doing 1000
>>> May 11 12:44:22 imap1 sync_client[48590]: IOERROR: zero length
>>> response to MAILBOXES (end of file reached)
>>> May 11 12:44:22 imap1 sync_client[48590]: sync_mailboxes: doing 1000
>>> May 11 12:44:22 imap1 sync_client[48590]: IOERROR: zero length
>>> response to MAILBOXES (end of file reached)
>>> May 11 12:44:22 imap1 sync_client[48590]: sync_mailboxes: doing 1000
>>> May 11 12:44:22 imap1 sync_client[48590]: IOERROR: zero length
>>> response to MAILBOXES (end of file reached)
>>> May 11 12:44:22 imap1 sync_client[48590]: sync_mailboxes: doing 1000
>>> May 11 12:44:22 imap1 sync_client[48590]: IOERROR: zero length
>>> response to MAILBOXES (end of file reached)
>>> May 11 12:44:22 imap1 sync_client[48590]: sync_mailboxes: doing 1000
>>> May 11 12:44:22 imap1 sync_client[48590]: IOERROR: zero length
>>> response to MAILBOXES (end of file reached)
>>> May 11 12:44:22 imap1 sync_client[48590]: IOERROR: zero length
>>> response to MAILBOXES (end of file reached)
>>> May 11 12:44:22 imap1 sync_client[48590]: IOERROR: zero length
>>> response to UNMAILBOX (end of file reached)
>>> May 11 12:44:22 imap1 sync_client[48590]: folder_delete(): failed:
>>> user.guest-student-coordinator.Trash.Aarflot 'Bad protocol'
>>> May 11 12:44:22 imap1 sync_client[48590]: Error in do_sync(): bailing
>>> out! Bad protocol
>>> May 11 12:44:22 imap1 sync_client[48590]: Processing sync log file
>>> /var/spool/CyrusDB-Sieve/cyrusdb/sync/log-run failed: Bad protocol
>>>
>>> When I recontructed this cdm-lit account, it ran successfully on my
>>> master host, but fails on my copy host:
>>>
>>> [cyrus at imap2 ~]$ reconstruct -f -r user.cdm-lit
>>> user.cdm-lit
>>> user.cdm-lit.Sea Trail 2013
>>> user.cdm-lit.Sent uid 1 found - adding
>>> user.cdm-lit.Sent uid 2 found - adding
>>> user.cdm-lit.Sent uid 3 found - adding
>>> user.cdm-lit.Sent uid 4 found - adding
>>> ....
>>> user.cdm-lit.Sent uid 146 found - adding
>>> user.cdm-lit.Sent uid 147 found - adding
>>> user.cdm-lit.Sent uid 148 found - adding
>>> fatal error: Internal error: assertion failed: imap/mailbox.c: 2847:
>>> record->size
>>>
>>> Since I couldn't get a successful reconstruct on this account, I
>>> deleted it and recreated it from my master host. However, I'm still
>>> unable to get a successful reconstruct with "failed to read index
>>> header" for every subfolder and "fatal error: Internal error:
>>> assertion failed: imap/mailbox.c: 2847: record->size"
>>>
>>> Any ideas on how to correct this so I can see if I can then get past
>>> the replication error?
>>>
>>> Thanks!
>>>
>>> -Eric
>>>
>>>
>>> On 05/11/2017 11:46 AM, Eric Cunningham wrote:
>>>> Thanks Bron, but that doesn't seem to work for me, unless I'm
>>>> missing something.
>>>>
>>>> I ran reconstructs for this account on both my master and copy hosts:
>>>>
>>>> [cyrus at imap1 ~]$ reconstruct -f -r user.scramer
>>>> user.scramer
>>>> user.scramer.Archive
>>>> user.scramer.Archives
>>>> user.scramer.Archives.2004
>>>> ...
>>>> user.scramer.Trash.IS Networking and Operations
>>>> user.scramer.Trash.IS Security
>>>> user.scramer.Trash.IS Servers
>>>> user.scramer.Trash.IS Staff
>>>> user.scramer.Trash.IS Surveys
>>>> ...
>>>>
>>>>
>>>> [cyrus at imap2 ~]$ reconstruct -f -r user.scramer
>>>> user.scramer
>>>> user.scramer.Archive
>>>> user.scramer.Archives
>>>> user.scramer.Archives.2004
>>>> ...
>>>> user.scramer.Trash.IS Networking and Operations
>>>> user.scramer.Trash.IS Security
>>>> user.scramer.Trash.IS Servers
>>>> user.scramer.Trash.IS Staff
>>>> user.scramer.Trash.IS Surveys
>>>> ...
>>>>
>>>>
>>>> I then restart the replication against the log-run file and it again
>>>> fails on a folder that no longer exists in that account:
>>>>
>>>> ...
>>>> May 11 11:40:51 imap1 sync_client[60696]: sync_mailboxes: doing 1000
>>>> May 11 11:40:51 imap1 sync_client[60696]: IOERROR: zero length
>>>> response to MAILBOXES (end of file reached)
>>>> May 11 11:40:51 imap1 sync_client[60696]: IOERROR: zero length
>>>> response to MAILBOXES (end of file reached)
>>>> May 11 11:40:51 imap1 sync_client[60696]: IOERROR: zero length
>>>> response to UNMAILBOX (end of file reached)
>>>> May 11 11:40:51 imap1 sync_client[60696]: folder_delete(): failed:
>>>> user.scramer.Trash.IS Software Management 'Bad protocol'
>>>> May 11 11:40:51 imap1 sync_client[60696]: Error in do_sync():
>>>> bailing out! Bad protocol
>>>> May 11 11:40:51 imap1 sync_client[60696]: Processing sync log file
>>>> /var/spool/CyrusDB-Sieve/cyrusdb/sync/log-run failed: Bad protocol
>>>>
>>>>
>>>> -Eric
>>>>
>>>>
>>>> On 05/11/2017 02:14 AM, Bron Gondwana wrote:
>>>>> Looks like you have a corrupted mailboxes database - if you run a
>>>>> reconstruct at each end then it should be fine. Replication won't
>>>>> bail for an actually-doesn't-exist mailbox, but it can't fix broken
>>>>> mailboxes, that's what reconstruct is for.
>>>>>
>>>>> Bron.
>>>>>
>>>>>
>>>>> On Thu, 11 May 2017, at 01:01, Eric Cunningham wrote:
>>>>>> Hi list, I'm running cyrus replication on cyrus 2.5.10. It seems
>>>>>> whenever a folder is encountered in log-run that doesn't exist on the
>>>>>> client and/or the server, the replication process logs "Error in
>>>>>> do_sync(): bailing out! Bad protocol". Sometimes, it dies
>>>>>> completely,
>>>>>> other times it restarts at the beginning of the log-run file. But
>>>>>> when
>>>>>> it re-encounters the folder in question, the same failure occurs
>>>>>> and the
>>>>>> problem continuously loops until some manual action is taken. I
>>>>>> have a
>>>>>> checker script that attempts to restart a stopped sync but I haven't
>>>>>> tried to code it to get past these types of folder errors. In the
>>>>>> meantime, replication effectively stops and the replication log
>>>>>> continuously builds in size. You might imagine my horror at
>>>>>> discovering
>>>>>> yesterday that replication of my production server with 1200 accounts
>>>>>> had been failing for over 7 weeks unnoticed, generating 9.5 million
>>>>>> lines in the replication log files. I had gotten out of the habit of
>>>>>> daily monitoring of replication as it had been running fine and I
>>>>>> eventually moved on to other things. I'm trying to get
>>>>>> replication to
>>>>>> run through the enormous backlog but am running into more similar
>>>>>> folder
>>>>>> issues that cause the process to fail, I make manual fixes, and start
>>>>>> the process again. In my experience with this, reconstruction of the
>>>>>> affected folders on the master host and copy host has no effect.
>>>>>>
>>>>>> Can this otherwise excellent feature be made more robust to work past
>>>>>> these folder issues?
>>>>>>
>>>>>> Here are a couple of examples I've observed thus far of how the
>>>>>> process
>>>>>> is failing for me.
>>>>>>
>>>>>> --------------
>>>>>>
>>>>>> In this example, the subfolder "Fly America Foreign Travel" no longer
>>>>>> existed on the master host but did exist on the copy host. To
>>>>>> resolve
>>>>>> this, I had to manually delete that folder from the copy host.
>>>>>>
>>>>>> May 9 15:17:48 imap1 sync_client[34614]: inefficient replication
>>>>>> (8 >
>>>>>> 5) user.jtaft.PO.Fly America Foreign Travel
>>>>>> May 9 15:17:48 imap1 sync_client[34614]: FULLMAILBOX received NO
>>>>>> response: IMAP_MAILBOX_NONEXISTENT No Such Mailbox
>>>>>> May 9 15:17:48 imap1 sync_client[34614]: do_folders(): update
>>>>>> failed:
>>>>>> user.jtaft.PO.Fly America Foreign Travel 'Mailbox does not exist'
>>>>>> May 9 15:17:48 imap1 sync_client[34614]: IOERROR: Mailbox does
>>>>>> not exist
>>>>>> May 9 15:17:48 imap1 sync_client[34614]: Error in do_sync(): bailing
>>>>>> out! Mailbox does not exist
>>>>>> May 9 15:17:48 imap1 sync_client[34614]: Processing sync log file
>>>>>> /var/spool/CyrusDB-Sieve/cyrusdb/sync/log-run failed: Mailbox does
>>>>>> not exist
>>>>>> May 9 15:17:48 imap1 sync_client[34614]: Reprocessing sync log file
>>>>>> /var/spool/CyrusDB-Sieve/cyrusdb/sync/log-run
>>>>>>
>>>>>> --------------
>>>>>>
>>>>>> In this example, the subfolder "Sohn" did not exist on the master
>>>>>> host
>>>>>> nor on the copy host. To resolve, I had to edit log-run and
>>>>>> completely
>>>>>> remove all entries for this "Sohn" subfolder.
>>>>>>
>>>>>> May 10 10:13:26 imap1 sync_client[47011]: IOERROR: zero length
>>>>>> response
>>>>>> to MAILBOXES (end of file reached)
>>>>>> May 10 10:13:26 imap1 sync_client[47011]: IOERROR: zero length
>>>>>> response
>>>>>> to MAILBOXES (end of file reached)
>>>>>> May 10 10:13:26 imap1 sync_client[47011]: IOERROR: zero length
>>>>>> response
>>>>>> to UNMAILBOX (end of file reached)
>>>>>> May 10 10:13:26 imap1 sync_client[47011]: folder_delete(): failed:
>>>>>> user.gg-chair.Trash.Sohn 'Bad protocol'
>>>>>> May 10 10:13:26 imap1 sync_client[47011]: Error in do_sync(): bailing
>>>>>> out! Bad protocol
>>>>>> May 10 10:13:26 imap1 sync_client[47011]: Processing sync log file
>>>>>> /var/spool/CyrusDB-Sieve/cyrusdb/sync/log-run failed: Bad protocol
>>>>>>
>>>>>> --------------
>>>>>>
>>>>>> Often, errors are logged without "bailing out!" and replication
>>>>>> continues to run:
>>>>>>
>>>>>> May 10 10:45:00 imap1 sync_client[87438]: sync_mailboxes: doing 1000
>>>>>> May 10 10:45:01 imap1 sync_client[87438]: sync_mailboxes: doing 1000
>>>>>> May 10 10:45:24 imap1 sync_client[87438]: IOERROR: GUID mismatch
>>>>>> /var/spool/cyrus/mail/c/user/cdm-lit/Sent/148.
>>>>>> May 10 10:45:37 imap1 sync_client[87438]: IOERROR: zero length
>>>>>> response
>>>>>> to MAILBOX (end of file reached)
>>>>>> May 10 10:45:37 imap1 sync_client[87438]: do_folders(): update
>>>>>> failed:
>>>>>> user.cdm-lit.Sent 'Bad protocol'
>>>>>> May 10 10:45:37 imap1 sync_client[87438]: sync_mailboxes: doing 1000
>>>>>>
>>>>>> --------------
>>>>>>
>>>>>> Thank you, list!
>>>>>>
>>>>>> -Eric
>>>>>>
>>>>>> ----
>>>>>> Cyrus Home Page: http://www.cyrusimap.org/
>>>>>> List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
>>>>>> To Unsubscribe:
>>>>>> https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus
>>>>>
>>>>>
>>>>
>>>
>>> ----
>>> Cyrus Home Page: http://www.cyrusimap.org/
>>> List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
>>> To Unsubscribe:
>>> https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus
>>
>>
>>
>> ----
>> Cyrus Home Page: http://www.cyrusimap.org/
>> List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
>> To Unsubscribe:
>> https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus
>>
>
More information about the Info-cyrus
mailing list