giving sync_client the ability to back off and retry locked mailboxes
Giles Malet
gdmalet at uwaterloo.ca
Wed Jun 29 10:20:58 EDT 2016
On Wed, 22 Jun 2016 12:25:28 +1000
ellie timoney via Cyrus-devel <cyrus-devel at lists.andrew.cmu.edu> wrote:
> D) Don't add the new sync_action_list. If any operation returns
> IMAP_MAILBOX_LOCKED, just sync_log() that operation and continue, and
> let the next run deal with it.
I meant to comment on this a while ago, and your latest message just reminded me.
The current sync_client has an awful habit of just quitting when things go wrong ("Bailing out!"). This is not ideal for a system that is trying hard to keep the replica in sync. So we have a script that watches for this happening, and restarts it. A problem though with simply restarting is that whatever caused the bail-out is still there, and it will happen again. So we move the old log out of the way, do one more try on that log, then discard it.
This way at least most stuff is kept in sync, and replication is still running. We might lose a small number of changes, but that is preferable to losing a large number of changes when the client dies.
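A minimal sketch of the workaround just described, in Python. The function name, the ".retry" suffix, and the `replay` callable are assumptions made for illustration, not the actual script; `replay` stands in for whatever runs sync_client against the moved log and reports success.

```python
import os
import tempfile
from typing import Callable


def retry_once_then_discard(log_path: str,
                            replay: Callable[[str], bool],
                            suffix: str = ".retry") -> bool:
    """Move the log aside, replay it exactly once, then discard it.

    `replay` is a hypothetical callable: it is given the path of the
    moved log and returns True if the replay succeeded.
    """
    if not os.path.exists(log_path):
        return True  # nothing was pending, so nothing can be lost

    aside = log_path + suffix
    # Move the old log out of the way so new changes can be logged
    # at log_path while the old entries get their one retry.
    os.replace(log_path, aside)

    try:
        ok = replay(aside)
    except Exception:
        ok = False  # a crashing replay counts as a failed retry
    finally:
        # Discard the old log whether or not the retry worked, so the
        # same bad entry cannot cause another bail-out.
        if os.path.exists(aside):
            os.remove(aside)
    return ok
```

A False return is the signal that some changes were dropped and that a separate check (reconstruct, or a targeted sync) is needed.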
This is where we are now. It's not ideal, but mostly works. Separately we have to notice that there was a problem and reconstruct or whatever, and perhaps sync the problematic client.
Anyhow, hopefully this is something to keep in mind with your latest changes: don't get stuck in a loop if something is corrupted, which does happen sometimes; & don't just quit and lose all changes!
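To make those two points concrete, here is a rough Python sketch of a single pass that neither quits nor loops. Everything in it is hypothetical: `MailboxLocked` stands in for an IMAP_MAILBOX_LOCKED result, and `run_once` and `apply` are illustrative names, not anything in sync_client.

```python
from typing import Callable, Iterable, List, Tuple


class MailboxLocked(Exception):
    """Hypothetical stand-in for an IMAP_MAILBOX_LOCKED result."""


def run_once(actions: Iterable[str],
             apply: Callable[[str], None]) -> Tuple[List[str], List[str]]:
    """Try every action exactly once and never abort the whole run.

    Returns (deferred, failed):
      deferred - actions that hit a lock; safe to log again for the next run
      failed   - actions that broke for another reason (e.g. corruption);
                 report these, but do not re-queue them automatically
    """
    deferred: List[str] = []
    failed: List[str] = []
    for action in actions:
        try:
            apply(action)
        except MailboxLocked:
            deferred.append(action)   # transient: let the next run deal with it
        except Exception:
            failed.append(action)     # persistent: skip it and keep going
    return deferred, failed
```

Only the deferred list goes back into the log, so a corrupted mailbox is tried once per run rather than retried forever, and one bad entry does not cost the rest of the changes.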
Thanks for your work.
g