Making Replication Robust

Rob Mueller robm at fastmail.fm
Fri Oct 12 09:51:43 EDT 2007


> This would seem to be a significant advantage of running sync_client
> outside master.
>
> When I shut down master, sync_client continues to process the outstanding 
> log. I can then use sync_shutdown_file when it has finished and is idle.

We do something similar.

But it means you have to develop a bunch of your own infrastructure to make 
cyrus replication robust. It's not currently a "start it and it just works 
until you shut it down" solution, which means either people have to 
replicate the same extra infrastructure work everywhere separately, or 
people are going to get burnt not realising that what they're doing isn't 
safe.

That's why I'd really like this to be in cyrus itself. I think we should be 
able to say in the documentation something like:

"Shutting down a cyrus master with a SIGQUIT ensures that all actions have 
been replicated to the replica side".

It makes writing init scripts and the like a lot easier.

> There seems to be a split of opinion between BSD and SVR4: BSD tries to
> retry while SVR4 throws EINTR. Linux of course can work either way:
>
> http://www.gnu.org/software/libc/manual/html_node/Interrupted-Primitives.html

Aren't a lot of writes already wrapped up in some retry_write() function? I 
admit I haven't looked closely.
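
For reference, a retry wrapper of the kind mentioned above usually looks 
something like the sketch below. This is not the Cyrus retry_write() code 
(which I haven't checked); write_fully() is a hypothetical name, and the 
sketch only assumes plain POSIX write() semantics: restart on EINTR, and 
carry on after a short write.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not the Cyrus implementation: write all n bytes,
 * retrying on EINTR and continuing after short writes.
 * Returns n on success, -1 on any other error (errno is left as set
 * by write()). */
ssize_t write_fully(int fd, const void *buf, size_t n)
{
    const char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t w = write(fd, p, left);
        if (w < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before any data was written */
            return -1;      /* real error: give up */
        }
        p += w;             /* short write: advance and try the rest */
        left -= (size_t)w;
    }
    return (ssize_t)n;
}
```

With a wrapper like this, a caller never sees EINTR at all, so the BSD vs 
SVR4 difference stops mattering for that call site.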

Anyway, is this really a problem? Basically, shouldn't you be able to kill 
cyrus at any point and have the files left in a consistent, restartable 
state? If so, if something returns EINTR, won't it just move on and 
eventually exit? Or is the problem that you have something like:

write to file 1
write to file 2

And if the first returns EINTR but the error is ignored, and it then writes 
the complete data to the second, things are in an inconsistent state?
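
To make that scenario concrete, here is a minimal sketch in which the 
result of the first write is checked before the second is attempted. 
write_pair() is a hypothetical function for illustration only, not anything 
in the Cyrus source, and it assumes the same data goes to both descriptors.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical illustration: write the same record to two files.
 * The second write is only attempted if the first one completed, so an
 * ignored EINTR on fd1 cannot leave fd2 ahead of fd1.
 * Returns 0 on success, -1 on failure. */
int write_pair(int fd1, int fd2, const void *buf, size_t n)
{
    ssize_t w = write(fd1, buf, n);
    if (w < 0 || (size_t)w != n)
        return -1;  /* EINTR, short write or other error: fd2 untouched */

    w = write(fd2, buf, n);
    if (w < 0 || (size_t)w != n)
        return -1;  /* fd1 updated but fd2 not: caller has to recover */

    return 0;
}
```

Note that this only avoids the "first write silently lost" case. If the 
second write fails, the two files still disagree, so the files would need 
to be recoverable on restart either way.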

Rob


