Making Replication Robust

David Carter dpc22 at cam.ac.uk
Fri Oct 12 05:55:04 EDT 2007


On Wed, 10 Oct 2007, Rob Mueller wrote:

> I think the problem at the moment is that the process you really want is:
>
> 1. Stop new imap/pop/lmtp/sieve/etc connections
> 2. Finish and close existing connections cleanly but as quickly as possible
> 3. Finish running any sync log files
> 4. Fully shutdown
>
> There's currently no clean way to do this. Basically you have to SIGTERM 
> master which hard kills it and all children, then manually run 
> sync_client -f on any remaining log files.

This would seem to be a significant advantage of running sync_client 
outside master.

When I shut down master, sync_client continues to process the outstanding 
log. I can then use sync_shutdown_file when it has finished and is idle.

sync_client could catch SIGQUIT to initiate some form of clean shutdown.

I'm still a little bothered about signal handling and EINTR. I did some 
experiments after our last chat about signals. In practice disk IO system 
calls seem to be reasonably safe against EINTR on both Linux and Solaris, 
but a trip to Google suggests that there are few guarantees:

http://archives.postgresql.org/pgsql-hackers/2005-12/msg01259.php

There seems to be a split of opinion between BSD and SVR4: BSD restarts
the interrupted call, while SVR4 fails it with EINTR. Linux of course can
work either way:

http://www.gnu.org/software/libc/manual/html_node/Interrupted-Primitives.html
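
If we don't want to rely on the platform restarting calls for us (which sigaction's SA_RESTART flag requests, where it is honoured), the defensive option is to retry explicitly. A minimal sketch, assuming plain read() and write() on a file descriptor; the helper names read_retrying and write_all are invented for this example and are not claimed to match anything in the Cyrus tree:

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* read() that is retried if a signal interrupts it before any data
 * has been transferred. Returns what read() returns otherwise. */
static ssize_t read_retrying(int fd, void *buf, size_t len)
{
    ssize_t n;

    do {
        n = read(fd, buf, len);
    } while (n < 0 && errno == EINTR);
    return n;
}

/* Write all len bytes, retrying on EINTR and continuing after short
 * writes. Returns len on success, -1 on any other error. */
static ssize_t write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t left = len;

    while (left > 0) {
        ssize_t n = write(fd, p, left);

        if (n < 0) {
            if (errno == EINTR) continue;   /* interrupted: try again */
            return -1;                      /* genuine error */
        }
        p += n;
        left -= (size_t)n;
    }
    return (ssize_t)len;
}
```

Wrappers like this behave the same whether the kernel follows the BSD or the SVR4 convention, at the cost of having to use them consistently.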

-- 
David Carter                             Email: David.Carter at ucs.cam.ac.uk
University Computing Service,            Phone: (01223) 334502
New Museums Site, Pembroke Street,       Fax:   (01223) 334679
Cambridge UK. CB2 3QH.

