sync_client fails to exit when manual replication and rolling replication are combined (2.3.16-8)

Simon Matter simon.matter at invoca.ch
Tue Oct 19 08:22:06 EDT 2010


> On Saturday, October 16, 2010 12:49 AM, Bron Gondwana wrote (2.3.16-8)
>>
>> On Fri, Oct 15, 2010 at 03:42:21PM -0400, Simpson, John R wrote:
>> > However, if I have run sync_client manually while rolling replication
>> > is enabled the rolling replication instance will not exit.  Instead, it
>> > appears to start spawning subprocesses and throwing database errors.
>> > The change in database errors (below) appears to coincide with the
>> > completion of "Exporting cyrus-imapd databases".  The critical DB error
>> > messages continue until sync_client is killed.
>>
>> [ ... ]
>>
>> > Oct 15 14:51:41 eml-store04 sync_client[25333]: DBERROR db4: PANIC: fatal region error detected; run recovery
>> > Oct 15 14:51:41 eml-store04 sync_client[25333]: DBERROR: critical database situation
>> > Oct 15 14:51:41 eml-store04 sync_client[25353]: DBERROR db4: PANIC: fatal region error detected; run recovery
>> > Oct 15 14:51:41 eml-store04 sync_client[25353]: DBERROR: critical database situation
>> > ... continue until sync_client is killed ...
>>
>> Nothing magic about sync_client itself here - it's something with the
>> hand run sync_client and attaching/detaching from the environment.  This
>> has been on the TODO list at FastMail for a while - and your information
>> may actually help us narrow down the cause.  We don't use BDB, but the
>> log messages annoy us too!
>
> If there's anything else I can do to help track this down, please let me
> know.  It was interesting to see that the errors were coming from the
> original, rolling replication sync_client process, not a manually
> initiated sync_client that didn't exit properly.
>
> The reason I'm running sync_client manually is to seed the replica with
> the existing users and mailboxes on the master server, as described in
> http://www.cyrusimap.org/docs/cyrus-imapd/2.3.16/install-replication.php.
> Would it be better to use rsync?
>
> Is there any reason not to add code to clean up any remaining sync_client
> processes to the "stop" function in /etc/rc.d/init.d/cyrus-imapd?

Yes, that could get a little tricky because the init script has
multi-instance support, so you would have to identify not only the
sync_client processes running outside master but also which instance they
belong to. Besides that, init scripts usually _only_ terminate services
they have started, not anything else.

>
> I am pretty sure we're not using BDB either, but I found a log file,
> /var/lib/imap/db/log.0000000001, that appears to be a Berkeley DB log
> file.

Right, as long as BDBless builds are not possible, we will always see BDB
being initialized, even if not used :(

BTW, I can't help you much with sync_client because I have never used it.
However, I'm quite sure shutting down cyrus-imapd while sync_client is
running may do bad things with your databases. The init script tries to
convert all BDB databases to skiplist and then clean up the BDB environment.
I guess that's not good while sync_client is running.

Simon
