How to clean up after syncserver

Michael Glad glad at cs.au.dk
Tue Apr 14 03:43:10 EDT 2009


Bron Gondwana wrote:
> On Mon, 13 Apr 2009 21:29 +0200, "Michael Glad" <glad at cs.au.dk> wrote:
>   
>> Cyrus being restarted / sync server abending, apparently causes it to 
>> leave sub dirs in the 'sync.' dirs containing hard links  to messages.
>> They currently sum to 40k+ links on one of my replicas :-( .
>>     
>
> That's a lot of sync server failures!
>
>   
I restart cyrus on my replicas once a day as part of the backup
procedure, and whatever hard links sync_server has left in its sync.
subdirs contribute to the 40k count.

And it seems that sync_server has a potential of leaving links:

[root at replica# date; find sync. g/user/glad -inum 136643179 -ls
Tue Apr 14 09:25:44 CEST 2009
136643179    4 -rw-------   3 cyrus    mail          935 Apr 14 09:15 sync./29594/65.
136643179    4 -rw-------   3 cyrus    mail          935 Apr 14 09:15 g/user/glad/71229.
136643179    4 -rw-------   3 cyrus    mail          935 Apr 14 09:15 g/user/glad/Incoming/499.

Why doesn't it (after 10 minutes & counting) remove the sync./29594/65.
link after successfully having created the two proper links (I have a
Sieve script that files a copy of all mails in the Incoming folder)?

As I write this sentence, 15 minutes have passed and the link is gone.
But leaving all those links around for extended periods seems like an
invitation for bad things to happen and for garbage to pile up. :-(
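
The check above can be repeated for any staged file. The following is a minimal sketch, assuming the staging directory is named sync. and sits directly under the partition root as in the listing; the function names count_sync_links and links_of are made up for illustration and are not part of Cyrus.

```shell
# Sketch only: $1 is the partition root that holds the sync. directory,
# as in the listing above. Both function names are hypothetical.

# Count regular files still staged in sub dirs of sync.
count_sync_links() {
    find "$1/sync." -mindepth 2 -type f | wc -l | tr -d ' '
}

# Print every path under the partition that shares an inode with a staged file.
# -samefile is supported by GNU and BSD find.
links_of() {
    find "$1" -samefile "$2"
}
```

Run from the partition directory, `count_sync_links .` gives the number of staged files, and `links_of . sync./29594/65.` lists the names sharing that inode, without having to look up the inode number first.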

>> I noticed them during an as yet unsuccessful attempt to find out why 
>> message body inconsistencies now and then occur
>> between master and replica.
>>     
>
> We should have fixed that a while back - but yes, for a little while it would
> just open the hardlinked file and truncate it without unlinking first - causing
> random corruption.  Really, REALLY annoying.  The bug certainly doesn't
> exist in 2.3.14 - I spent quite a while auditing all the paths that open files!
>   
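
The failure mode described above, truncating a hard-linked file in place, can be illustrated outside Cyrus. This is a minimal sketch of the unlink-first pattern in general, not the actual sync_server code; safe_overwrite is a hypothetical name.

```shell
# Sketch only: replace the contents of one path without touching other
# hard links to the same inode. safe_overwrite is a hypothetical helper.
safe_overwrite() {
    target=$1
    # Drop this name first; any other links keep the old inode and contents.
    rm -f -- "$target"
    # Create a fresh inode at the same path from stdin.
    cat > "$target"
}
```

Writing with a plain `> file` redirect instead would truncate the shared inode, so every other name linked to it would see the new contents as well.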
I've browsed the sync_server code and have noticed that care is being
taken not to overwrite left-over links but it seems that this is
actually still happening. I'm quite sure that I see the problems with
2.3.14.

>> The 40k+ entries annoy me and I have an uneasy feeling that they may be 
>> involved in creating the inconsistencies.
>> So is there a way to clean up the sync. dirs -- can one just remove 
>> the sub dirs / hard links while the sync server is down?
>>     
>
> Yes, you certainly can.
>
>   
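
A cleanup along these lines could be sketched as follows. It assumes Cyrus has already been stopped and that the staging directory is named sync. under the partition root; clean_sync_dirs is a hypothetical name, and the function makes no attempt to check whether sync_server is still running.

```shell
# Sketch only: remove the sub dirs that sync_server left under sync.
# Assumption: the caller has stopped Cyrus first. $1 is the partition root.
clean_sync_dirs() {
    # -maxdepth 1 keeps find out of the sub dirs it is about to remove;
    # the sync. directory itself is left in place.
    find "$1/sync." -mindepth 1 -maxdepth 1 -type d -exec rm -rf {} +
}
```

Removing a staged file only drops that one name, so the copies linked into the mailbox directories are unaffected.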
>> ------------------------
>> The inconsistencies seem to arise when a user deletes a message, thereby 
>> copying it to a trash folder.
>> Now and then, this action on the replica nukes a message file owned by 
>> another user on the same cyrus partition, so
>> that the trash file and the two user mailbox files are now hardlinked 
>> together.
>>     
>
> It's any COPY action actually.  Trash is just a common target for copies.
>
>   
>> Environment: 2.3.14 + FM patches as of 2009-03-31 on RHEL 5.3/x86_64.
>> Single instance store + fast rename + delayed delete + delayed expunge
>>     
>
> Yikes... I really should go audit that fastrename code again :)  We're not
> running it ourselves due to some worries about concurrency safety... I'm
> pretty sure it's OK actually, but it could do with a re-read.
>   
OK.

    -- Michael

