2.3.9 mod time bug

John Capo jc at irbs.com
Sat Sep 1 09:40:03 EDT 2007


Quoting Bron Gondwana (brong at fastmail.fm):
> On Fri, Aug 31, 2007 at 08:19:58PM -0400, John Capo wrote:
> > For as long as I can remember I have sent the file modtime to the
> > sync_server in the protocol stream.  Keeping the modtimes on the
> > master and the replica in sync becomes important when doing rsync
> > snapshots to a backup machine from the replica.  When switching to
> > the replica and then doing backups from the new replica (old master),
> > out-of-sync modtimes cause rather large data transfers to the
> > backup machine. :-(
> 
> Ahh - this isn't an issue for us because we parse the index files
> and only sync new files.  Cyrus spool files never change - or if
> they do it's a bug!  We optimise that case away and will only
> refetch if the UUID has changed (with md5uuid, this means only
> if the message content has changed)
> 
> > So I thought I would try the new internaldate scheme in 2.3.9 for
> > setting modtimes instead of my scheme but it does not work right
> > on FreeBSD 4.11 and probably not on other BSD systems.  Dunno about
> > Linux.
> 
> I'm not sure about to-a-second accuracy.  A few seconds either side
> didn't really worry me.

The modtime is extremely important when using rsync for backups.  Rsync
knows nothing about index files or UUIDs; by default it decides what
to transfer from file size and modtime alone.

> > The calls to utime() (obsolete on BSD) in cmd_upload() set the
> > modtime OK but some data written with fwrite() is still buffered
> > in the stdio buffers.  The buffered data is not flushed to disk
> > till an fclose() call which may be a few to many seconds later.
> > The data flushed by fclose() updates the modtime, overwriting the
> > internaldate that utime() had set.
> 
> Yeah, I have to admit - now that we parse the internaldate from
> the Received headers anyway we don't care so much so I haven't
> been maintaining and testing this patch.
> 
> I'd be happy to abstract out this section and put in a test in 
> configure.in for the BSD equivalent for utime if it's going to
> be necessary in the near future.  Depends how "obsolete" it is!
>  
> > Another thing, the fsync()/fclose() calls in this loop in
> > sync_message_fsync() should be reversed, at least on BSD systems.
> > 
> >     /* fsync() files in reverse order: ReiserFS FAQ indicates that this
> >      * gives best potential for optimisation */
> >     for (i = (l->file_count-1) ; i >= 0 ; i--) {
> >         fsync(fileno(l->file[i]));
> >         fclose(l->file[i]);
> >         l->file[i] = NULL;
> >     }
> 
> Presumably this would solve your issue completely, since there would
> be nothing left to flush?

The modtime will still be different from the master because the
function above is called after all files in the batch are transferred.
That could be many seconds later.  I am going to use my scheme that
ensures modtimes are in sync.

> > fsync() is a system call on BSD systems.  It will flush the system
> > I/O buffers but the stdio buffers may still contain unwritten data.
> > fsync() may be hooked into stdio on Linux but not on BSD.
> 
> I'm not sure to be honest - though I imagine this was originally
> designed on Solaris anyway?
> 
> Bron.
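
On the fsync() point, a minimal sketch of one ordering that does not
depend on fsync() knowing anything about stdio.  It is an illustration
only: flush_sync_close() is a hypothetical stand-in for the loop in
sync_message_fsync(), not the actual Cyrus code.  Since fclose()
releases the descriptor, the sketch calls fflush() first to push the
stdio buffer to the kernel, then fsync(), then fclose().

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical stand-in for the loop in sync_message_fsync():
 * push stdio buffers to the kernel, force them to disk, then close.
 * Files are handled in reverse order, as in the original loop.
 * Returns 0 if every step succeeded, -1 otherwise. */
int flush_sync_close(FILE **files, int count)
{
    int rc = 0;
    int i;

    for (i = count - 1; i >= 0; i--) {
        if (files[i] == NULL)
            continue;
        if (fflush(files[i]) != 0)          /* stdio buffer -> kernel */
            rc = -1;
        if (fsync(fileno(files[i])) != 0)   /* kernel buffers -> disk */
            rc = -1;
        if (fclose(files[i]) != 0)          /* nothing left to flush */
            rc = -1;
        files[i] = NULL;
    }
    return rc;
}
```

With this ordering fclose() has no pending data to write, so it should
not touch the modtime again.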

