Making Replication Robust

Rob Mueller robm at fastmail.fm
Fri Oct 12 23:18:24 EDT 2007


>> Or is the problem that you have something like:
>>
>> write to file 1
>> write to file 2
>>
>> And if the first returns EINTR but is ignored, and then it writes the 
>> complete data to the second, things are in an inconsistent state?
>
> This is my concern.

Doing an "ack 'write\('" reveals a scary mix of write, retry_write and 
fwrite calls. My initial reaction was that binary files use 
open/retry_write and text files use fopen/fwrite, but that doesn't quite 
seem to be the case...

mailbox.c
1242:    r = write(newheader_fd, MAILBOX_HEADER_MAGIC,
1359:    n = retry_write(mailbox->index_fd, buf, header_size);
1428:    n = retry_write(mailbox->index_fd, buf, INDEX_RECORD_SIZE);
1477:    n = retry_write(mailbox->index_fd, buf, len);
1642:    fwrite(buf, 1, INDEX_HEADER_SIZE, newindex);
1659:            fwrite(bufp, INDEX_RECORD_SIZE, 1, newindex);
1710:        fwrite(buf, INDEX_RECORD_SIZE, 1, newindex);
1721:   fwrite(buf+OFFSET_DELETED,
1952:           n = retry_write(expunge_fd, buf, mailbox->record_size);
1979:       if (newindex) fwrite(buf, 1, mailbox->record_size, newindex);
1999:       /* fwrite will automatically call write() in a sane way */
2000:       fwrite(cacheitembegin, 1, cache_record_size, newcache);
2004:       fwrite(buf, 1, mailbox->record_size, newindex);
2058:    fwrite(buf, 1, mailbox->start_offset, newindex);
2215:   fwrite(buf, 1, sizeof(bit32), newcache);
2219:    fwrite(buf, 1, mailbox->start_offset, newindex);
2263:           n = retry_write(expunge_fd, buf, mailbox->start_offset);
2342:   r = quota_write(&mailbox->quota, &tid);
2363:       fwrite(buf, 1, mailbox->start_offset, newexpungeindex);
2424:   n = retry_write(expunge_fd, buf, mailbox->start_offset);
2719:   n = retry_write(mailbox.cache_fd, (char *)&mailbox.generation_no, 4);
2823:   r = quota_write(&mailbox->quota, &tid);
3056:   r = quota_write(&(newmailbox->quota), &tid);
3309:   r = quota_write(&newmailbox.quota, &tid);
3319:       r2 = quota_write(&newmailbox.quota, &tid);
3398:    n = retry_write(destfd, src_base, src_size);

The code seems to mix fds and FILE * structs all over the place. *sigh*

Does fwrite() retry a write on EINTR? It looks like that's the whole point 
of retry_write() anyway.

If fwrite() does retry, then about the only other work would be changing any 
naked write() calls to retry_write(), and there don't actually seem to be 
that many of those.
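
For reference, the retry-on-EINTR pattern being discussed can be sketched 
roughly as below. This is only an illustration and not the actual Cyrus 
retry_write() source: the name write_fully() is hypothetical, and the sketch 
assumes a blocking fd where restarting after EINTR and continuing after a 
short write is the desired behaviour.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (NOT the Cyrus retry_write() implementation).
 * Keeps calling write() until all nbyte bytes are written, restarting
 * when a call is interrupted by a signal (EINTR) and continuing after
 * a short write. Returns nbyte on success, -1 on any other error.
 */
ssize_t write_fully(int fd, const void *buf, size_t nbyte)
{
    const char *p = buf;
    size_t left = nbyte;

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing: try again */
            return -1;          /* real error: errno is left for the caller */
        }
        p += n;                 /* short write: advance past what was written */
        left -= (size_t)n;
    }
    return (ssize_t)nbyte;
}
```

A naked "r = write(fd, buf, len);" would then become 
"r = write_fully(fd, buf, len);", with the caller still checking for -1.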

Thoughts?

Rob


