time for cyrus-imap v3.2?

Tue Nov 12 04:20:10 EST 2019

On Tue, Nov 12, 2019, at 14:50, Anatoli wrote:
> Bron,
> 
> The proposed algo is a barrier before any single-lock. In itself it's a
> single lock, but the same code (the pseudocode for the *worker thread*
> in my previous mail) should be inserted at *every* single-lock/write
> operation location. If there's no need to pause, the overhead is
> non-existent. If a pause is requested, all worker threads would pause at
> the entrance to any single-lock/write code.
> 
> It would make the entire Cyrus daemon complete all pending write
> operations and pause new ones. At this stage, if I understand it
> correctly, the data on disk would be in a consistent state, ready to
> take a snapshot or to perform some other operation.

"complete all pending write operations and pause new ones"

How do you know when the current write operations are finished?

> Without that, if we just take a snapshot of the fs, it could happen that
> a) some files are not written entirely (i.e. caught in the middle of a
> write operation) or b) the contents of some files are newer than
> others, i.e. the logical write operation was not atomic (e.g. mail data
> is written but indexes are not updated yet or something similar).
> 
> Maybe I didn't understand you correctly. Do you mean that finishing all
> writes and pausing new ones is not enough to guarantee an integral state
> of files on disk? If it's the case, what would have to be done to
> guarantee it (i.e. to make it as if Cyrus had been shut down normally)?

I mean that to finish all writes and pause new ones, you need to know that the writes are finished. And not just writes, but sets of writes that are held under a lock together. The way I know to do this is a single global lock with the following properties:

1) every action which locks any file within Cyrus for writing takes a SHARED global lock before it takes the write lock on the file.

2) the SHARED lock is held for the duration of the writes, and released once the writes are finished.

3) the "backup utility" takes an EXCLUSIVE lock on the global lock, which will only be granted once every in-progress write has finished and released its SHARED lock. It then takes a snapshot, and releases the EXCLUSIVE lock.

This guarantees full consistency.
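To make the three properties concrete, here is a minimal sketch using flock(2) through Python's fcntl module. It is not Cyrus code: the lock file path and the names global_lock, write_mailbox and snapshot are hypothetical, and a real implementation would live in Cyrus's C locking layer.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def global_lock(path, exclusive=False):
    """Hold the global lock: SHARED for writers, EXCLUSIVE for the backup utility."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Blocks until compatible: LOCK_EX waits for all LOCK_SH holders to finish.
        fcntl.flock(fd, fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH)
        try:
            yield
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


def write_mailbox(lock_path, mailbox_path, data):
    """Properties 1 and 2: SHARED global lock first, then the file's own write lock."""
    with global_lock(lock_path):
        with open(mailbox_path, "a") as f:
            fcntl.flock(f, fcntl.LOCK_EX)  # per-file write lock (hypothetical stand-in)
            f.write(data)
            f.flush()
            fcntl.flock(f, fcntl.LOCK_UN)


def snapshot(lock_path, take_snapshot):
    """Property 3: EXCLUSIVE lock is granted only once no writer holds SHARED."""
    with global_lock(lock_path, exclusive=True):
        return take_snapshot()
```

While snapshot() holds the EXCLUSIVE lock, any new writer blocks in global_lock() before touching a file, so the callback sees only completed sets of writes.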

The question that always exists for locks is "what granularity" - too wide, and you hold the lock for a long time. Too narrow, and you take and release it very frequently, adding overhead.

My first and dumbest theory is to go quite wide - add the lock in every runloop and command line utility such that it's held for the entire running of the loop or the utility! Mostly these are done within a fraction of a second. The one place that might be interesting is FETCH 1:* RFC822.PEEK or similar in imapd, where we already have some locking magic that holds a shared namelock on the mailbox to stop repacking while it releases the index lock to allow other actions on the mailbox in the meantime.

So we could go down a layer and only lock when we lock mailboxes or cyrusdbs, and refcount the global lock. This seems more likely to be a good layer, and not too horrible.
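A refcounted global lock could look roughly like the sketch below. It assumes one single-threaded process per connection, and the class name RefcountedGlobalLock and its methods are made up for illustration. The first mailbox or cyrusdb lock in the process takes the SHARED flock, nested locks only bump a counter, and the last release drops it.

```python
import fcntl
import os


class RefcountedGlobalLock:
    """Per-process SHARED global lock, taken on the first acquire and
    released on the last release. Not thread-safe (assumed single-threaded)."""

    def __init__(self, path):
        self.path = path
        self.fd = None
        self.depth = 0

    def acquire(self):
        if self.depth == 0:
            fd = os.open(self.path, os.O_RDWR | os.O_CREAT, 0o600)
            try:
                fcntl.flock(fd, fcntl.LOCK_SH)  # blocks while a snapshot holds LOCK_EX
            except OSError:
                os.close(fd)
                raise
            self.fd = fd
        self.depth += 1

    def release(self):
        if self.depth <= 0:
            raise RuntimeError("global lock released without matching acquire")
        self.depth -= 1
        if self.depth == 0:
            fcntl.flock(self.fd, fcntl.LOCK_UN)
            os.close(self.fd)
            self.fd = None
```

Each mailbox or cyrusdb lock routine would call acquire() before taking its own lock and release() after dropping it, so a process holding several locks at once still holds only one SHARED global lock.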

The other thing is that we'll need to assert that the lock isn't being held during each daemon's command loop, so that bugs don't leak out to deadlock entire servers.
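One way to read that check, sketched with hypothetical names (run_command_loop, global_lock_depth): after each command completes, and before the daemon waits for the next one, the refcount must be back at zero. This is only an illustration of the idea, not how any Cyrus daemon is structured.

```python
def run_command_loop(commands, global_lock_depth):
    """Run commands in order. After each one, fail loudly if the global
    lock refcount is non-zero, so a leak is caught in the process that
    caused it instead of stalling a later snapshot."""
    handled = 0
    for command in commands:
        command()
        depth = global_lock_depth()
        if depth != 0:
            # An explicit exception rather than `assert`, so the check
            # still runs when Python is started with -O.
            raise RuntimeError(
                "global lock still held after command (depth=%d)" % depth
            )
        handled += 1
    return handled
```

Failing in the offending process matters because a leaked SHARED lock would otherwise keep the EXCLUSIVE request waiting indefinitely, and every writer queued behind it.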

And I think that's nearly it :)

Bron.

--
 Bron Gondwana, CEO, Fastmail Pty Ltd
 brong at fastmailteam.com
