time for cyrus-imap v3.2?

Wed Nov 6 23:46:41 EST 2019

Bron,

Thanks for your detailed reply and the work the FM team is doing!

> This is not easy unfortunately with all the different datastructures,
> because it means that everything else which takes a lock is going to
> need to first take a global shared lock before it does anything else,
> and that's going to have a performance and complexity impact on
> everything - because you have to find them ALL or you might wind up
> with lock inversions down the line.

One solution I see is a single separate barrier rather than a lock per
se. Before every write operation the code atomically checks a flag that
says write operations should pause: it reads the flag with an atomic
operation like __atomic_load, and if the flag is ON it sleeps for some
milliseconds and tries again. If the flag is OFF, the write continues.
Even before checking the barrier, the code atomically increments (with
something like __atomic_fetch_add) a counter of the write operations
currently in execution.

The pseudocode for the worker thread before it starts a write operation
would be like this:

__start:
  atomic_inc(write_operations_in_execution_counter)
  if (atomic_load(write_barrier) == ON) {
    atomic_dec(write_operations_in_execution_counter)
    sleep(100ms)
    goto __start
  } else {
    perform_write_operation_with_its_own_locks()
    atomic_dec(write_operations_in_execution_counter)
  }
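
The worker side of the pseudocode above could be sketched in C11 roughly
as follows. This is only an illustration under assumptions: it uses
<stdatomic.h> instead of the __atomic builtins, the names write_barrier,
writes_in_flight, write_try_enter, write_enter and write_leave are
hypothetical, and it assumes the flag and the counter are visible to all
writers (they would have to live in shared memory if the writers are
separate processes).

```c
#define _POSIX_C_SOURCE 200809L
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical shared state, named after the pseudocode. */
static atomic_bool write_barrier = false;   /* ON == true */
static atomic_int  writes_in_flight = 0;    /* writes currently running */

/* Sleep for the given number of milliseconds. */
static void sleep_ms(long ms)
{
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* One attempt to enter a write section.
 * The counter is incremented BEFORE the flag is read, so a writer that
 * saw the barrier OFF is always counted by the thread raising it. */
static bool write_try_enter(void)
{
    atomic_fetch_add(&writes_in_flight, 1);
    if (atomic_load(&write_barrier)) {
        atomic_fetch_sub(&writes_in_flight, 1);  /* back off */
        return false;
    }
    return true;
}

/* Block (by polling every 100 ms) until the write may start. */
static void write_enter(void)
{
    while (!write_try_enter())
        sleep_ms(100);
}

/* Call after the write operation, with its own locks, has finished. */
static void write_leave(void)
{
    atomic_fetch_sub(&writes_in_flight, 1);
}
```

A write path would then call write_enter(), do its work under its usual
locks, and call write_leave() on every exit path.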

And the code that sets the barrier would look something like this:

atomic_store(write_barrier, ON)

__check_write_counter:
  if (atomic_load(write_operations_in_execution_counter) == 0) {
    sync_data_to_disk()
    signal_data_ready()
    wait_for_lock_release_with_timeout(5s)
    atomic_store(write_barrier, OFF)
  } else {
    sleep(1ms)
    goto __check_write_counter
  }
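
The barrier-setting side could look roughly like this in C11. Again this
is a sketch under assumptions, not a proposal for the actual code: the
names with_writes_paused and count_call are hypothetical, the callback
stands in for sync_data_to_disk(), signal_data_ready() and the wait for
the snapshot to finish, and the drain timeout is my addition (the
pseudocode above waits indefinitely for the counter to reach 0).

```c
#define _POSIX_C_SOURCE 200809L
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical shared state, named after the pseudocode. */
static atomic_bool write_barrier = false;   /* ON == true */
static atomic_int  writes_in_flight = 0;    /* writes currently running */

/* Sleep for the given number of milliseconds. */
static void sleep_ms(long ms)
{
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Raise the barrier, poll every 1 ms until in-flight writes have
 * drained (giving up after roughly timeout_ms polls), run the snapshot
 * callback, then lower the barrier again.
 * Returns true if the writes drained and the callback ran. */
static bool with_writes_paused(void (*snapshot)(void *), void *arg,
                               long timeout_ms)
{
    bool drained = false;

    atomic_store(&write_barrier, true);
    for (long waited = 0; waited <= timeout_ms; waited++) {
        if (atomic_load(&writes_in_flight) == 0) {
            drained = true;
            break;
        }
        sleep_ms(1);
    }
    if (drained)
        snapshot(arg);  /* sync to disk, signal, wait for the snapshot */

    /* Always lower the barrier, even if the drain timed out. */
    atomic_store(&write_barrier, false);
    return drained;
}

/* Example callback: just counts how many times it was invoked. */
static void count_call(void *arg)
{
    ++*(int *)arg;
}
```

A backup tool would pass its own callback in place of count_call. The
barrier is lowered on every path so that a failed or timed-out snapshot
cannot leave writers paused.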

So in the normal case the overhead would be about 3 (the number of
atomic operations) * 1.35 (the cost of an atomic RAM access relative to
a RAM read with a cache miss) * pointer_read_with_cache_miss, i.e.
roughly the cost of 4 RAM reads with cache misses per write operation.
IMO that is negligible: the overall code is not tuned for performance to
the point where 4 such reads would matter.

There could be some minimal contention while the barrier is ON: the
worker threads keep incrementing
write_operations_in_execution_counter, checking the barrier and
decrementing the counter again, while the thread that set the barrier
checks the counter for == 0. But given the timings involved (sleeps with
context switches, different sleep intervals, and the tiny inc, check,
dec window), this should not be a problem. And the code could be
rewritten a bit to avoid even this small contention.

> Good point - this is something that Greg was close to having done many
> years ago, but we're not using snmp so it hasn't caused us stress.
> Happy to put that on the consideration list for 3.2.

If this is done, I'd try to implement chroot for Cyrus and then pledge
and unveil for the OpenBSD build.

> Yep - labeled.  OK, the hard bit here isn't implementing (as ellie
> pointed out) - it's design.  We want to make sure we create an
> interface that people can keep using reliably into the future.  I'll
> have a chat with ellie about this one.

Please let me know if you'd like my feedback once you and Ellie have
decided on possible directions.

Thanks!
Anatoli

On 5/11/19 18:20, Bron Gondwana wrote:
> On Wed, Nov 6, 2019, at 03:44, Anatoli via Cyrus-devel wrote:
>> Hi All!
>>
>> Bron, for deployments I manage these issues are also important:
> 
> First of all - thanks for writing this up.  It really helps!
> 
>> * #1763 (Backups for SMB (lock entire server for a moment while taking a
>> snapshot)). Don't know if there was any progress on this. Basically, a
>> short (milliseconds to a few seconds) global write lock is needed on all
>> data structures.
> 
> This is not easy unfortunately with all the different datastructures,
> because it means that everything else which takes a lock is going to
> need to first take a global shared lock before it does anything else,
> and that's going to have a performance and complexity impact on
> everything - because you have to find them ALL or you might wind up with
> lock inversions down the line.
> 
>> * #1765 (Move SNMP out from master into a separate daemon) and related
>> pending PR #2100. Ellie had significant progress on this, don't know
>> what's blocking it, but this issue basically blocks any further work on
>> privilege separation like chroot as the main process should retain root
>> while running and the forked children should proceed with setuid & chroot.
> 
> Good point - this is something that Greg was close to having done many
> years ago, but we're not using snmp so it hasn't caused us stress. 
> Happy to put that on the consideration list for 3.2.
> 
> The downside of making the list of tasks for 3.2 really long is that it
> could block releasing something which is otherwise still a good
> improvement over 3.0 and not a regression... *sigh*.  But this one will
> be a good win, so let's do it!
> 
>> * #2373 (Shared xDAV (CalDAV/CardDAV) resources are not discoverable).
>> Dilyan Palauzov sent a diff for this in the github repo and there was a
>> discussion with Ken on possible implementations (shared xDAV resources):
>> https://lists.andrew.cmu.edu/pipermail/cyrus-devel/2018-May/004263.html.
>> I guess it had enough progress to try to close it.
> 
> Labeled.  I'm keen to have an answer to it somehow or other.
> 
>>
>> * #2372 ([FR] ACL on autocreate folders). Basically, for automatic
>> "anyone p" ACL in plus+addressing folders.
> 
> Yep - labeled.  OK, the hard bit here isn't implementing (as ellie
> pointed out) - it's design.  We want to make sure we create an interface
> that people can keep using reliably into the future.  I'll have a chat
> with ellie about this one.
> 
>> And there are 46 open PRs in the repo. Maybe they could be reviewed and
>> merged too?
> 
> Yeah, maybe!  Frustratingly the next couple of Cyrus call times aren't
> going to work for me, I've got a 7am Melbourne time meeting next
> Tuesday, then I'll be in Singapore for IETF where the Cyrus meeting time
> is 5am.
> 
> One downside of pretty much everyone involved in direct Cyrus
> development being at Fastmail is that we discuss a lot of things in our
> private slack channel or internal mailing lists where we don't have to
> be quite so careful about stripping anything that could identify an
> internal customer... but it does create an impression that there's less
> happening than you'd otherwise see... and I haven't even posted the
> meeting minutes recently because they've been taken into a Dropbox paper
> doc and then languished there :(  Sorry.
> 
> Cheers,
> 
> Bron.
> 
> -- 
>   Bron Gondwana, CEO, Fastmail Pty Ltd
>   brong at fastmailteam.com
> 
>