FOSDEM Report - Saturday

Tue Feb 7 23:08:14 EST 2012

G'day,

Sounds like you're having fun at FOSDEM :)

On Tue, Feb 7, 2012, at 02:27 PM, Bron Gondwana wrote:
> (I'm copying this to the cyrus-devel list because it's of interest
>  to Cyrus people too...)

I'm replying here on the same principle.

> http://fosdem.org/2012/schedule/event/keynotes_welcome
> [...]
> There are 420 talks, 273 hours of scheduled content.  You can't see it
> all!  As much as possible will be videoed.  

Have they announced the URL for the videos?

> Saturday 11:00 -> 2:20 - "Mail Track"
> =====================================
> [...]
> Cyrus process:
> 
> - Gerrit?  Some sort of code review process to make it easier to keep
>   track of the work from drive-by contributors.

I see two benefits to Gerrit or something like it:

a) casual contributions don't get lost in the noise

b) regular contributors get regular code reviews

> - Bugzilla - use it for everything.  If it doesn't have a bug number where
>   discussion took place, it doesn't get accepted.  This is a major workflow
>   change, and is probably more of a challenge for me than anyone else.

Sounds good.

Another thing I'd like to see is a push hook on git.cyrusimap.org's master and 2.4 branches, so that if a git commit has the text "Bug NNNN" in the subject then the commit message is copied to Bugzilla and the bug marked RESOLVED - FIXED.
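
To make that concrete, here is a minimal sketch of the parsing half of such a hook, in Python.  The function names and the shape of the update record are my own invention for illustration; actually sending the update to Bugzilla is left out entirely, and the pattern assumes the literal text "Bug NNNN" in the subject line.

```python
import re

# Assumed convention: the literal word "Bug", a space, then the bug number.
BUG_RE = re.compile(r"\bBug (\d+)\b")


def bugs_in_subject(commit_message: str) -> list[int]:
    """Return the bug numbers mentioned in the subject (first line) only."""
    subject = commit_message.split("\n", 1)[0]
    return [int(n) for n in BUG_RE.findall(subject)]


def bugzilla_updates(commit_id: str, commit_message: str) -> list[dict]:
    """Build one hypothetical update record per bug named in the subject.

    Each record carries the full commit message as the comment and the
    RESOLVED - FIXED state; delivering it to Bugzilla is not shown here.
    """
    return [
        {
            "bug": bug,
            "status": "RESOLVED",
            "resolution": "FIXED",
            "comment": f"Commit {commit_id}\n\n{commit_message}",
        }
        for bug in bugs_in_subject(commit_message)
    ]
```

Only the subject is scanned, so a passing mention of another bug in the commit body would not close that bug by accident.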

> - Websites in git.

Yes!

> - Release process - simplify to save the repeated typing involved.  I wind
>   up writing the changelog, the website release note and the email release
>   note, plus manually changing a bunch of things in the website PHP every
>   time I do a release.

I wonder if Dist::Zilla or something like it could be used to automate the release process?

> New Features - Conversations, XMOVE, etc:
> 
> - Alexey is willing to help us with the standardisation process if we want to
>   push for conversations to become more standard.

Cool!

> Special-Use:
> 
> - Long discussion here.  Kolab currently store a custom "usage" annotation,
>   but no other client knows to look for it.  There are two axes:
>   - 1) "this is MY \Sent folder"
>   - 2) "this is a \Calendar folder - it contains calendar entries as
>        encoded attachments.  If I share it, other users should see that"
> - so we probably have to extend special use for private and shared, and make
>   sure there's a defined priority order if both exist.  Also, there may be
>   more than one "special use" on the same folder - both \Calendar and a
>   personal \DefaultCalendar (or something)

Yes, our RFC compliance is basic here.

> E-Discovery, deletion controls:
> 
> - Kolab are planning to use the "msg bus" stuff from Worldline to have a
>   listener that collects data for e-discovery.  Kind of a "cyrus
>   watcher".

I'd love to have a more generic infrastructure for listening to changes in, and interacting with, the cyrus data model, on top of which we could implement the Annotator daemon, mupdate, and NOTIFY.

> 
> ActiveSync:
> 
> - MetaWays have a very good open-source ActiveSync stack:
>   http://www.h-online.com/open/news/item/Tine-2-0-supports-ActiveSync-740315.html

That seems to be another PHP implementation like Z-Push.

> Spam Reporting via IMAP:
> 
> - Alexey mentioned that there's talk of adding a command to IMAP to report
>   a message as spam/non-spam rather than setting flags.  This would be used
>   to actually take action based on the report.
> - Google and Yahoo are both involved in this effort.

Sounds interesting, any more information?

> 
> Community:
> - there is at least a community of IMAP Server implementors.  There isn't
>   really one for IMAP Client implementors.  They just do their own thing,
>   often just by looking at protocol traces, certainly not bothering to
>   understand the entire RFC stack.

And can you blame them?  It's a jungle.

> http://fosdem.org/2012/schedule/event/thunderbird
> [...]
> I grabbed Ludovic for a few minutes afterwards and outlined our plans for
> a new mail protocol.  He will raise it at their team meeting next week, and
> start a discussion about it on their mailing lists (this is already done, I
> have joined the tb-planning mailing list)

Cool.

> Mail Protocol - initial notes: (Timo & Bron)
> ==============================
> 
> - Issues - folders vs tags.
> - if a tag can be added/removed, need to change the UID to be compatible
>   with IMAP semantics.
> - 1/1 relationship Tags/Folders - stay compatible?

> - GUIDs for Folders as well, detect renames.  Dovecot and Cyrus both keep
>   a GUID for each folder internally.

Yes!

> - MSGNO/UID/Uidvalidity/etc?  What do we need to keep?  Ordering properties
>   are nice.

MSGNO is a completely useless historical artifact that should be buried in lye and forgotten.

We need a stable identifier for each message, preferably one which is truly stable, i.e. doesn't change on reconstruct or replication.  UID+UIDVALIDITY is a sad approximation to this.  RFC822 Message-Id is a better approximation, except for Drafts and other Message-Id-less messages.
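
One way to get an identifier with that property is to derive it from the message bytes themselves.  This is only a sketch under that assumption: `message_guid` is a name I made up, and SHA-1 is just one possible choice of hash.

```python
import hashlib


def message_guid(raw_message: bytes) -> str:
    """Derive an identifier from the raw message bytes alone.

    Nothing here depends on the folder, the UID, or the UIDVALIDITY, so the
    value survives a reconstruct or replication unchanged.  It also works
    for Drafts and other messages that carry no Message-Id header.
    """
    return hashlib.sha1(raw_message).hexdigest()


draft = b"Subject: unfinished\r\n\r\nNo Message-Id header here.\r\n"
guid = message_guid(draft)
```

The trade-off is that two byte-identical messages get the same identifier, which is exactly what you want for detecting copies but means the identifier names content, not one particular stored instance.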

>   Definitely 64 bit everything.

Yes!

> - single modseq counter per user?  What about shared folders.  Need to get
>   some statistics.
> 
> WISHLIST:
> - simple enough that what client authors "expect" just works.  If they
>   get confused, it's our fault.
> - UTF8 everything.

Yes!  And no quoting crud either.  All strings are non-synchronising literals in UTF-8, except raw message data and annotation data which is binary non-synchronising literals.
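
To illustrate, here is a rough sketch of the encoding side, borrowing the `{N+}` non-synchronising literal syntax from IMAP's LITERAL+ extension (RFC 2088).  The `literal` helper is hypothetical; whether a new protocol would reuse this exact framing is an open question.

```python
def literal(value) -> bytes:
    """Encode a value as a non-synchronising literal: {N+} CRLF, then N octets.

    Text is sent as UTF-8; bytes (raw message or annotation data) are sent
    untouched.  N always counts octets, never characters.
    """
    data = value.encode("utf-8") if isinstance(value, str) else bytes(value)
    return b"{%d+}\r\n" % len(data) + data


folder_name = literal("h\u00e9llo")       # text: 5 characters, 6 octets
annotation = literal(b"\x00\x01\r\n")     # binary: passed through as-is
```

Because the length prefix covers everything, the payload never needs escaping or quoting, and the sender never has to wait for a continuation response.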

> - No Hierarchy Separators!

!?

> - GUID on every message

Yes.

> - Search + Action - a pipeline of events.  Group actions together.

Really???

> - Transactional?  Maybe.
> - Stateless

Ah, you mean no SELECT, no message numbers?  Interesting.

> - Idempotent

Not possible for every operation.

> Timo and I agreed last night that it's better to take the bulk of the
> complexity load on the server side, and make sure we have good tests
> for servers - the client authors should be able to look at a protocol
> dump and wind up creating a client that does the right thing.  There
> will be a lot more clients than servers.

Agreed, entirely!

> Requirements:
> * client and server should be able to present existing interfaces/
>   talk to existing storage with minimal changes and without breaking
>   existing semantics.  You switch to this protocol and you get
>   benefits, yay.  Not downsides/caveats.

Hmm, that's hard to reconcile with the whole tags vs folders thing.

> * Decent test suite for server implementations, and test plan for
>   clients - "X happens on server, you must present world-state Y
>   on action Z" - so implementors can be sure of the quality of their
>   implementations.  Not just a wordy spec.

Cool!

> Some areas of interest:
> 
> Message management:
> * MOVE vs "COPY + expunge".
> * Deletion - in-place set "\Deleted" flag vs copy to trash.

We need a DELETE_MESSAGE command, so the server can do the heavy lifting.

We also need to rename the commands to make it obvious which ones apply to messages and which ones apply to folders.

> * Undo.

Wait, what!?

> * Efficiently detect actions taken by another client.  If another client
>   copies 100,000 messages from Inbox to Archive, don't have to re-download.
>   GUID or similar.

Yes, need to think about the visible data model a lot more carefully.

> * Batching / pipelines.  SEARCH + MARK FLAGGED + MOVE TO ANOTHER MAILBOX
>   - basically, "lego blocks" vs "pre-defined"
>   - leads on to;

OMG, it's NFSv4 all over again.  The horror.

> * Question: do we want full transactional semantics?  

No.  Transaction boundaries are impossible to get completely right, and transactions are hard to implement, test, and scale.

> Stateless Operation:
> * Phones / poorly connected devices
> * Power usage considerations.

Yes!

> Notifications:
> * Able to easily receive notifications about ALL changes of interest,
>   emails / folders / whatever.
> * Notifications still work if connection disconnected (see above)
> * Compatible with out-of-band notification to do cheap resync (use OS
>   remote notification system in case of phone, etc) - if present.  Even
>   SMS.

Yes!  ...somehow.

> A lot of this is the CISC vs RISC debate.  I believe it's better to
> compose your messages from client to server and server to client out
> of groups of small "lego bricks" each of which expresses one thing
> succinctly rather than pre-formed "fighter wing" shapes.

Yeah sure, this seems like a good idea now, but further down this track you will end up sending little bytecode programs from the client to the server for every operation, like NFSv4 does.  This actually makes both clients and servers harder to write, test, and debug.  And, like Lego, everything you want to build is not quite the right shape and doesn't hang together properly.

> The biggest
> lack I see in the current email landscape is that IMAP clients
> wind up doing convoluted things to support all the possible combinations
> of multiple RFCs out there, or just giving up and supporting a very
> simple profile, because that way they don't need multiple codepaths.

I think a better solution is to explicitly *version* the protocol rather than rely on dozens of interacting extension RFCs.  So, if as a client author you detect v23 of the protocol from the server, you know exactly the set of features the server has.  We still have lots of RFCs rather than trying to release one humungous RFC every few years, but there's an explicit ordering and each one assumes all the previous ones.  So the protocol has a single timeline, with every point on it expressible as a single small integer, not a combinatorial explosion of capabilities.  This also lets us *deprecate* old features, which is impossible with the current model.  Extensions and capabilities should be only for *experiments* and *vendor-specific additions*.
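
A small sketch of what that buys a client author, in Python.  The feature names and version numbers below are entirely made up for illustration; the point is only that a single integer determines the whole feature set, including what has been retired.

```python
from typing import Optional

# Hypothetical timeline: feature -> (version introduced, version removed or None).
FEATURES: dict[str, tuple[int, Optional[int]]] = {
    "folders": (1, None),
    "msgno": (1, 5),              # example of a feature deprecated at v5
    "message-guid": (2, None),
    "notifications": (7, None),
}


def features_at(version: int) -> set[str]:
    """Return exactly the features a server speaking `version` must support."""
    return {
        name
        for name, (added, removed) in FEATURES.items()
        if added <= version and (removed is None or version < removed)
    }


v1 = features_at(1)
v23 = features_at(23)
```

With four independent capabilities a client would face up to sixteen combinations; with a version number it faces one code path per version it chooses to support.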

-- 
Greg.