From kpielorz_lst at tdx.co.uk Mon Nov 12 03:40:39 2018
From: kpielorz_lst at tdx.co.uk (Karl Pielorz)
Date: Mon, 12 Nov 2018 08:40:39 +0000
Subject: db5 / PANIC errors under 2.5.12?
Message-ID:

Hi,

I've recently upgraded a system from Cyrus IMAP 2.5.3 I think it was to 2.5.12 - and started seeing problems with the __db.00* files.

In the logs we see:

Nov 11 22:43:15 imaps[26220]: imaps TLS negotiation failed: some.isp.domain.com [a.b.c.d]
Nov 11 22:43:15 imaps[26221]: imaps TLS negotiation failed: some.isp.domain.com [a.b.c.d]
Nov 11 22:43:15 imaps[26223]: imaps TLS negotiation failed: some.isp.domain.com [a.b.c.d]
Nov 11 22:43:15 imaps[26226]: imaps TLS negotiation failed: some.isp.domain.com [a.b.c.d]
Nov 11 22:43:15 imaps[26226]: DBERROR db5: BDB1581 File handles still open at environment close
Nov 11 22:43:15 imaps[26226]: DBERROR db5: BDB1582 Open file handle: /vol/host/imap/db/__db.001
Nov 11 22:43:15 imaps[26226]: DBERROR db5: BDB1582 Open file handle: /vol/host/imap/db/__db.002
Nov 11 22:43:15 imaps[26226]: DBERROR db5: BDB1582 Open file handle: /vol/host/imap/db/__db.003
Nov 11 22:43:15 imaps[26225]: DBERROR db5: BDB0061 PANIC: Invalid argument
Nov 11 22:43:15 imaps[26226]: DBERROR db5: BDB0060 PANIC: fatal region error detected; run recovery
Nov 11 22:43:15 imaps[26225]: DBERROR: critical database situation
Nov 11 22:43:15 imaps[26226]: DBERROR: critical database situation
Nov 11 22:43:15 imaps[26232]: inittls: Loading DH parameters from file
Nov 11 22:43:15 master[1152]: process type:SERVICE name:imaps path:/usr/local/cyrus/bin/imapd age:0.184s pid:26225 exited, status 75
Nov 11 22:43:15 master[1152]: process type:SERVICE name:imaps path:/usr/local/cyrus/bin/imapd age:0.169s pid:26226 exited, status 75
Nov 11 22:43:15 imaps[26233]: DBERROR db5: BDB0060 PANIC: fatal region error detected; run recovery

The a.b.c.d IP address is nothing to do with the server - I think it's someone 'trying' to log in with incorrect / guessed credentials.
What I can't tell is:

- Are the failed logins causing PID 26226 to die in some way that is corrupting the database?
- Or is the database just getting itself corrupt?

We're running FreeBSD 11.2 - there's no dmesg / disk issues I can see. I don't currently get any core dumps either.

The fix for the above is to shutdown Cyrus, 'rm /vol/host/imap/db/*' - and restart it. It'll be a few weeks now (typically) before this happens again.

Anyone got any suggestions as to where to look for more info / possible causes / solutions?

Thanks,

-Karl

From brong at fastmailteam.com Mon Nov 12 06:31:26 2018
From: brong at fastmailteam.com (Bron Gondwana)
Date: Mon, 12 Nov 2018 06:31:26 -0500
Subject: Notes Nov 12
Message-ID: <50cc8253-4a80-4f0d-bd40-551f103092bc@sloti7d1t02>

Present: ellie, Bron, Robert, Ken, Partha

Bron:
* was at IETF last week
* JMAP is close to last call and will probably recharter for calendars/contacts early next year
* EXTRA is also close to rechartering, last major work is IMAP4rev2.
* have built a package with the rename intermediate patch, but not yet been able to test it.

Robert:
* JSCalendar - in last call for RFC now.
  - New version last week.
  - One update pending: updating RRULEs to better define non-Gregorian RSCALEs.
  - also a feature branch for Cyrus with latest spec version for testing
* pushed keyword updates to mboxset update
* main current project is how best to proceed with conversationsdb.
  - two current issues open due to not including expunged state of messages.
  - can't search for multiple ANDed "inMailboxId" filter, or showing deleted messages in query results.
  - have a fix which adds these flags to conversationsdb.
  - prototyping a workaround to handle legacy records, but it's really a pain.
  - as long as not rebuilt, might still return deleted messages in query results.
  - go ahead and switch to binary format as it's more efficient.
* Working on smaller JMAP bugs.
  - there was an issue that we closed with accepting message IDs not according to spec. Now there's an issue where they actually exist, so need to make it more lenient so we can handle real world messages.
* haven't started working on Quota/get - that's next.

ellie:
* imap labels over fetch is pretty close, hopefully can just bang it out tomorrow!
* it'll definitely benefit from robert's convdb additions, but the merge conflicts will be trivial to resolve

Ken:
* working on mailbox history and tombstones
* will probably rebase path-by-id onto rename intermediate
* spent time working on drafts - need to do more on FCC draft this week
* with mailbox-paths-by-id, need to keep mailbox name in the cyrus.header in case we lose mailboxes.db.
* is it cheaper to use symlinks on disk?
  - Bron isn't in favour, couple of issues - disk path lengths and consistency/cleanup.
  - also - update mailboxes.db twice - once at start saying what will happen, then afterwards once the on disk files have been updated.
  - that way, we always know which mailboxes might be in an unknown state and can be repaired.
* one last release candidate for SASL - build issues and Coverity issues.
  - planning to release this week.

Partha:
* no cyrus work last week or coming up in the next week.

Starting next week, switching to 8am Tuesday Melbourne time (4pm US East, 10pm Vienna)

--
Bron Gondwana, CEO, FastMail Pty Ltd
brong at fastmailteam.com

From kpielorz_lst at tdx.co.uk Thu Nov 15 14:39:47 2018
From: kpielorz_lst at tdx.co.uk (Karl Pielorz)
Date: Thu, 15 Nov 2018 19:39:47 +0000
Subject: db5 / PANIC errors under 2.5.12?
In-Reply-To:
References:
Message-ID: <2BE48889C65A24BEBD0208E1@Mac-mini.local>

--On 12 November 2018 at 08:40:39 +0000 Karl Pielorz wrote:

> It'll be a few weeks now (typically) before this happens again.
This just happened again - but doesn't appear to be related to connections we just had logged:

"
Nov 15 18:31:00 imap[84552]: DBERROR db5: pthread suspend failed: Invalid argument
Nov 15 18:31:00 imap[84552]: DBERROR db5: BDB0061 PANIC: Invalid argument
Nov 15 18:31:00 imap[84552]: DBERROR: critical database situation
...
Nov 15 18:31:00 imap[84554]: DBERROR db5: BDB0060 PANIC: fatal region error detected; run recovery
"

Again, shutdown, clear out 'db/*' and restart - and it's back. The files in 'db':

-rw------- 1 cyrus cyrus 14770176 Nov 15 19:14 __db.001
-rw------- 1 cyrus cyrus 32768 Nov 15 19:14 __db.002
-rw------- 1 cyrus cyrus 49152 Nov 15 19:14 __db.003
-rw------- 1 cyrus cyrus 4 Nov 15 19:14 skipstamp

Appear to just be 'binary' - i.e. running 'file' against them shows binary - are these really the files likely to be causing the problem?

Or is the 'DBERROR db5' still relating to a .db file somewhere? - Be handy if the 'fatal region' error [or any of them] could list the file the problem is in :)

Just really looking for any hints where to look next...

-Karl

From ellie at fastmail.com Thu Nov 15 21:52:50 2018
From: ellie at fastmail.com (ellie timoney)
Date: Fri, 16 Nov 2018 13:52:50 +1100
Subject: db5 / PANIC errors under 2.5.12?
In-Reply-To: <2BE48889C65A24BEBD0208E1@Mac-mini.local>
References: <2BE48889C65A24BEBD0208E1@Mac-mini.local>
Message-ID: <1542336770.1418805.1578746024.248B77F2@webmail.messagingengine.com>

Hi Karl,

I'm not sure if you're aware, but we've removed support for Berkeley DB from 3.0+. It remains in the 2.5 series mainly for legacy reasons, but no-one's specifically maintaining support for it (especially not for newer versions like db5).

Since you've recently upgraded from 2.3 to 2.5, it might be a good opportunity to migrate your Cyrus databases to one of the other database backends such as twoskip or skiplist?
It looks like the problems you are having are specifically in the berkeley backend so I think they should be resolved just by switching to other backends. The 'cvt_cyrusdb' tool is able to convert database formats, and you'll also want to update your various foo_db settings in imapd.conf to specify your new database format. If you need to stick with Berkeley DB for some reason, the info-cyrus list might have active subscribers that use this database and might be able to provide guidance (though I'm not directly aware of any) Hope this helps :) ellie On Fri, Nov 16, 2018, at 6:39 AM, Karl Pielorz wrote: > > > --On 12 November 2018 at 08:40:39 +0000 Karl Pielorz > wrote: > > > It'll be a few weeks now (typically) before this happens again. > > This just happened again - but doesn't appear to be related to connections > we just had logged: > > " > Nov 15 18:31:00 imap[84552]: DBERROR db5: pthread suspend failed: Invalid > argument > Nov 15 18:31:00 imap[84552]: DBERROR db5: BDB0061 PANIC: Invalid argument > Nov 15 18:31:00 imap[84552]: DBERROR: critical database situation > ... > Nov 15 18:31:00 imap[84554]: DBERROR db5: BDB0060 PANIC: fatal region error > detected; run recovery > " > > Again, shutdown, clear out 'db/*' and restart - and it's back. The files > 'db': > > -rw------- 1 cyrus cyrus 14770176 Nov 15 19:14 __db.001 > -rw------- 1 cyrus cyrus 32768 Nov 15 19:14 __db.002 > -rw------- 1 cyrus cyrus 49152 Nov 15 19:14 __db.003 > -rw------- 1 cyrus cyrus 4 Nov 15 19:14 skipstamp > > Appear to just be 'binary' - i.e. 'files' against them shows binary - are > these really the files likely to be causing the problem? > > Or is the 'DBERROR db5' still relating to a .db file somewhere? - Be handy > if the 'fatal region' error [or any of them] could list the file the > problem is in :) > > Just really looking for any hints where to look next... 
> > -Karl From kpielorz_lst at tdx.co.uk Fri Nov 16 03:22:56 2018 From: kpielorz_lst at tdx.co.uk (Karl Pielorz) Date: Fri, 16 Nov 2018 08:22:56 +0000 Subject: db5 / PANIC errors under 2.5.12? In-Reply-To: <1542336770.1418805.1578746024.248B77F2@webmail.messagingengine.com> References: <2BE48889C65A24BEBD0208E1@Mac-mini.local> <1542336770.1418805.1578746024.248B77F2@webmail.messagingengine.com> Message-ID: <3B8ACC1C71CB91F977F011B5@[10.12.30.106]> --On 16 November 2018 13:52 +1100 ellie timoney wrote: > Since you've recently upgraded from 2.3 to 2.5, it might be a good > opportunity to migrate your Cyrus databases to one of the other database > backends such as twoskip or skiplist? It looks like the problems you are > having are specifically in the berkeley backend so I think they should be > resolved just by switching to other backends. The 'cvt_cyrusdb' tool is > able to convert database formats, and you'll also want to update your > various foo_db settings in imapd.conf to specify your new database format. Hi, Thanks for the reply - we recently moved from 2.5.3 I think it was to 2.5.12 - i.e. within 2.5.x. As far as I'm aware - all our DB's are already twoskip - certainly running "files" against anything ending in '.db' returns "Cyrus twoskip DB" - the only ones that don't, are the files in the 'db' directory - these return as a mix of 'binary' or 'Applesoft BASIC' and other weirdness - as 'file' doesn't know what they are - I don't either, all I know is after the 'db5' PANIC messages - shutting down, then clearing out __db.001, __db.002, __db.003 and 'skipstamp' and restarting appears to fix the problem for a while. > If you need to stick with Berkeley DB for some reason, the info-cyrus > list might have active subscribers that use this database and might be > able to provide guidance (though I'm not directly aware of any) Hehe, no we don't need to stick with Berkeley DB - and, as I said - as far as I'm aware, we're entirely twoskip based. 
Using Google I was able to find one other reference to 'DBERROR db5: pthread suspend failed: Invalid argument' - again, DB compatibility was mentioned (and apparently ruled out) - leaving a question over, maybe it's a bug?

I could really do with someone confirming the files in 'db/' - are 'fine as they are' (i.e. should not return a file type of 'twoskip' etc.) - and any other suggestions for what might be causing this - or any way of getting some more debug info.

We have two IMAP servers on identical hardware - only the one that's heavily loaded encounters this issue (so far).

Regards,

-Karl

From brong at fastmailteam.com Mon Nov 19 03:32:26 2018
From: brong at fastmailteam.com (Bron Gondwana)
Date: Mon, 19 Nov 2018 03:32:26 -0500
Subject: REMINDER: meeting time changed!
Message-ID: <9c278f31-22f0-421f-b261-9974736e0849@sloti7d1t02>

The meetings will now be held at 2100 GMT on Monday nights, aka:

https://www.timeanddate.com/worldclock/meetingdetails.html?year=2018&month=11&day=19&hour=21&min=0&sec=0&p1=152&p2=179&p3=259

8am Melbourne, 4pm New York, 10pm Vienna

Also advance notice for this next meeting, I have some weighty things I want to discuss:

1) Intermediate folders. RFC3501 is very inconsistent, it says that servers SHOULD create parent folders (which Cyrus doesn't do) when creating a deep folder, but also says MUST NOT delete children when deleting a folder, instead the deleted folder becomes \Noselect. We want to get rid of the "INTERMEDIATE" type unless we can make it really consistent, including disappearing when the last child disappears.

2) Instructions for building with websockets (ho hum) - this is really just Fastmail magic.

3) This is the big one! Cyrusdb "readonly" mode. There's no way to long lock in SHARED, which means that every JMAP call takes an exclusive lock on the entire user, even if it's just reading.
We accepted this when we switched to conversations way back when, but it's hurting more and more, and it removes the ability to parallelise. I was planning to hold this conversation off until we had zeroskip done, but hey - zeroskip is alpha merged now, and I want this very very much!

Bron.

--
Bron Gondwana, CEO, FastMail Pty Ltd
brong at fastmailteam.com

From murch at fastmail.com Mon Nov 19 21:30:16 2018
From: murch at fastmail.com (Ken Murchison)
Date: Mon, 19 Nov 2018 21:30:16 -0500
Subject: Notes Nov 19
Message-ID:

Present: ellie, Bron, Robert, Ken, Partha, Rik

Bron:
* we shouldn't allow \NoSelect mailboxes - causes JMAP issues: always create parents where needed; don't allow deleting a parent
* we need to implement shared locks for cyrusdb interface. we already have some of the needed APIs
* will fix JMAP savedDates vs addedDates typo

Robert:
* working on JMAP Quota extension
* also working through existing JMAP issues
* issue 2576 should be fixed on master, but requires a conversations.db rebuild
  - backout commit that caused the cross-mailbox search issue until conversations rebuilds are done

ellie:
* no report

Ken:
* will announce SASL 2.1.27 tonight or tomorrow
* working on mbpath-by-id. 96% of Cassandane tests pass. still working through RENAME issues. most of the other failures are hard-coded path issues

Partha:
* no report

--
Ken Murchison
Cyrus Development Team
FastMail US LLC

From murch at fastmail.com Tue Nov 20 10:00:56 2018
From: murch at fastmail.com (Ken Murchison)
Date: Tue, 20 Nov 2018 10:00:56 -0500
Subject: SASL 2.1.27
Message-ID: <9a09ebfd-0fa8-fc0d-b707-4133227cb0b9@fastmail.com>

All,

I'm pleased to announce the release of the long-awaited SASL 2.1.27 which can be downloaded from here:

* HTTP:
  https://www.cyrusimap.org/releases/cyrus-sasl-2.1.27.tar.gz
  https://www.cyrusimap.org/releases/cyrus-sasl-2.1.27.tar.gz.sig
* FTP:
  ftp://ftp.cyrusimap.org/cyrus-sasl/cyrus-sasl-2.1.27.tar.gz
  ftp://ftp.cyrusimap.org/cyrus-sasl/cyrus-sasl-2.1.27.tar.gz.sig

md5sum:
a33820c66e0622222c5aefafa1581083 cyrus-sasl-2.1.27.tar.gz
b295313b9915be32b334f7e88f30dacd cyrus-sasl-2.1.27.tar.gz.sig

The (mostly) complete list of changes from 2.1.26 are these:

* Added support for OpenSSL 1.1
* Added support for lmdb (from Howard Chu)
* Lots of build fixes (from Ignacio Casal Quinteiro and others)
* Treat SCRAM and DIGEST-MD5 as more secure than PLAIN when selecting client mech
* DIGEST-MD5 plugin:
    o Fixed memory leaks
    o Fixed a segfault when looking for non-existent reauth cache
    o Prevent client from going from step 3 back to step 2
    o Allow cmusaslsecretDIGEST-MD5 property to be disabled
* GSSAPI plugin:
    o Added support for retrieving negotiated SSF
    o Properly compute maxbufsize AFTER security layers have been set
* SCRAM plugin:
    o Added support for SCRAM-SHA-256
    o Allow SCRAM-* to be used by HTTP
* LOGIN plugin:
    o Don't prompt client for password until requested by server
* NTLM plugin:
    o Fixed crash due to uninitialized HMAC context
* saslauthd:
    o cache.c:
        + Don't use cached credentials if timeout has expired
        + Fixed debug logging output
    o ipc_doors.c:
        + Fixed potential DoS attack (from Oracle)
    o ipc_unix.c:
        + Prevent premature closing of socket
    o auth_rimap.c:
        + Added support for LOGOUT command
        + Added support for unsolicited CAPABILITY responses in LOGIN reply
        + Properly detect end of responses (don't needlessly wait)
        + Properly handle backslash in passwords
    o auth_httpform:
        + Fix off-by-one error in string termination
        + Added support for 204 success response
    o auth_krb5.c:
        + Added krb5_conv_krb4_instance option
        + Added more verbose error logging

--
Ken Murchison
Cyrus Development Team
FastMail US LLC

From kpielorz_lst at tdx.co.uk Wed Nov 21 05:15:28 2018
From: kpielorz_lst at tdx.co.uk (Karl Pielorz)
Date: Wed, 21 Nov 2018 10:15:28 +0000
Subject: db5 / PANIC errors under 2.5.12?
In-Reply-To: <2BE48889C65A24BEBD0208E1@Mac-mini.local>
References: <2BE48889C65A24BEBD0208E1@Mac-mini.local>
Message-ID:

This just happened again, on a different server...

Having spent more time looking at this - the error:

Nov 21 07:45:30 imaps[72501]: DBERROR db5: BDB1581 File handles still open at environment close

I'm guessing "db5" is indeed Berkeley DB5 - so even though in our imapd.conf we don't specify 'berkeley' anywhere (and 'file' on all the .db files confirms - they're all twoskip) - the database being created in '/vol/host/imap/db/' is likely still berkeley.

How can I change this? - In imapd.conf I can see a myriad of 'seenstate_db', 'mboxlist_db' - but I can't see anything that relates to the files in that DB directory.

To be honest - I can't see anywhere what that database is actually called (which might help me track down how to change its format - if you can?)

Does anyone know what the database that lives off of 'db/' is actually called / used for? - And how I can change its format, that is presuming the issues I'm seeing are caused by it still being berkeley?
-Karl From dilyan.palauzov at aegee.org Fri Nov 23 07:00:20 2018 From: dilyan.palauzov at aegee.org (=?UTF-8?Q?=D0=94=D0=B8=D0=BB=D1=8F=D0=BD_?= =?UTF-8?Q?=D0=9F=D0=B0=D0=BB=D0=B0=D1=83=D0=B7=D0=BE=D0=B2?=) Date: Fri, 23 Nov 2018 12:00:20 +0000 Subject: db5 / PANIC errors under 2.5.12? In-Reply-To: References: <2BE48889C65A24BEBD0208E1@Mac-mini.local> Message-ID: <0e0e77da2d0dbd10998cedb0a08dab1ae6918f37.camel@aegee.org> Hello Karl, man imapd.conf shows the default database types, e.g. check for annotation_db, duplicate_db? You likely have there explicit or implicit berkeley database. With cvt_cyrusdb you can switch between database formats. You use the latter to convert to e.g. twoskip and then update imapd.conf to state, that the database is in twoskip format. On my system the files in imap/db/ were not changed since 2016, I guess when I upgraded at that time. Regards ????? On Wed, 2018-11-21 at 10:15 +0000, Karl Pielorz wrote: > This just happened again, on a different server... > > Having spent more time looking at this - the error: > > Nov 21 07:45:30 imaps[72501]: DBERROR db5: BDB1581 File handles still > open at environment close > > I'm guessing "db5" is indeed Berkeley DB5 - so even though in our > imapd.conf we don't specify 'berkeley' anywhere (and 'files' on all the .db > files confirms - they're all twoskip) - the database being created in > '/vol/host/imap/db/' is likely still berkeley. > > How can I change this? - In imapd.conf I can see a myriad of > 'seenstate_db', 'mboxlist_db' - but I can't see anything that relates to > the files in that DB directory. > > To be honest - I can't see anywhere what that database is actually called > (which might help me track down how to change it's format - if you can?) > > Does anyone know what the database that lives off of 'db/' is actually > called / used for? - And how I can change it's format, that is presuming > the issues I'm seeing are caused by it still being berkeley? 
>
> -Karl

From rsto at fastmailteam.com Mon Nov 26 16:40:36 2018
From: rsto at fastmailteam.com (Robert Stepanek)
Date: Mon, 26 Nov 2018 22:40:36 +0100
Subject: Notes - Nov 26 2018
Message-ID: <1543268436.1830375.1589662536.41484754@webmail.messagingengine.com>

Participants: Bron, Ellie, Ken, Partha, Robert

Ken:
- Short week for US.
- Mainly working on mailbox by id. Some Cassandane tests fail, more a matter of fixing them instead of Cyrus code.
- Annotations.db needs to use ids instead of mailbox names. Same for conversations.db.
- RENAME should become fairly trivial, with reduced IO overhead.
- Before continuing with mailbox by id : need to update RFC drafts.
- Issue with building a SASL distribution on Fedora (due to Sphinx)

Bron:
- New Cyrus rolled out yesterday, along with changes to frontend and middleware.
- Fixed JMAP bugs that crept in with commits last week.

Ellie:
- IMAP labels patch updated after review, ready to merge.
- SASL: distcheck is flaky on Windows. Haven't looked into it, yet.

Robert:
- Short week as well, was ill.
- Fixed JMAP bugs and updated JSCalendar v8 implementation.
- JSCalendar RFC is in last call at IETF!
- JMAP Quota patch updated. JMAP part is trivial, but quotamodseq touches lots of critical areas: sync, mboxevent, counters. Bron volunteered to review.
- Now working on read-only cyrusdb locks: only for cyrusdb_open and twoskip.

Partha:
- Zeroskip is part of cyruslibs. Unit tests checked on cyrus.works.
- How to inspect zeroskip binary format for debugging? Either use zeroskip-included tooling to dump db, or use cvt_cyrusdb binary.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From ellie at fastmail.com Mon Nov 26 21:20:16 2018 From: ellie at fastmail.com (ellie timoney) Date: Tue, 27 Nov 2018 13:20:16 +1100 Subject: Cyrus IMAPd version 3.1.6 Message-ID: <1543285216.230612.1589909864.4DBB3B10@webmail.messagingengine.com> The Cyrus team is pleased to announce the immediate availability of a new version of Cyrus IMAP: 3.1.6 This is a snapshot of the master branch, and should be considered for testing purposes and bleeding-edge features only. It is available as a git tag, which can be found here: https://github.com/cyrusimap/cyrus-imapd/releases/tag/cyrus-imapd-3.1.6 Join us on Github at https://github.com/cyrusimap/cyrus-imapd to report issues, join in the deliberations of new features for the next Cyrus IMAP release, and to contribute to the documentation. On behalf of the Cyrus team, ellie -------------- next part -------------- An HTML attachment was scrubbed... URL: From me at anatoli.ws Tue Nov 27 07:50:03 2018 From: me at anatoli.ws (Anatoli) Date: Tue, 27 Nov 2018 09:50:03 -0300 Subject: Notes - Nov 26 2018 In-Reply-To: <1543268436.1830375.1589662536.41484754@webmail.messagingengine.com> References: <1543268436.1830375.1589662536.41484754@webmail.messagingengine.com> Message-ID: Hi All! Robert, is your work on read-only cyrusdb locks somehow related to the global lock feature (https://github.com/cyrusimap/cyrus-imapd/issues/1763)? Regards, Anatoli *From:* Robert Stepanek *Sent:* Monday, November 26, 2018 18:40 *To:* Cyrus Devel *Subject:* Notes - Nov 26 2018 Participants: Bron, Ellie, Ken, Partha, Robert Ken: - Short week for US. - Mainly working on mailbox by id. Some Cassandane tests fail, more a matter of fixing them instead Cyrus code. - Annotations.db needs to use ids instead of mailbox names. Same for conversations.db. - RENAME should become fairly trivial, with reduced IO overhead. - Before continuing with mailbox by id : need to update RFC drafts. 
- Issue with building a SASL distribution on Fedora (due to Sphinx) Bron: - New Cyrus rolled out yesterday, along with changes to frontend and middleware. - Fixed JMAP bugs that crept in with commits last week. Ellie: - IMAP labels patch updated after review, ready to merge. - SASL: distcheck is flaky on Windows. Haven't looked into it, yet. Robert: - Short week as well, was ill. - Fixed JMAP bugs and updated JSCalendar v8 implementation. - JSCalendar RFC is in last call at IETF! - JMAP Quota patch updated. JMAP part is trivial, but quotamodseq touches lots of critical areas: sync, mboxevent, counters. Bron volunteered to review. - Now working on read-only cyrusdb locks: only for cyrusdb_open and twoskip. Partha: - Zeroskip is part of cyruslibs. Unit tests checked on cyrus.works. - How to inspect zeroskip binary format for debugging? Either use zeroskip-inluded tooling to dump db, or use cvt_cyrusdb binary. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rsto at fastmailteam.com Tue Nov 27 08:05:47 2018 From: rsto at fastmailteam.com (Robert Stepanek) Date: Tue, 27 Nov 2018 14:05:47 +0100 Subject: Notes - Nov 26 2018 In-Reply-To: References: <1543268436.1830375.1589662536.41484754@webmail.messagingengine.com> Message-ID: <1543323947.2361780.1590389072.7A91CEAE@webmail.messagingengine.com> Hi Anatoli, no, what I am working is rather the opposite: the current implementation uses exclusive locks on mailboxes where it could do away with a shared read-only lock. That's because the Cyrus DB API currently does not support shared locks. We want to reduce latency of requests, and that's what I am optimizing for. The work only touches the twoskip backend and is only for certain hot-paths in the code. Cheers, Robert On Tue, Nov 27, 2018, at 1:50 PM, Anatoli wrote: > Hi All! 
> > Robert, is your work on read-only cyrusdb locks somehow related to > the global lock feature > (https://github.com/cyrusimap/cyrus-imapd/issues/1763)?> > Regards, > Anatoli > > *From:* Robert Stepanek > *Sent:* Monday, November 26, 2018 18:40 > *To:* Cyrus Devel > *Subject:* Notes - Nov 26 2018 > > Participants: Bron, Ellie, Ken, Partha, Robert > > Ken: > - Short week for US. > - Mainly working on mailbox by id. Some Cassandane tests fail, more a > matter of fixing them instead Cyrus code.> - Annotations.db needs to use ids instead of mailbox names. Same for > conversations.db.> - RENAME should become fairly trivial, with reduced IO overhead. > - Before continuing with mailbox by id : need to update RFC drafts. > - Issue with building a SASL distribution on Fedora (due to Sphinx) > > Bron: > - New Cyrus rolled out yesterday, along with changes to frontend and > middleware.> - Fixed JMAP bugs that crept in with commits last week. > > Ellie: > - IMAP labels patch updated after review, ready to merge. > - SASL: distcheck is flaky on Windows. Haven't looked into it, yet. > > Robert: > - Short week as well, was ill. > - Fixed JMAP bugs and updated JSCalendar v8 implementation. > - JSCalendar RFC is in last call at IETF! > - JMAP Quota patch updated. JMAP part is trivial, but quotamodseq > touches lots of critical areas: sync, mboxevent, counters. Bron > volunteered to review.> - Now working on read-only cyrusdb locks: only for cyrusdb_open and > twoskip.> > Partha: > - Zeroskip is part of cyruslibs. Unit tests checked on cyrus.works. > - How to inspect zeroskip binary format for debugging? Either use zeroskip- > inluded tooling to dump db, or use cvt_cyrusdb binary. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From me at anatoli.ws Tue Nov 27 08:11:57 2018 From: me at anatoli.ws (Anatoli) Date: Tue, 27 Nov 2018 10:11:57 -0300 Subject: Notes - Nov 26 2018 In-Reply-To: <1543323947.2361780.1590389072.7A91CEAE@webmail.messagingengine.com> References: <1543268436.1830375.1589662536.41484754@webmail.messagingengine.com> <1543323947.2361780.1590389072.7A91CEAE@webmail.messagingengine.com> Message-ID: <4a9fb6db-7277-0c8f-b9ab-690596ec8ae8@anatoli.ws> Got it, thanks for the explanation! *From:* Robert Stepanek *Sent:* Tuesday, November 27, 2018 10:05 *To:* Cyrus Devel *Subject:* Re: Notes - Nov 26 2018 Hi Anatoli, no, what I am working is rather the opposite: the current implementation uses exclusive locks on mailboxes where it could do away with a shared read-only lock. That's because the ?Cyrus DB API currently does not support shared locks. We want to reduce latency of ?requests, and that's what I am optimizing for. The work only touches the twoskip backend and is only for certain hot-paths in the code. Cheers, Robert On Tue, Nov 27, 2018, at 1:50 PM, Anatoli wrote: > Hi All! > > Robert, is your work on read-only cyrusdb locks somehow related to the > global lock feature > (https://github.com/cyrusimap/cyrus-imapd/issues/1763)? > > Regards, > Anatoli > > *From:* Robert Stepanek > *Sent:* Monday, November 26, 2018 18:40 > *To:* Cyrus Devel > *Subject:* Notes - Nov 26 2018 > > Participants: Bron, Ellie, Ken, Partha, Robert > > Ken: > - Short week for US. > - Mainly working on mailbox by id. Some Cassandane tests fail, more a > matter of fixing them instead Cyrus code. > - Annotations.db needs to use ids instead of mailbox names. Same for > conversations.db. > - RENAME should become fairly trivial, with reduced IO overhead. > - Before continuing with mailbox by id : need to update RFC drafts. > - Issue with building a SASL distribution on Fedora (due to Sphinx) > > Bron: > - New Cyrus rolled out yesterday, along with changes to frontend and > middleware. 
> - Fixed JMAP bugs that crept in with commits last week.
>
> Ellie:
> - IMAP labels patch updated after review, ready to merge.
> - SASL: distcheck is flaky on Windows. Haven't looked into it, yet.
>
> Robert:
> - Short week as well, was ill.
> - Fixed JMAP bugs and updated JSCalendar v8 implementation.
> - JSCalendar RFC is in last call at IETF!
> - JMAP Quota patch updated. JMAP part is trivial, but quotamodseq
> touches lots of critical areas: sync, mboxevent, counters. Bron
> volunteered to review.
> - Now working on read-only cyrusdb locks: only for cyrusdb_open and
> twoskip.
>
> Partha:
> - Zeroskip is part of cyruslibs. Unit tests checked on cyrus.works.
> - How to inspect zeroskip binary format for debugging? Either use
> zeroskip-included tooling to dump db, or use cvt_cyrusdb binary.

From brong at fastmailteam.com Thu Nov 29 22:26:35 2018
From: brong at fastmailteam.com (Bron Gondwana)
Date: Thu, 29 Nov 2018 22:26:35 -0500
Subject: MMAP performance and using mmap writes
Message-ID: <4c000ff3-597b-4f62-8ac0-ddfe72c3bd09@sloti7d1t02>

Hi All,

We were debugging the CPU usage in a ctl_conversationsdb rebuild yesterday, and noticed an interesting thing. 70% of the CPU utilisation for this one process was inside the kernel! Mostly with dirty pages.

ctl_conversationsdb -R is particularly heavy on the twoskip database - it's rewriting a lot of random keys. This leads to writes all over the place, as it stitches records into the skiplists.

Of course the "real answer"[tm] is zeroskip, which doesn't do random writes - but until then, we suspect that the cost is largely due to the fact that we use mmap to read, and fwrite to write! We know that might be less efficient already from Linus' comments about 10 years ago! And I guess here's the proof.

An option would be to switch to using mmap to write as well. We could easily modify lib/mappedfile to memcpy to do the writes.
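For illustration only, here is a minimal sketch of what "memcpy to do the writes" could look like with plain POSIX calls. The helper name mapped_overwrite and its parameters are hypothetical and are not the lib/mappedfile interface; the sketch also assumes the caller already knows the length the file should have, because a shared mapping can only be written where the file has pages behind it.

```c
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (NOT the lib/mappedfile API): set the file length to
 * file_size, map the file shared and writable, and copy len bytes from buf
 * to the given offset. Returns 0 on success, -1 on error. */
int mapped_overwrite(int fd, size_t file_size, size_t offset,
                     const void *buf, size_t len)
{
    /* Refuse any write that would fall outside the mapped range. */
    if (offset > file_size || len > file_size - offset)
        return -1;

    /* The mapping is only backed by pages that exist in the file,
     * so give the file its full length before mapping it. */
    if (ftruncate(fd, (off_t)file_size) != 0)
        return -1;

    void *base = mmap(NULL, file_size, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    if (base == MAP_FAILED)
        return -1;

    /* The "write" itself is an ordinary memory copy into the mapping. */
    memcpy((char *)base + offset, buf, len);

    /* Ask the kernel to flush the dirty pages before unmapping. */
    int rc = msync(base, file_size, MS_SYNC);
    if (munmap(base, file_size) != 0)
        rc = -1;
    return rc;
}
```

A real implementation would presumably keep the mapping open across many writes rather than mapping and unmapping per call; the sketch maps once per call only to stay short.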
Does anybody see any strong reason not to? Bron. -- Bron Gondwana, CEO, FastMail Pty Ltd brong at fastmailteam.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From hyc at highlandsun.com Fri Nov 30 06:09:23 2018 From: hyc at highlandsun.com (Howard Chu) Date: Fri, 30 Nov 2018 11:09:23 +0000 Subject: MMAP performance and using mmap writes In-Reply-To: <4c000ff3-597b-4f62-8ac0-ddfe72c3bd09@sloti7d1t02> References: <4c000ff3-597b-4f62-8ac0-ddfe72c3bd09@sloti7d1t02> Message-ID: <1c46d2af-c16f-aeb3-9e32-291d3ac4e216@highlandsun.com> Bron Gondwana wrote: > Hi All, > > We were debugging the CPU usage in a ctl_conversationsdb rebuild yesterday, and noticed an interesting thing.? 70% of the CPU utilisation for this one process > was inside the kernel!? Mostly with dirty pages. > > ctl_conversationsdb -R is particularly heavy on the twoskip database - it's rewriting a lot of random keys.? This leads to writes all over the place, as it > stitches records into the skiplists. > > Of course the "real answer"[tm] is zeroskip, which doesn't do random writes - but until then, we suspect that the cost is largely due to the face that we use > mmap to read, and fwrite to write!? We know that might be less efficient already from Linus' comments about 10 years ago!? And I guess here's the proof. > > An option would be to switch to using mmap to write as well.? We could easily modify lib/mappedfile to memcpy to do the writes. > > Does anybody see any strong reason not to? I've covered the reasons for/against writing thru mmap in my LMDB design papers. I don't know how relevant all of these are for your use case: 1: writing thru mmap loses any control over write ordering - the OS will page dirty pages out in arbitrary order. If you're using a filesystem that supports ordered writes, it will preserve the ordering of data from write() calls. 

2: making the mmap writable opens the possibility of undetectable data structure corruption if any other code
is doing stray writes through arbitrary pointers. You need to be very sure your code is bug-free.

3: if your DB is larger than RAM, writing thru mmap is slower than using write() syscalls. Whenever you
access a page for the first time, the OS will page it in. This is a wasted I/O if all you're doing is
overwriting the page with new data.

4: you can't use mmap exclusively, if you need to grow the output file. You can only write thru the mapping
to pages that already exist. If you need to grow the file, you must preallocate the space, otherwise you
get a SEGV when referencing unallocated pages.

And a side note, multiple studies have shown that skiplists are not cache-friendly, and thus have
inferior performance to B+tree organizations. A skiplist is a very poor choice for a read/write data structure.

Obviously I would recommend you use something carefully designed and heavily tested, like LMDB, instead
of whatever you're using.

There's one point in favor of writing thru mmap - if you take care of all the other potential gotchas,
it will work on every OS that implements mmap. Using mmap for reads, and syscalls for writes, is only
valid on OSs with a unified buffer cache. While this isn't a problem on most modern OSs, OpenBSD is a
notable example of an OS that lacks this, and so that approach always results in file corruption there.

-- 
  -- Howard Chu
  CTO, Symas Corp.
  http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/

From brong at fastmailteam.com Fri Nov 30 06:42:35 2018
From: brong at fastmailteam.com (Bron Gondwana)
Date: Fri, 30 Nov 2018 06:42:35 -0500
Subject: MMAP performance and using mmap writes
In-Reply-To: <1c46d2af-c16f-aeb3-9e32-291d3ac4e216@highlandsun.com>
References: <4c000ff3-597b-4f62-8ac0-ddfe72c3bd09@sloti7d1t02>
 <1c46d2af-c16f-aeb3-9e32-291d3ac4e216@highlandsun.com>
Message-ID: <1d2f1a25-d31d-4465-8025-5c084b0014fe@sloti7d1t02>

On Fri, Nov 30, 2018, at 22:09, Howard Chu wrote:
> Bron Gondwana wrote:
> > Hi All,
> >
> > We were debugging the CPU usage in a ctl_conversationsdb rebuild yesterday, and noticed an interesting thing. 70% of the CPU utilisation for this one process
> > was inside the kernel! Mostly with dirty pages.
> >
> > ctl_conversationsdb -R is particularly heavy on the twoskip database - it's rewriting a lot of random keys. This leads to writes all over the place, as it
> > stitches records into the skiplists.
> >
> > Of course the "real answer"[tm] is zeroskip, which doesn't do random writes - but until then, we suspect that the cost is largely due to the fact that we use
> > mmap to read, and fwrite to write! We know that might be less efficient already from Linus' comments about 10 years ago! And I guess here's the proof.
> >
> > An option would be to switch to using mmap to write as well. We could easily modify lib/mappedfile to use memcpy for the writes.
> >
> > Does anybody see any strong reason not to?
>
> I've covered the reasons for/against writing thru mmap in my LMDB design papers. I
> don't know how relevant all of these are for your use case:
>
> 1: writing thru mmap loses any control over write ordering - the OS will page dirty pages out in arbitrary order.
> If you're using a filesystem that supports ordered writes, it will preserve the ordering of data from write() calls.

This is not a concern at all - twoskip is deliberately designed such that it does a single write and then flush to "dirty" the file, all changes made while dirty are fully revertible if it crashes, and then it does an fsync (msync now I guess!) before a single write which clears the dirty flag. So long as a single 256 byte write is consistent, it's safe.

> 2: making the mmap writable opens the possibility of undetectable data structure corruption if any other code
> is doing stray writes through arbitrary pointers. You need to be very sure your code is bug-free.

Yes, this is a significant concern.

> 3: if your DB is larger than RAM, writing thru mmap is slower than using write() syscalls. Whenever you
> access a page for the first time, the OS will page it in. This is a wasted I/O if all you're doing is
> overwriting the page with new data.

I doubt it... especially now we're running on servers with 256GB of RAM. These databases are usually under a gigabyte in size. I also don't think we ever overwrite a page without reading from it first - we're usually updating pointers which we've just had to read.

> 4: you can't use mmap exclusively, if you need to grow the output file. You can only write thru the mapping
> to pages that already exist. If you need to grow the file, you must preallocate the space, otherwise you
> get a SEGV when referencing unallocated pages.

We always know what we're planning to write, so I'm fine with using an ftruncate call on the file descriptor to extend it.

> And a side note, multiple studies have shown that skiplists are not cache-friendly, and thus have
> inferior performance to B+tree organizations. A skiplist is a very poor choice for a read/write data structure.

Yeah, hence zeroskip - it's coming.

> Obviously I would recommend you use something carefully designed and heavily tested, like LMDB, instead
> of whatever you're using.

We tried and had a bad experience last time - it didn't fit in well with the expectations of how our code uses the database. I'm not super keen to try again right now. I do appreciate your persistence and passion for your project though :) It's good to see this level of engagement.

> There's one point in favor of writing thru mmap - if you take care of all the other potential gotchas,
> it will work on every OS that implements mmap. Using mmap for reads, and syscalls for writes, is only
> valid on OSs with a unified buffer cache. While this isn't a problem on most modern OSs, OpenBSD is a
> notable example of an OS that lacks this, and so that approach always results in file corruption there.

Yeah - that's an interesting point to me as well. At the moment we use a wrapper which is called map_stupidshared (don't blame me, was named before my time) which unmaps and remaps every time if the file has been changed. Insanity. It gets tested for during the configure stage. We have something even more awful called map_nommap which just reads the entire file into a buffer every time. As you can imagine, performance is awful - but it does work!

Bron.

-- 
  Bron Gondwana, CEO, FastMail Pty Ltd
  brong at fastmailteam.com

From dilyan.palauzov at aegee.org Fri Nov 30 09:30:44 2018
From: dilyan.palauzov at aegee.org (Дилян Палаузов)
Date: Fri, 30 Nov 2018 14:30:44 +0000
Subject: MMAP performance and using mmap writes
In-Reply-To: <1d2f1a25-d31d-4465-8025-5c084b0014fe@sloti7d1t02>
References: <4c000ff3-597b-4f62-8ac0-ddfe72c3bd09@sloti7d1t02>
 <1c46d2af-c16f-aeb3-9e32-291d3ac4e216@highlandsun.com>
 <1d2f1a25-d31d-4465-8025-5c084b0014fe@sloti7d1t02>
Message-ID: <301a947e9606482e3a30a358584d826ff8c2e7ae.camel@aegee.org>

Hello,

> > 3: if your DB is larger than RAM, writing thru mmap is slower than using write() syscalls.
Whenever you
> > access a page for the first time, the OS will page it in. This is a wasted I/O if all you're doing is
> > overwriting the page with new data.
>
> I doubt it... especially now we're running on servers with 256GB of RAM. These databases are usually under a gigabyte in size. I also don't think we ever overwrite a page without reading from it first - we're usually updating pointers which we've just had to read.
>

Are there recommendations on RAM for running cyrus imap, that reflect
this aspect?

Greetings
  Dilyan

From brong at fastmailteam.com Fri Nov 30 23:39:25 2018
From: brong at fastmailteam.com (Bron Gondwana)
Date: Fri, 30 Nov 2018 23:39:25 -0500
Subject: MMAP performance and using mmap writes
In-Reply-To: <301a947e9606482e3a30a358584d826ff8c2e7ae.camel@aegee.org>
References: <4c000ff3-597b-4f62-8ac0-ddfe72c3bd09@sloti7d1t02>
 <1c46d2af-c16f-aeb3-9e32-291d3ac4e216@highlandsun.com>
 <1d2f1a25-d31d-4465-8025-5c084b0014fe@sloti7d1t02>
 <301a947e9606482e3a30a358584d826ff8c2e7ae.camel@aegee.org>
Message-ID:

On Sat, 1 Dec 2018, at 01:32, Дилян Палаузов wrote:
> Hello,
>
> > > 3: if your DB is larger than RAM, writing thru mmap is slower than using write() syscalls. Whenever you
> > > access a page for the first time, the OS will page it in. This is a wasted I/O if all you're doing is
> > > overwriting the page with new data.
> >
> > I doubt it... especially now we're running on servers with 256GB of RAM. These databases are usually under a gigabyte in size. I also don't think we ever overwrite a page without reading from it first - we're usually updating pointers which we've just had to read.
> >
>
> Are there recommendations on RAM for running cyrus imap, that reflect
> this aspect?

Yes, RAM is cheap :P

Seriously though, we budget for 120 bytes per message in open mailboxes, so you probably want at least 1 gigabyte of RAM per 100 users on a machine.
If you turn on conversations support (needed for JMAP) I would multiply by 4 to allow for the databases there, though JMAP doesn't hold as much in RAM between connections.

Most of the benefit really comes from caching, particularly if you have data on non-SSD storage, so the more the better.

Bron.

> Greetings
> Dilyan
>
>

-- 
  Bron Gondwana, CEO, FastMail Pty Ltd
  brong at fastmailteam.com
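
[Editor's illustration for the thread above] The approach discussed in this thread (extend the file with ftruncate, memcpy through a shared writable mapping, then msync in place of fsync) might look roughly like the sketch below. This is only a minimal sketch under stated assumptions: `mmap_append` is a hypothetical helper name, it is not the lib/mappedfile API, and it maps and unmaps on every call purely to stay short. The descriptor is assumed to be opened O_RDWR.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>      /* open() flags, for callers of the helper */
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of Cyrus): append len bytes to the file
 * behind fd by writing through a shared mapping instead of write().
 * fd must be open for reading and writing.
 * Returns 0 on success, -1 on error. */
int mmap_append(int fd, const void *buf, size_t len)
{
    struct stat sb;
    off_t old_size, new_size;
    char *base;
    int r;

    assert(buf != NULL);
    if (len == 0) return 0;             /* nothing to do; mmap rejects length 0 */

    if (fstat(fd, &sb) < 0) return -1;
    old_size = sb.st_size;
    new_size = old_size + (off_t)len;

    /* Grow the file first: pages beyond end-of-file cannot be written
     * through the mapping. */
    if (ftruncate(fd, new_size) < 0) return -1;

    /* Map the whole file. A real implementation would keep the mapping
     * and only remap when the size changes. */
    base = mmap(NULL, (size_t)new_size, PROT_READ | PROT_WRITE,
                MAP_SHARED, fd, 0);
    if (base == MAP_FAILED) return -1;

    /* The "write" itself is just a memory copy into the new tail. */
    memcpy(base + old_size, buf, len);

    /* Flush the dirty pages synchronously, standing in for fsync(). */
    r = msync(base, (size_t)new_size, MS_SYNC);

    if (munmap(base, (size_t)new_size) < 0) r = -1;
    return r;
}
```

A caller would open the database file with O_RDWR and call `mmap_append(fd, record, record_len)` for each record. Remapping on every call is the simplification here; it mirrors the cost Bron describes for map_stupidshared, so a production version would cache the mapping and preallocate in larger steps.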