<!DOCTYPE html><html><head><title></title><style type="text/css">p.MsoNormal,p.MsoNoSpacing{margin:0}</style></head><body><div style="font-family:Arial;">I saw your ticket about that - I'll check it out soon. Sorry, been busy at the CalConnect conference in the UK this week.<br></div><div style="font-family:Arial;"><br>I also did this:</div><div style="font-family:Arial;"><br></div><div style="font-family:Arial;"><a href="https://github.com/cyrusimap/cyrus-imapd/commit/27513a9bc3f217f388bac163820f9879178071fb">https://github.com/cyrusimap/cyrus-imapd/commit/27513a9bc3f217f388bac163820f9879178071fb</a><br></div><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">Which I believe will mean that if you change the defaultsearchtier, it will immediately start indexing to the new location. You'll definitely want to restart the server when changing that config option though, and not have it be different between different invocations of squatter, or you'll wind up creating a lot of xapianactive entries!<br></div><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">Bron.<br></div><div style="font-family:Arial;"><br></div><div>On Tue, Jun 4, 2019, at 04:19, Dilyan Palauzov wrote:<br></div><blockquote type="cite" id="qt"><div style="font-family:Arial;">Hello Bron,<br></div><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">imap/squatter.c:do_compact() does call `if (sleepmicroseconds) <br></div><div style="font-family:Arial;">usleep(sleepmicroseconds);` so -S number is honoured with `squatter <br></div><div style="font-family:Arial;">-t… -z…`.<br></div><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">Will `squatter -F -z… -t…` be fixed on the stable branch, or shall <br></div><div style="font-family:Arial;">calling `squatter -F -t… -z` be discouraged with 3.0?<br></div><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">Given that currently after `squatter -F -t… 
-z…` calling `squatter <br></div><div style="font-family:Arial;">-t… -z` does reindex all messages and therefore creates a new Xapian <br></div><div style="font-family:Arial;">index, it must be possible to create a compacted database directly, <br></div><div style="font-family:Arial;">without creating a bloated index first.<br></div><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">My understanding of the rolling mode is that once a new message <br></div><div style="font-family:Arial;">appears/arrives/is APPENDed or deliver(ed), it is added to the sync <br></div><div style="font-family:Arial;">log and then indexed in rolling mode. Then a message arrives at a <br></div><div style="font-family:Arial;">different place, it is added to the log and then indexed. Whether the <br></div><div style="font-family:Arial;">first and second messages are in the same mailbox is completely <br></div><div style="font-family:Arial;">random. Why does squatter not sleep, if the two messages are in the <br></div><div style="font-family:Arial;">same mailbox and works non-stop otherwise, that is, why does it sleep <br></div><div style="font-family:Arial;">depending on random circumstances?<br></div><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">https://wiki.dovecot.org/Plugins/FTS/Squat says for Dovecot that IMAP <br></div><div style="font-family:Arial;">requires that SEARCH is done also on substrings, no IMAP server <br></div><div style="font-family:Arial;">implements this requirement, and Dovecot does implement it only when <br></div><div style="font-family:Arial;">Squat indices are used. 
Does the same hold for Cyrus IMAP (Squat index <br></div><div style="font-family:Arial;">offers substring search, Xapian index does not offer substring search)?<br></div><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">Running squatter once printed “compressing X:0,X,Y:0 to Y:3 for … <br></div><div style="font-family:Arial;">(active Y:0,X:0,X,Y:0,Y:1,Y:2)” <br></div><div style="font-family:Arial;">(https://github.com/cyrusimap/cyrus-imapd/issues/2764) so I suspect a <br></div><div style="font-family:Arial;">tiername without a number was in the .xapianactive file.<br></div><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">If I do any compact (-o, -F, -X, just -t -z), where the first tier is <br></div><div style="font-family:Arial;">not referenced, does squatter ensure that the default tier according <br></div><div style="font-family:Arial;">to imapd.conf is inserted in the xapianactive file? Or, to put it <br></div><div style="font-family:Arial;">another way, if I change imapd.conf and create a new tier T6 and <br></div><div style="font-family:Arial;">declare T5 to be the default tier, which of the following will insert <br></div><div style="font-family:Arial;">a reference to T5:0 in .xapianactive and which will not:<br></div><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">squatter -t T2 -o -z T2<br></div><div style="font-family:Arial;">squatter -t T5,T2 -z T2<br></div><div style="font-family:Arial;">squatter -t T5 -o T4<br></div><div style="font-family:Arial;">squatter -t T2 -F T3<br></div><div style="font-family:Arial;">squatter -t T2 -X T3<br></div><div style="font-family:Arial;">or what else? 
(The name T5 is declared, and the root directory exists, <br></div><div style="font-family:Arial;">but there is no data in the directory yet, nor is T5 in any <br></div><div style="font-family:Arial;">.xapianactive file).<br></div><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">Regards<br></div><div style="font-family:Arial;"> Дилян<br></div><div style="font-family:Arial;">----- Message from Bron Gondwana &lt;brong@fastmailteam.com&gt; ---------<br></div><div style="font-family:Arial;"> Date: Tue, 04 Jun 2019 01:53:23 +1000<br></div><div style="font-family:Arial;"> From: Bron Gondwana &lt;brong@fastmailteam.com&gt;<br></div><div style="font-family:Arial;">Subject: Re: squatter -F increases the index size<br></div><div style="font-family:Arial;"> To: Cyrus Devel &lt;cyrus-devel@lists.andrew.cmu.edu&gt;<br></div><div style="font-family:Arial;"><br></div><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">> On Sat, Jun 1, 2019, at 04:34, Dilyan Palauzov wrote:<br></div><div style="font-family:Arial;">>> Hello,<br></div><div style="font-family:Arial;">>><br></div><div style="font-family:Arial;">>> I gave squatter -F a try.<br></div><div style="font-family:Arial;">>><br></div><div style="font-family:Arial;">>> Before I ran it for a user, tier T1 was not compacted and occupied 3.4<br></div><div style="font-family:Arial;">>> MB (mega), T2 was compacted and contained 3.7GB (giga). 
After<br></div><div style="font-family:Arial;">>> removing the records of the deleted messages, that is, running squatter -F,<br></div><div style="font-family:Arial;">>> T2 was 5.7GB and squatter printed “filtering” instead of “compacting”.<br></div><div style="font-family:Arial;">>> Then I ran “squatter -t T1,T2 -z T2” again without -F, without -X<br></div><div style="font-family:Arial;">>> and squatter reindexed all messages, to create a 3.0 GB index.<br></div><div style="font-family:Arial;">>><br></div><div style="font-family:Arial;">>> I expected that with -F the resulting database would be compacted and<br></div><div style="font-family:Arial;">>> on the second call there would be no reindexing.<br></div><div style="font-family:Arial;">><br></div><div style="font-family:Arial;">> I discovered some bad bugs in -F recently, so I suspect that's why. <br></div><div style="font-family:Arial;">> They should be fixed on master now.<br></div><div style="font-family:Arial;">><br></div><div style="font-family:Arial;">>> When does squatter decide on its own to reindex?<br></div><div style="font-family:Arial;">><br></div><div style="font-family:Arial;">> When the DB version is too old (which is one of the -F bugs - it <br></div><div style="font-family:Arial;">> wasn't setting the DB version, so it assumed the data was all <br></div><div style="font-family:Arial;">> version zero!)<br></div><div style="font-family:Arial;">><br></div><div style="font-family:Arial;">>> What do G records in conversations.db contain?<br></div><div style="font-family:Arial;">><br></div><div style="font-family:Arial;">> G records contain a mapping from GUID to folder number (offset into <br></div><div style="font-family:Arial;">> the $FOLDER_NAMES key) and UID and optionally IMAP part number as <br></div><div style="font-family:Arial;">> the key - mapping to values which contain some keywords and modseq <br></div><div style="font-family:Arial;">> from the original record as well.<br></div><div 
style="font-family:Arial;">><br></div><div style="font-family:Arial;">>> My reading is that the way to create a Xapian index of an indexed<br></div><div style="font-family:Arial;">>> mailbox is that first squatter has to be run in INDEX mode and then<br></div><div style="font-family:Arial;">>> in COMPACT mode. In particular it is not possible to create in one<br></div><div style="font-family:Arial;">>> step a compacted database.<br></div><div style="font-family:Arial;">><br></div><div style="font-family:Arial;">> No, it's not - due to the way the compact API works. At least, I <br></div><div style="font-family:Arial;">> haven't figured out how.<br></div><div style="font-family:Arial;">><br></div><div style="font-family:Arial;">>> Does squatter -R -S sleep after each mailbox or after each message indexed?<br></div><div style="font-family:Arial;">><br></div><div style="font-family:Arial;">> It sleeps after each mailbox.<br></div><div style="font-family:Arial;">><br></div><div style="font-family:Arial;">>> When compacting, squatter deals just with messages and on search or<br></div><div style="font-family:Arial;">>> reindex the conversations.db is used to map the messages to mailboxes.<br></div><div style="font-family:Arial;">>> How does squatter -S sleep after each mailbox during compacting, if<br></div><div style="font-family:Arial;">>> it knows nothing about mailboxes?<br></div><div style="font-family:Arial;">><br></div><div style="font-family:Arial;">> -S is not used when compacting.<br></div><div style="font-family:Arial;">><br></div><div style="font-family:Arial;">>> What does a tier name without a number in a xapianactive file mean?<br></div><div style="font-family:Arial;">><br></div><div style="font-family:Arial;">> That shouldn't happen. 
It will be parsed the same as tier:0, I believe.<br></div><div style="font-family:Arial;">><br></div><div style="font-family:Arial;">>> What are XAPIAN_DBW_CONVINDEXED and _XAPINDEXED?<br></div><div style="font-family:Arial;">><br></div><div style="font-family:Arial;">> Two different ways to know if a document is indexed. CONVINDEXED <br></div><div style="font-family:Arial;">> uses the conversations DB to look up mailbox and uid and then the <br></div><div style="font-family:Arial;">> cyrus.indexed.db databases to see if the message has already been <br></div><div style="font-family:Arial;">> seen.<br></div><div style="font-family:Arial;">><br></div><div style="font-family:Arial;">> XAPINDEXED uses the metadata inside the Xapian databases to know if <br></div><div style="font-family:Arial;">> a particular message has been indexed based on the cyrusid.*G* <br></div><div style="font-family:Arial;">> metadata values which are identical to the GUIDs themselves.<br></div><div style="font-family:Arial;">><br></div><div style="font-family:Arial;">>> What does the file sync/squatter do?<br></div><div style="font-family:Arial;">><br></div><div style="font-family:Arial;">> It's a sync/$channel directory which squatter watches. This is a <br></div><div style="font-family:Arial;">> method for providing a queue of mailboxes to look at based on the <br></div><div style="font-family:Arial;">> APPEND sync_log statements.<br></div><div style="font-family:Arial;">><br></div><div style="font-family:Arial;">>> squatter can print “Xapian: truncating text from message mailbox<br></div><div style="font-family:Arial;">>> user.... uid 7309”. When are messages truncated for the purposes of<br></div><div style="font-family:Arial;">>> indexing?<br></div><div style="font-family:Arial;">><br></div><div style="font-family:Arial;">> When they are too long! 
The comment in the source code says this:<br></div><div style="font-family:Arial;">><br></div><div style="font-family:Arial;">> /* Maximum size of a query, determined empirically, is a little bit<br></div><div style="font-family:Arial;">> * under 8MB. That seems like more than enough, so let's limit the<br></div><div style="font-family:Arial;">> * total amount of parts text to 4 MB. */<br></div><div style="font-family:Arial;">> #define MAX_PARTS_SIZE (4*1024*1024)<br></div><div style="font-family:Arial;">><br></div><div style="font-family:Arial;">> This is a holdover from when Greg was working on it. We could switch <br></div><div style="font-family:Arial;">> this to be a configurable option.<br></div><div style="font-family:Arial;">><br></div><div style="font-family:Arial;">>> Do I understand correctly that, for a xapianactive file with "A B C D<br></div><div style="font-family:Arial;">>> E", to remove C one has to call "squatter -t C,D -z D"? But A cannot<br></div><div style="font-family:Arial;">>> be removed if it is the defaultsearchtier. Is the defaultsearchtier<br></div><div style="font-family:Arial;">>> always included in the xapianactive file, if the tier is missing,<br></div><div style="font-family:Arial;">>> whenever the file is modified (and the only way to modify it is to<br></div><div style="font-family:Arial;">>> call squatter in COMPACT mode)?<br></div><div style="font-family:Arial;">><br></div><div style="font-family:Arial;">> When you do any compact, if it includes the first item (the writable <br></div><div style="font-family:Arial;">> database) then a new writable database will be created on the <br></div><div style="font-family:Arial;">> default tier. 
So if you try to compact the default tier away, a new <br></div><div style="font-family:Arial;">> default tier item will be created.<br></div><div style="font-family:Arial;">><br></div><div style="font-family:Arial;">> Bron.<br></div><div style="font-family:Arial;">><br></div><div style="font-family:Arial;">> --<br></div><div style="font-family:Arial;">> Bron Gondwana, CEO, FastMail Pty Ltd<br></div><div style="font-family:Arial;">> brong@fastmailteam.com<br></div><div style="font-family:Arial;"><br></div><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">----- End message from Bron Gondwana &lt;brong@fastmailteam.com&gt; -----<br></div><div style="font-family:Arial;"><br></div><div style="font-family:Arial;"><br></div><div style="font-family:Arial;"><br></div></blockquote><div style="font-family:Arial;"><br></div><div id="sig56629417"><div class="signature">--<br></div><div class="signature"> Bron Gondwana, CEO, FastMail Pty Ltd<br></div><div class="signature"> brong@fastmailteam.com<br></div><div class="signature"><br></div></div><div style="font-family:Arial;"><br></div></body></html>