<!DOCTYPE html><html><head><title></title><style type="text/css">p.MsoNormal,p.MsoNoSpacing{margin:0}</style></head><body><div style="font-family:Arial;"><br></div><div style="font-family:Arial;"><br></div><div>On Wed, May 29, 2019, at 06:39, Dilyan Palauzov wrote:<br></div><blockquote type="cite" id="qt"><div style="font-family:Arial;">Hello,<br></div><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">so the &lt;userid&gt;.conversations database does, apart from the descriptions  <br></div><div style="font-family:Arial;">at  <br></div><div style="font-family:Arial;">https://www.cyrusimap.org/imap/concepts/deployment/databases.html#conversations-userid-conversations, also store per user a G record for each message, mapping the mailboxes where the message is located, and the results of a Xapian search return G  <br></div><div style="font-family:Arial;">records.<br></div><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">Are a G record, a GUID and a conversation ID the same thing?<br></div></blockquote><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">G records are identical to GUIDs.  There are also G records (in latest master at least) for sub-parts of a message, which map to a blobId in JMAP and allow direct addressing of every part by a content-based ID.<br></div><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">The conversation ID is something different: it's based on a permutation of the GUID of the first message that arrived within that thread, and it was the original point of the conversations database.<br></div><div style="font-family:Arial;"><br>Sadly this has all evolved over time.  I would like to migrate Cyrus towards using the terminology in JMAP, which has EmailId (which is a prefix on the GUID in JMAP) and ThreadId (which is the conversation ID from Cyrus with 'T' as a prefix).  
There is also MailboxId, which was previously known in Cyrus as UniqueId on mailboxes.<br></div><div style="font-family:Arial;"><br></div><blockquote type="cite" id="qt"><div style="font-family:Arial;">When a message is expunged, are its records from  <br></div><div style="font-family:Arial;">&lt;userid&gt;.conversations removed?<br></div></blockquote><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">They are removed when the message is UNLINKED, which may happen at the same time as the expunge, depending on your expunge_mode setting.<br></div><div style="font-family:Arial;"><br></div><blockquote type="cite" id="qt"><div style="font-family:Arial;">When a message is unexpunged, is it again inserted in  <br></div><div style="font-family:Arial;">&lt;userid&gt;.conversations and referenced in the sync_log_channels:  <br></div><div style="font-family:Arial;">squatter?<br></div></blockquote><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">Yes, unexpunge is treated as a new APPEND, and since the bytes are the same, the GUID will be the same.<br></div><div style="font-family:Arial;"><br></div><blockquote type="cite" id="qt"><div style="font-family:Arial;">squatter has the modes: indexer, search, rolling, synclog, compact,  <br></div><div style="font-family:Arial;">indexfrom (deprecated) and audit. Is search_batchsize used only in the  <br></div><div style="font-family:Arial;">indexer mode? In particular, is it not used when squatter -t … -z -X is  <br></div><div style="font-family:Arial;">called (compact and reindex simultaneously)?<br></div></blockquote><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">Hmm... let me check!  
Nope, it's not used there: when you run with -X it reindexes all the messages in an entire mailbox in a single batch, ignoring search_batchsize.<br></div><div style="font-family:Arial;"><br></div><blockquote type="cite" id="qt"><div style="font-family:Arial;">What is the use case for squatter -X (Reindex all messages before  <br></div><div style="font-family:Arial;">compacting.  This mode reads all the lists of messages indexed by the  <br></div><div style="font-family:Arial;">listed tiers, and re-indexes them into a temporary database before  <br></div><div style="font-family:Arial;">compacting that into place)?<br></div></blockquote><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">It is very useful when index formats have changed over time and you want to reindex all emails with the latest format, or when you believe a search database might be corrupted and want to rebuild it from source.<br></div><div style="font-family:Arial;"><br></div><blockquote type="cite" id="qt"><div style="font-family:Arial;">Does it index messages that were not indexed yet for any reason, or  <br></div><div style="font-family:Arial;">does it delete the database, scan each message again and create a  <br></div><div style="font-family:Arial;">compact Xapian database?<br></div></blockquote><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">It uses the cyrus.indexed.db of each of the source databases (selected by -t) to know which ranges of UIDs in each mailbox were claimed to be indexed by those databases, and then scans over those same ranges of UIDs again and indexes the contents of those messages if they are not yet expunged.<br></div><div style="font-family:Arial;"><br></div><blockquote type="cite" id="qt"><div style="font-family:Arial;">In the case I described, a mailbox receiving reports whose index  <br></div><div style="font-family:Arial;">grew very fast, the cause was a mail loop - a lot of emails arriving  <br></div><div 
style="font-family:Arial;">in a short time.  Once the loop stopped, the index did not expand faster  <br></div><div style="font-family:Arial;">than those of other mailboxes.<br></div></blockquote><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">That makes sense.<br></div><div style="font-family:Arial;"><br></div><blockquote type="cite" id="qt"><div style="font-family:Arial;">So by default for now, unless some extra setup is performed, only  <br></div><div style="font-family:Arial;">words in text/plain and text/html get indexed, possibly with headers,  <br></div><div style="font-family:Arial;">and attachments are ignored?<br></div></blockquote><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">Yes, that's correct.  In fact, it's all text types.  text/calendar and text/vcard are processed specially.  Other text/* types are treated the same as text/plain for indexing purposes.<br></div><div style="font-family:Arial;"><br></div><div style="font-family:Arial;">Bron.<br></div><div id="sig56629417"><div class="signature">--<br></div><div class="signature">  Bron Gondwana, CEO, FastMail Pty Ltd<br></div><div class="signature">  brong@fastmailteam.com<br></div><div class="signature"><br></div></div><div style="font-family:Arial;"><br></div></body></html>